Speaking while monitoring addressees for understanding

Uploaded On 2015-11-22


Torsten Jachmann 16122013 Herbert H Clark and Meredyth A Krych Seminar Gaze as function of instructions and vice versa Research Question Speaking and listening in dialog ID: 201782



Presentation Transcript

Slide1

Speaking while monitoring addressees for understanding

Torsten Jachmann

16.12.2013

Herbert H. Clark and Meredyth A. Krych

Seminar „Gaze as function of instructions - and vice versa“

Slide2

Research Question

Speaking and listening in dialog
  Unilateral
    Speakers and listeners act autonomously
    No interaction
  Bilateral
    Speakers and listeners monitor their respective partner
    Joint activity
What do speakers monitor?
How do they use that information?

Slide3

Grounding

Level 1: Attend to vocalization
Level 2: Identify words, phrases and sentences
Level 3: Understand the meaning
Level 4: Consider answering

Slide4

Grounding

A: Were you there when they erected the new signs?
B: Th… which new signs? (Level 3)
A: Little notice boards, indicating where you had to go for everything
B: No.
Bilateral account

Slide5

Monitoring

Voices
  Attending to the partner's utterances
Faces
  Gaze and facial expressions as indicators of understanding
Workspaces
  Region in front of the body
  Manual gestures (but also games, etc.)

Slide6

Monitoring

Bodies
  Head and torso movement as indicator
Shared scenes
  Scenery beyond workspace
Signals vs. symptoms
  Signals are constructed to get meaning across
  Symptoms are not intentionally created

Slide7

Least joint effort

Opportunistic
  Selection of the available methods that take the least effort to produce
“Tailored”
  Overhearers (not monitored by speaker) may misunderstand utterances

Slide8

Method

Pairs of directors and builders
  76 students (34 male / 42 female)
Instructions to build 10 simple Lego models
2 x 2 design (interactive)
  28 pairs
Additional non-interactive condition
  10 pairs
Video and audio analyses

Slide9

Interactive

Mixed design
  Workspace (between subjects)
    Visible
    Invisible
  Faces (within subjects)
    Visible
    Invisible
No restrictions on time or talk

Slide10

Non-interactive

Only one condition
Director records instructions
  No time or talk constraints
  Prototype can be examined for as long as wanted before recording
Builders listen to instructions
  No constraints on actions
  Start, stop, rewind

Slide11
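
The design described on the Method slides above can be written out as a small consistency check. This is only an illustrative sketch: the pair counts, participant numbers and factor names come from the slides, while the dictionary layout and the helper `design_cells` are assumptions made for this example and not part of the original study materials.

```python
from itertools import product

# Figures taken from the Method slide.
INTERACTIVE_PAIRS = 28
NON_INTERACTIVE_PAIRS = 10
STUDENTS = 76
MALE, FEMALE = 34, 42

# Factor levels from the Interactive slide. The between/within labels
# follow the slide; storing them in a dict is an illustrative choice.
FACTORS = {
    "workspace": ("visible", "invisible"),  # between subjects
    "faces": ("visible", "invisible"),      # within subjects
}


def design_cells(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


cells = design_cells(FACTORS)
total_pairs = INTERACTIVE_PAIRS + NON_INTERACTIVE_PAIRS

for cell in cells:
    print(cell)

# Each pair consists of one director and one builder.
print("pairs:", total_pairs, "participants:", total_pairs * 2)
print("matches reported sample:", total_pairs * 2 == STUDENTS == MALE + FEMALE)
```

The check confirms that 28 interactive pairs plus 10 non-interactive pairs account for all 76 students, and that the interactive condition has four cells.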

Results

Efficiency
Turns
Gestures and grounding
Deictic expressions
Gestures by addressees
Cross-timing of actions
Timing strategies
Visual monitoring

Slide12

Efficiency

Visibility of workspace improves efficiency

Slide13

Efficiency

Non-interactive
Time needed to build much longer (245s “n-i” vs. 183s “i”)
Strong drop in accuracy
Inadequate instructions

Slide14
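
As a quick check on the build times reported for the non-interactive condition (245s versus 183s), the difference can be expressed in relative terms. The helper name `relative_increase` is hypothetical and chosen only for this sketch; the two times are the only values taken from the slide.

```python
def relative_increase(baseline, value):
    """Fractional increase of `value` over `baseline`."""
    return (value - baseline) / baseline


interactive_s = 183       # "i" on the slide
non_interactive_s = 245   # "n-i" on the slide

extra_s = non_interactive_s - interactive_s
ratio = relative_increase(interactive_s, non_interactive_s)

print(f"{extra_s} s longer, about {ratio:.0%} more time")
```

Under these numbers the non-interactive builders needed roughly a third more time than the interactive ones.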

Turns

Fewer SPOKEN turns of builder when workspace is visible

Slide15

Deictic expressions

Mainly unusable when workspace hidden
Joint attention needed
Only referring to a previously mentioned situation

Slide16

Gestures by addressees

Mostly accompanied by deictic utterances (if any)
Explicit verdict usually only on such utterances (otherwise continuing)

Slide17

Cross-timing

Gestural signals
Reflect understanding at that moment

Slide18

Cross-timing

Overlapping signals
Usually not in spoken dialog
Start with “sufficient information”

Slide19

Cross-timing

Projecting
Prediction of following actions/instructions

Slide20

Cross-timing

Initiation time
Waiting for partner to be able to attend to the following utterance

Slide21

Cross-timing

Time uptake
Responses have to be timed exactly to the action and situation

Slide22

Timing strategies

Self-interruption
Dealing with evidence from the addressee
Usually not continued

Slide23

Timing strategies

Collaborative references
Deictic references rely on the addressee's actions

Slide24

Visual monitoring

Mainly used when director reaches a problem
Eye gaze as support

Slide25

Conclusion

Grounding is fundamental
Visible workspace enhances grounding speed
In task-oriented dialogs faces are not important
Compensation possible (only if any monitoring is available)

Slide26

Conclusion

Updating common ground
Increments are determined jointly
Much evidence for bilateral account
Addressees provide statement about current understanding
Speakers monitor to update and change utterances

Slide27

Conclusion

Opportunistic process
  Offering options
  Self-interruptions
  Waiting
  Instant revision
Multi-modal process
  Speech and gestures are combined if possible
  Speech alone takes more time

Slide28

Remarks

Gaze only important for certain types of tasks
Measurement of time may be outdated (“old” study)
No contradicting studies (to some extent common sense)

Slide29

Gaze and Turn-Taking Behavior in Casual Conversation Interactions

Kristiina Jokinen, Hirohisa Furukawa, Masafumi Nishida and Seiichi Yamamoto

Slide30

Differences

Three-party dialogue
No instructional task
Stronger focus on eye gaze

Slide31

Research Question

How well can eye gaze help in predicting turn taking?
What is the role of eye gaze when the speaker holds the turn?
Is the role of eye gaze as important in three-party dialogues as in two-party dialogues?

Slide32

Hypothesis

In group discussions, eye gaze is important in turn management (especially in turn holding cases)
The speaker is more influential than the other partners in coordinating interactions (selects the next speaker)

Slide33

Method

Three-person conversational eye gaze corpus
Natural conversations
Balanced familiarity (50% familiar; 50% unfamiliar)
Balanced gender (male-only; female-only; mixed)

Slide34

Method

28 conversations among Japanese students in their early 20s with three participants each
Each conversation about 10 minutes
Eye gaze recorded for one participant

Slide35

Method

Eye tracker fixed on table to preserve naturalness

Slide36

Method

Slide37

Used data

Estimated at the last 300ms of an utterance if followed by a 500ms pause

Slide38
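
The selection rule stated on the slide above (the last 300ms of an utterance, used only if a 500ms pause follows) can be sketched as a small filter over utterance timestamps. This is a minimal sketch under assumptions, not the authors' procedure: the `Utterance` record, the function `analysis_windows`, the reading of "a 500ms pause" as "at least 500ms of silence before the next utterance", and the decision to skip the final utterance are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

WINDOW_MS = 300     # length of the analysis window, from the slide
MIN_PAUSE_MS = 500  # pause that must follow, from the slide


@dataclass
class Utterance:
    """Hypothetical record: who spoke, and when (in milliseconds)."""
    speaker: str
    start_ms: int
    end_ms: int


def analysis_windows(utterances: List[Utterance]) -> List[Tuple[int, int]]:
    """Return (start, end) windows covering the last 300 ms of each
    utterance that is followed by a pause of at least 500 ms.

    Assumption: the final utterance is skipped, because the pause
    after it cannot be measured from the utterance list alone.
    """
    ordered = sorted(utterances, key=lambda u: u.start_ms)
    windows = []
    for current, following in zip(ordered, ordered[1:]):
        pause = following.start_ms - current.end_ms
        if pause >= MIN_PAUSE_MS:
            # Clip the window for utterances shorter than 300 ms.
            start = max(current.start_ms, current.end_ms - WINDOW_MS)
            windows.append((start, current.end_ms))
    return windows


example = [
    Utterance("A", 0, 1000),
    Utterance("B", 1600, 2000),  # starts 600 ms after A ends
    Utterance("C", 2200, 2300),  # starts 200 ms after B ends
]
print(analysis_windows(example))  # prints [(700, 1000)]
```

Only the first utterance qualifies here: it is followed by a 600ms pause, so gaze and speech features would be taken from the 700 to 1000ms window.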

Used data

Dialog acts
Speech features
  Values of F0, etc.
Eye gaze

Slide39

Results

Slide40

Conclusion

Speaker signals whether he intends to give the turn or hold it by using eye gaze
  Fixating listener vs. focusing attention elsewhere
Eye gaze in multi-participant conversations as important as in two-participant conversations

Slide41

Conclusion

Eye gaze is used to select next speaker (seems to be correct)
Maybe Japanese data interferes with value of speech data
Comparison study?
Listeners focus on speaker, not vice versa

Slide42

Remarks

Vague information and data presentation
Although various data exist, interaction of factors is not presented
Some conclusions rely on the previously mentioned point
Setup only takes one participant into consideration
Much of the data was unused
Lack in quality and way of creation

Slide43

Remarks

Study is based on data for another study
Setup is not optimal
Realistic design
Yet, contains biasing flaws (situation of the participants, only one eye tracker)

Slide44

Comparison

Clark and Krych present interesting ideas but eye gaze is only rarely handled
  How could this be altered?
Jokinen et al. focus on eye gaze in a (more or less) natural situation but lack in scientific results and setup
  What points and ideas of this setup could be beneficial?