Slide1
Speaking while monitoring addressees for understanding
Torsten Jachmann
16.12.2013
Herbert H. Clark and Meredyth A. Krych
Seminar "Gaze as function of instructions - and vice versa"
Slide2
Research Question
Speaking and listening in dialog
Unilateral
Speakers and listeners act autonomously
No interaction
Bilateral
Speakers and listeners monitor their respective partner
Joint activity
What do speakers monitor?
How do they use that information?
Slide3
Grounding
Level 1
Attend to vocalization
Level 2
Identify words, phrases and sentences
Level 3
Understand the meaning
Level 4
Consider answering
Slide4
Grounding
A: Were you there when they erected the new signs?
B: Th… which new signs? (Level 3)
A: Little notice boards, indicating where you had to go for everything
B: No.
Bilateral account
Slide5
Monitoring
Voices
Attending to the partner's utterances
Faces
Gaze and facial expressions as indicators of understanding
Workspaces
Region in front of the body
Manual gestures (but also games, etc.)
Slide6
Monitoring
Bodies
Head and torso movement as indicator
Shared Scenes
Scenery beyond workspace
Signals vs. Symptoms
Signals are constructed to get meaning across
Symptoms are not intentionally created
Slide7
Least joint effort
Opportunistic
Selection of the available methods that take the least effort to produce
“Tailored”
Overhearers (not monitored by speaker) may misunderstand utterances
Slide8
Method
Pairs of directors and builders
76 students (34 male / 42 female)
Instructions to build 10 simple Lego models
2 x 2 design (interactive)
28 pairs
Additional non-interactive condition
10 pairs
Video and audio analyses
Slide9
Interactive
Mixed design
Workspace (between subjects)
Visible
Invisible
Faces (within subjects)
Visible
Invisible
No restrictions on time or talk
Slide10
Non-interactive
Only one condition
Director records instructions
No time or talk constraints
Prototype can be examined as long as wanted before recording
Builders listen to instructions
No constraints on actions
Start, stop, rewind
Slide11
Results
Efficiency
Turns
Gestures and grounding
Deictic expressions
Gestures by addressees
Cross-timing of actions
Timing strategies
Visual monitoring
Slide12
Efficiency
Visibility of workspace improves efficiency
Slide13
Efficiency
Non-interactive
Time needed to build much longer (245 s “n-i” vs. 183 s “i”)
Strong drop in accuracy
Inadequate instructions
Slide14
Turns
Fewer SPOKEN turns of builder when workspace is visible
Slide15
Deictic expressions
Mainly unusable when workspace is hidden
Joint attention needed
Without it, only usable to refer to a previously mentioned situation
Slide16
Gestures by addressees
Mostly accompanied by deictic utterances (if any)
Explicit verdict usually only on such utterances (otherwise continuing)
Slide17
Cross-timing
Gestural signals
Reflect understanding at that moment
Slide18
Cross-timing
Overlapping signals
Usually not in spoken dialog
Start with “sufficient information”
Slide19
Cross-timing
Projecting
Prediction of following actions/instructions
Slide20
Cross-timing
Initiation time
Waiting for partner to be able to attend to the following utterance
Slide21
Cross-timing
Time uptake
Responses have to be timed exactly to the action and situation
Slide22
Timing strategies
Self-interruption
Dealing with evidence from the addressee
Usually not continued
Slide23
Timing strategies
Collaborative references
Deictic references rely on addressee's actions
Slide24
Visual monitoring
Mainly used when director reaches a problem
Eye gaze as support
Slide25
Conclusion
Grounding is fundamental
Visible workspace enhances grounding speed
In task-oriented dialogs faces are not important
Compensation possible (only if any monitoring is available)
Slide26
Conclusion
Updating common ground
Increments are determined jointly
Much evidence for bilateral account
Addressees provide statement about current understanding
Speakers monitor to update and change utterances
Slide27
Conclusion
Opportunistic process
Offering options
Self-interruptions
Waiting
Instant revision
Multi-modal process
Speech and gestures are combined if possible
Speech alone takes more time
Slide28
Remarks
Gaze only important for certain types of tasks
Measurement of time may be outdated (“old” study)
No contradicting studies (to some extent common sense)
Slide29
Gaze and Turn-Taking Behavior in Casual Conversation Interactions
Kristiina Jokinen, Hirohisa Furukawa, Masafumi Nishida and Seiichi Yamamoto
Slide30
Differences
Three-party dialogue
No instructional task
Stronger focus on eye gaze
Slide31
Research Question
How well can eye gaze help in predicting turn taking?
What is the role of eye gaze when the speaker holds the turn?
Is the role of eye gaze as important in three-party dialogs as in two-party dialogs?
Slide32
Hypothesis
In group discussions, eye gaze is important in turn management (especially in turn-holding cases)
The speaker is more influential than the other partners in coordinating interactions (selects the next speaker)
Slide33
Method
Three-person conversational eye gaze corpus
Natural conversations
Balanced familiarity (50% familiar; 50% unfamiliar)
Balanced gender (male-only; female-only; mixed)
Slide34
Method
28 conversations among Japanese students in their early 20s with three participants each
Each conversation about 10 minutes
Eye gaze recorded for one participant
Slide35
Method
Eye tracker fixed on the table to preserve naturalness
Slide36
Method
Slide37
Used data
Estimated over the last 300 ms of an utterance if followed by a 500 ms pause
Slide38
Used data
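The selection rule from the previous slide (the last 300 ms of an utterance, taken only when a 500 ms pause follows) can be sketched in Python as below. This is a minimal illustration, not the authors' procedure: the `Utterance` type and the `analysis_windows` function are hypothetical names, utterances are assumed not to overlap, and "a 500ms pause" is read as "a pause of at least 500 ms". The final utterance is skipped because no following pause can be measured.

```python
from dataclasses import dataclass
from typing import List, Tuple

WINDOW_MS = 300     # window length at the end of an utterance (from the slide)
MIN_PAUSE_MS = 500  # required pause after the utterance (assumed to be a minimum)


@dataclass
class Utterance:
    """Hypothetical utterance record with times in milliseconds."""
    start_ms: int
    end_ms: int


def analysis_windows(utterances: List[Utterance]) -> List[Tuple[int, int]]:
    """Return (start, end) windows for utterances followed by a long enough pause."""
    ordered = sorted(utterances, key=lambda u: u.start_ms)
    windows = []
    # Pair each utterance with the one that follows it to measure the pause.
    for current, following in zip(ordered, ordered[1:]):
        pause = following.start_ms - current.end_ms
        if pause >= MIN_PAUSE_MS:
            # Clip the window for utterances shorter than 300 ms.
            start = max(current.start_ms, current.end_ms - WINDOW_MS)
            windows.append((start, current.end_ms))
    return windows
```

For example, utterances spanning 0-1000 ms and 1600-2000 ms are separated by a 600 ms pause, so the first one contributes the window from 700 ms to 1000 ms.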
Dialog acts
Speech features
Values of F0, etc.
Eye gaze
Slide39
Results
Slide40
Conclusion
Speaker signals whether he intends to give the turn or hold it by using eye gaze
Fixating listener vs. focusing attention somewhere else
Eye gaze in multi-participant conversations is as important as in two-participant conversations
Slide41
Conclusion
Eye gaze is used to select next speaker (seems to be correct)
Maybe Japanese data interferes with value of speech data
Comparison study?
Listeners focus on speaker, not vice versa
Slide42
Remarks
Vague information and data presentation
Although various data exist, interaction of factors is not presented
Some conclusions rely on the aforementioned point
Setup only takes one participant into consideration
Much of the data was unused
Lack in quality and way of creation
Slide43
Remarks
Study is based on data collected for another study
Setup is not optimal
Realistic design
Yet, contains biasing flaws (situation of the participants, only one eye tracker)
Slide44
Comparison
Clark and Krych present interesting ideas but eye gaze is only rarely handled
How could this be altered?
Jokinen et al. focus on eye gaze in a (more or less) natural situation but fall short in scientific results and setup
What points and ideas of this setup could be beneficial?