Slide1
Georg Buscher, Andreas Dengel, Ludger van Elst
German Research Center for AI (DFKI)
Knowledge Management Department
Kaiserslautern, Germany
SIGIR 08
Query Expansion Using Gaze-Based Feedback on the Subdocument Level
Slide2
Outline
Motivation
Reading detection and document annotation technique
Implicit feedback methods
Study design
Results
Slide3
Outline
Motivation
Reading detection and document annotation technique
Implicit feedback methods
Study design
Results
Slide4
Background and Motivation
Relevance feedback à la Rocchio is well understood
Feedback is mostly applied to entire documents
Precision presumably improves when feedback is acquired on the subdocument level
Drawbacks of such fine-grained feedback:
  Too much cognitive load for explicit feedback
  Too little implicit feedback data from explicit interactions (e.g. highlighting)
[Figure: relevance feedback on the document level / relevance feedback on the subdocument level]
Use eye gaze as a source for implicit feedback on the subdocument level
Slide5
Outline
Motivation
Reading detection and document annotation technique
Implicit feedback methods
Study design
Results
Slide6
Eye Tracking
Unobtrusive
Relatively precise (accuracy: 1° of visual angle)
Expensive
Mostly used as a "passive" tool for behavior analysis, e.g. visualized by heatmaps
We use eye tracking for immediate implicit feedback, taking into account temporal fixation patterns
Slide7
Reading Detection
Starting point: Noisy gaze data from the eye tracker.
Fixation detection and saccade classification
Reading (red) and skimming (yellow) detection line by line
See G. Buscher, A. Dengel, L. van Elst: “Eye Movements as Implicit Relevance Feedback”, in CHI '08
Slide8
Gaze-Based Document Meta Data
Store reading information as document annotations in a semantic Wiki
Line-matching by applying optical character recognition
See G. Buscher, A. Dengel, L. van Elst, F. Mittag: “Generating and Using Gaze-Based Document Annotations”, in CHI '08
Slide9
Outline
Motivation
Reading detection and document annotation technique
Implicit feedback methods
Study design
Results
Slide10
Implicit Relevance Feedback for Query Expansion
Input: documents viewed with one specific task in mind
Find terms that best describe the user's current interest.
Use these terms for query expansion.
[Figure labels: task / information need; context; terms describing the user's current interest / context]
Slide11
Three Implicit Feedback Methods to Evaluate
Input: viewed documents
Gaze-Filter: TF x IDF, based on read or skimmed passages
Gaze-Length-Filter: Interest(t) x TF x IDF, based on length of coherently read text
Slide12
Gaze-Length-Filter
$$\mathrm{Interest}(t) = \frac{\#\ \text{long read or skimmed passages containing } t}{\#\ \text{all read or skimmed passages containing } t}$$
Long passages are passages containing at least 230 characters (i.e. more than the following two lines).
The heuristic assumes that shorter text parts only rarely convey sophisticated concepts to the reader.
It further assumes that readers are generally not very interested in the contents of short read or skimmed text parts. Therefore all terms contained in short read or skimmed text parts get a lower interest value.
Slide13
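The Interest(t) ratio of the Gaze-Length-Filter can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `interest_scores`, the whitespace tokenization, and the representation of a passage as a plain string are hypothetical; only the 230-character threshold comes from the slide.

```python
from collections import Counter

LONG_PASSAGE_MIN_CHARS = 230  # threshold stated on the slide


def interest_scores(passages):
    """Return term -> (# long passages containing it) / (# all passages containing it).

    `passages` is a list of strings, each a coherently read or skimmed
    stretch of text. Lowercasing and whitespace splitting are assumptions
    of this sketch, not part of the original method description.
    """
    total = Counter()
    long_count = Counter()
    for text in passages:
        terms = set(text.lower().split())  # count each term once per passage
        total.update(terms)
        if len(text) >= LONG_PASSAGE_MIN_CHARS:
            long_count.update(terms)
    return {term: long_count[term] / total[term] for term in total}


passages = [
    "cats see well at night",              # short passage
    "sharks sense electric fields " * 10,  # long passage, well over 230 characters
    "sharks hunt by smell",                # short passage
]
scores = interest_scores(passages)
print(scores["cats"], scores["sharks"], scores["electric"])
```

A term that occurs only in short passages gets an interest of 0, a term that occurs only in long passages gets 1, and a term such as "sharks" that occurs in one long and one short passage gets 0.5.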
Three Implicit Feedback Methods to Evaluate
Input: viewed documents
Gaze-Filter: TF x IDF, based on read or skimmed passages
Gaze-Length-Filter: Interest(t) x TF x IDF, based on length of coherently read text
Reading Speed: ReadingScore(t) x TF x IDF, based on read vs. skimmed passages containing term t
Slide14
Reading Speed
$$\mathrm{ReadingScore}(t) = \frac{1}{|P_t|} \sum_{p \in P_t} r(p)$$
$P_t$ are all read or skimmed passages containing term t.
The heuristic assumes that more thoroughly read text parts (and therefore their terms) are more likely to be of interest to the user than cursorily viewed parts.
Slide15
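The ReadingScore(t) average of the Reading Speed method can be sketched as follows. The slides do not define r(p), so this sketch assumes the simplest reading of "read vs. skimmed": r(p) = 1 for a read passage and r(p) = 0 for a skimmed one. The function name `reading_scores` and the `(text, was_read)` pair representation are likewise hypothetical.

```python
from collections import defaultdict


def reading_scores(passages):
    """Return term -> mean of r(p) over all passages p containing the term.

    `passages` is a list of (text, was_read) pairs; was_read=False means
    the passage was only skimmed.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for text, was_read in passages:
        r = 1.0 if was_read else 0.0  # ASSUMED definition of r(p)
        for term in set(text.lower().split()):
            sums[term] += r
            counts[term] += 1
    return {term: sums[term] / counts[term] for term in counts}


viewed = [
    ("bats use echolocation", True),    # read
    ("bats sleep upside down", False),  # skimmed
    ("dogs smell well", True),          # read
]
rs = reading_scores(viewed)
print(rs["bats"], rs["dogs"], rs["sleep"])
```

Under this assumption a term seen only in thoroughly read passages scores 1, a term seen only in skimmed passages scores 0, and "bats" (one read, one skimmed passage) scores 0.5.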
Three Implicit Feedback Methods to Evaluate
Input: viewed documents
Baseline: TF x IDF, based on opened entire documents
Gaze-Filter: TF x IDF, based on read or skimmed passages
Gaze-Length-Filter: Interest(t) x TF x IDF, based on length of coherently read text
Reading Speed: ReadingScore(t) x TF x IDF, based on read vs. skimmed passages containing term t
Slide16
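The baseline and the three gaze-based variants differ only in which text is counted and in an optional per-term weight, so a single scoring function can illustrate all four. This is a sketch under assumptions: `top_terms`, its parameters, whitespace tokenization, and the toy IDF table are hypothetical, and the IDF values would in practice come from some background corpus that the slides do not specify.

```python
from collections import Counter


def top_terms(texts, idf, weight=None, k=50):
    """Rank terms by weight(t) * TF(t) * IDF(t) and return the best k.

    texts  : strings to count terms in (entire opened documents for the
             baseline, only read or skimmed passages for the gaze variants)
    idf    : dict term -> inverse document frequency (assumed to be given)
    weight : optional dict term -> Interest(t) or ReadingScore(t);
             None means no extra factor (Baseline and Gaze-Filter)
    """
    tf = Counter(term for text in texts for term in text.lower().split())
    scored = {}
    for term, count in tf.items():
        factor = 1.0 if weight is None else weight.get(term, 0.0)
        scored[term] = factor * count * idf.get(term, 0.0)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scored, key=lambda t: (-scored[t], t))[:k]


idf = {"cats": 1.0, "whiskers": 3.0, "the": 0.1, "history": 2.0}
documents = [
    "the history of cats and the history of breeds",
    "cats sense touch with whiskers",
]
read_passages = ["cats sense touch with whiskers"]  # only this part was read

baseline = top_terms(documents, idf, k=2)
gaze_filter = top_terms(read_passages, idf, k=2)
print(baseline, gaze_filter)
```

In this toy example the baseline picks up "history" from a part of the document the user never read, while the gaze-filtered variant only proposes terms from the perception-related passage.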
Outline
Motivation
Reading detection and document annotation technique
Implicit feedback methods
Study design
Results
Slide17
Study Design
Informational task given
  2 different tasks
  Task description in simulated email
  Participants had to imagine being journalists
Read pre-selected documents
  Email attachments
  Document structure carefully chosen
Search for more information on Wikipedia
  3 different queries: main topic, sub-topic, related topic
Give relevance feedback for the first 20 result entries per query
[Figure: workflow: read about topic in email; look through 4 email attachments to get started with the topic; find more information by querying search engine; give explicit relevance feedback; loop labels: 3x, 2x]
Slide18
Task Example
Topic: perceptual organs of animals
Pre-selected documents: 4 Wikipedia articles about cats, sharks, dogs, bats
  The articles described all facets of the species.
  Each article contained several paragraphs dealing with perception-related issues.
3 different queries
  Main topic query: more material about perception
  Sub-topic query: more material about visual perception
  Related-topic query: perceptual organs for the Earth's magnetic field
Slide19
Result List Generation
Create basic result list
Create expanded queries (+ top 50 terms)
Re-rank that list for every query expansion variant
Merge the re-ranked result lists in a balanced, ordered way
Present merged list to the participant
[Figure: user query and result list; viewed documents; four variations (Baseline, Gaze-Filter, Gaze-Length-Filter, Reading-Speed), each with expanded query 1-4 and re-ranked list 1-4; merged result list presented to the user]
Slide20
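The slides only say that the re-ranked lists are merged "in a balanced, ordered way" without giving the procedure. One plausible reading is a round-robin interleave that skips duplicates and rotates which variant goes first; the sketch below shows that swapped-in technique, not necessarily the one used in the study. The name `merge_balanced` and the default `limit=20` (taken from the 20 rated results per query) are assumptions.

```python
def merge_balanced(ranked_lists, limit=20):
    """Interleave several rankings rank by rank, dropping duplicates.

    ranked_lists : list of rankings, each a list of document ids, best first
    limit        : maximum length of the merged list
    """
    if limit <= 0:
        return []
    merged = []
    seen = set()
    n = len(ranked_lists)
    longest = max((len(r) for r in ranked_lists), default=0)
    for depth in range(longest):
        # Rotate the starting list so that no variant is always favoured.
        for offset in range(n):
            ranking = ranked_lists[(depth + offset) % n]
            if depth < len(ranking) and ranking[depth] not in seen:
                seen.add(ranking[depth])
                merged.append(ranking[depth])
                if len(merged) == limit:
                    return merged
    return merged


re_ranked = [
    ["a", "b", "c"],  # e.g. Baseline
    ["b", "a", "d"],  # e.g. Gaze-Filter
    ["e", "a", "b"],  # e.g. Gaze-Length-Filter
]
merged = merge_balanced(re_ranked)
print(merged)
```

Because every variant contributes its rank-1 result before any variant contributes its rank-2 result, no single expansion method dominates the top of the list the participant rates.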
Outline
Motivation
Reading detection and document annotation technique
Implicit feedback methods
Study design
Results
Slide21
Overview
21 participants
60-80 minutes per participant
111 issued user queries
2220 explicit relevance ratings
[Figure: distribution of the relevance ratings]
Slide22
Precision and Discounted Cumulative Gain (DCG)
Slide23
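For readers unfamiliar with the two measures named on the Precision and DCG slide, a minimal sketch follows. The deck does not state which DCG formulation or rating scale was used; the code assumes the common variant that divides the gain at rank i by log2(i + 1) and treats any rating above 0 as relevant for precision. Function names are hypothetical.

```python
import math


def precision_at_k(ratings, k):
    """Fraction of the top-k results with a rating above 0."""
    return sum(1 for r in ratings[:k] if r > 0) / k


def dcg_at_k(gains, k):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))


ratings = [1, 0, 1, 1, 0]  # explicit ratings of one result list, best rank first
p5 = precision_at_k(ratings, 5)
dcg3 = dcg_at_k(ratings, 3)
print(p5, dcg3)
```

Precision ignores where in the top k the relevant results sit, whereas DCG rewards rankings that place relevant results early.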
Mean Average Precision
Strong improvement of all gaze-based variants over the baseline
Reading-Speed variant is less effective than GF and GLF
GLF might be a bit better than GF?
** : p < 0.01, * : p < 0.05, (*) : p < 0.1 (two-tailed paired t-test)
Slide24
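Mean average precision (MAP), the measure reported on the Mean Average Precision slide, averages the per-query average precision over all queries. The sketch below assumes binary relevance and normalizes by the number of relevant results found in the rated list, since the total number of relevant pages in the collection is not available from the slides; the function names are hypothetical.

```python
def average_precision(relevances):
    """Mean of precision@rank taken at each rank that holds a relevant result."""
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(relevances, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(per_query_relevances):
    """Average the per-query average precision over all queries."""
    if not per_query_relevances:
        return 0.0
    return (sum(average_precision(r) for r in per_query_relevances)
            / len(per_query_relevances))


queries = [
    [1, 0, 1],  # relevant results at ranks 1 and 3
    [0, 1],     # relevant result at rank 2
]
map_score = mean_average_precision(queries)
print(round(map_score, 4))
```

Computing one such value per query and variant yields the paired samples that a two-tailed paired t-test compares.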
Query Type Differentiation
Generally similar trend within each query type
MAP consistently decreases from main topic to sub-topic to related topic queries
  Narrow information needs, especially for related topic queries
  Wikipedia did not contain many relevant pages
MAP of the Baseline decreases much more (-0.25) compared to GF (-0.17) and GLF (-0.18)
Asterisks mark significance of improvement over the baseline
B: Baseline
GF: Gaze-Filter
GLF: Gaze-Length-Filter
RS: Reading-Speed
Slide25
Inappropriate Context
The baseline method extracts terms that might be far away from the user's current topic of interest.
Expanding the query with these terms can lead in a wrong direction that is unpredictable for the user.
The more distant the topic of the user's next query is (i.e. related topic query), the more negative the effect of unsuitable expansion terms.
[Figure labels: pages about animal species; animal perception; parts of animal perception (e.g. only visual and auditory perception); gaze-based methods; animal species; baseline method]
Slide26
Conclusion
Gaze data can effectively be analyzed and used as a source for implicit feedback
Reading behavior detection on its own provides useful information for query expansion and re-ranking
Precision can be improved just by adding those terms to a query that have been read before
Future Work
More realistic web search scenarios (e.g. not only on Wikipedia)
More sophisticated heuristics for interpreting gaze-based feedback
Gaze also for long-term implicit feedback (e.g. desktop search)
Slide27
Interested?
Interested in implicit feedback for personalization?
E.g. scrolling behavior, click-through, mouse movements, eye tracking, EEG, bio sensors, emotions, magic, …
Please let me know!
georg.buscher@dfki.de
Workshop?
Slide28
Thank you for your attention!
Special thanks for the travel grant by
- ACM SIGIR
- Amit Singhal, made in honor of Donald B. Crouch
- Microsoft Research, made in honor of Karen Sparck Jones