Slide1
Deliverable 3
Abdullah Alotayq, Dong Wang, Ed Pham
A Basic Q/A System: Passage Retrieval
Slide2
Outline
- Query Expansion
- Document Ranking
- Passage Retrieval
- Passage Re-ranking
Slide3
Query Expansion
Two different methods:
- Target Concatenation
  - Add the target for each question to the end of the question.
- Deletion/Addition
  - Deletion of wh-words + function words
  - Addition of synonyms and hypernyms (via WordNet)
Slide4
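Target concatenation, the first of the two query expansion methods, can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `concatenate_target` and the sample question and target are hypothetical, and the slides do not say how the original system handled whitespace or missing targets.

```python
def concatenate_target(question: str, target: str) -> str:
    """Append the target of a question series to the end of the question."""
    question = question.strip()
    target = target.strip()
    # Assumption: a question without a target is passed through unchanged.
    if not target:
        return question
    return f"{question} {target}"


# Hypothetical question and target, for illustration only.
print(concatenate_target("When was the company founded?", "Example Corp"))
```

The appeal of this method is that it adds exactly one piece of context, the target, without otherwise widening the query.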
Query Expansion
Deletion

Item                 Freq.
Function words       144
Q-words              7
Low content verbs    30
Question mark        1
                     181
Slide5
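The deletion step of query expansion can be approximated as a stoplist filter. The sketch below is a guess at the general shape, not the authors' code: the slides report how many items each list contains but not the items themselves, so the three small sets here are made-up stand-ins, and `delete_low_content` is a hypothetical name.

```python
import re

# Illustrative stoplists only; the real lists are not given in the slides.
Q_WORDS = {"who", "what", "when", "where", "why", "which", "how"}
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "for", "by", "and"}
LOW_CONTENT_VERBS = {"is", "are", "was", "were", "do", "does", "did", "be", "have", "has"}

STOPLIST = Q_WORDS | FUNCTION_WORDS | LOW_CONTENT_VERBS


def delete_low_content(question: str) -> list[str]:
    """Lowercase the question, drop punctuation (including '?'), and
    remove every token that appears in the stoplist."""
    tokens = re.findall(r"[A-Za-z0-9']+", question.lower())
    return [token for token in tokens if token not in STOPLIST]


print(delete_low_content("When was the Eiffel Tower built?"))
# ['eiffel', 'tower', 'built']
```

Tokenizing with a regular expression removes the question mark as a side effect, which covers the last category in the table without a separate rule.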
Query Expansion
Addition
- Synonyms
- Hypernyms
  - First ancestor
- Morphological variants
- WordNet as thesaurus: wordnet.morphy
Slide6
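The addition step (synonyms, first-ancestor hypernyms, morphological variants) can be sketched without WordNet installed by swapping in small lookup tables. Everything in the tables below is invented for illustration, and `expand_terms` is a hypothetical name. In a real run, the lookups would presumably go through NLTK's WordNet interface (`wordnet.synsets`, `wordnet.morphy`) instead.

```python
# Toy stand-ins for WordNet lookups; all entries are made up.
TOY_SYNONYMS = {"car": ["automobile", "auto"], "film": ["movie"]}
TOY_HYPERNYM = {"car": "vehicle", "film": "show"}  # first ancestor only
TOY_BASE_FORM = {"cars": "car", "films": "film"}   # morphy-style base forms


def expand_terms(terms):
    """Return the original terms plus base forms, synonyms, and the first
    hypernym of each term, preserving order and skipping duplicates."""
    expanded = []
    seen = set()

    def add(term):
        if term not in seen:
            seen.add(term)
            expanded.append(term)

    for term in terms:
        add(term)
        base = TOY_BASE_FORM.get(term, term)
        add(base)
        for synonym in TOY_SYNONYMS.get(base, []):
            add(synonym)
        hypernym = TOY_HYPERNYM.get(base)
        if hypernym is not None:
            add(hypernym)
    return expanded


print(expand_terms(["fast", "cars"]))
# ['fast', 'cars', 'car', 'automobile', 'auto', 'vehicle']
```

Even in this toy case one content word grows into five query terms, which hints at how quickly this kind of expansion can broaden a query.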
Document Retrieval
- Using Indri/Lemur
- Ran both query reformulation/expansion approaches through the software.
- Took the top 50 documents per query.
Slide7
Passage Retrieval
- Used Indri/Lemur
- Took the top passage from each of the top 50 documents for each query.
- Query grammar
  - #combine[passageWIDTH:INC]
  - Default for system: 120 terms, 1000 terms window
Slide8
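The `#combine[passageWIDTH:INC]` grammar for passage retrieval can be filled in programmatically. The helper below is a minimal sketch that only builds the query string; `passage_query` is a hypothetical name, the width and increment in the example are arbitrary, and the sketch assumes the terms have already been stripped of punctuation.

```python
def passage_query(terms, width, inc):
    """Build an Indri passage query of the form
    #combine[passageWIDTH:INC]( term1 term2 ... )."""
    if width <= 0 or inc <= 0:
        raise ValueError("width and inc must be positive")
    if not terms:
        raise ValueError("at least one query term is required")
    return f"#combine[passage{width}:{inc}]( {' '.join(terms)} )"


# Arbitrary example values; not the settings used in the experiments.
print(passage_query(["eiffel", "tower", "built"], 100, 50))
# #combine[passage100:50]( eiffel tower built )
```

Keeping the width and increment as parameters makes it straightforward to rerun the same queries with different window sizes.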
Passage Re-ranking
- Modified the window size
  - 500, 1000 terms
- Modified the number of top passages taken from the top 50 documents:
  - 1, 5, 10, 20, 25 passages
Slide9
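Taking the top N passages from each retrieved document and merging them into one ranked list might look like the sketch below. It assumes each passage arrives as a `(doc_id, text, score)` tuple and that a higher score is better. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def top_n_per_document(passages, n):
    """Keep the n best-scoring passages of each document, then merge
    everything into a single list sorted by descending score."""
    by_doc = defaultdict(list)
    for doc_id, text, score in passages:
        by_doc[doc_id].append((score, text))

    kept = []
    for doc_id, scored in by_doc.items():
        scored.sort(key=lambda pair: pair[0], reverse=True)
        kept.extend((doc_id, text, score) for score, text in scored[:n])

    kept.sort(key=lambda row: row[2], reverse=True)
    return kept


# Made-up passages and scores, for illustration only.
sample = [("d1", "a", 0.9), ("d1", "b", 0.5), ("d1", "c", 0.7), ("d2", "x", 0.8)]
print(top_n_per_document(sample, 2))
# [('d1', 'a', 0.9), ('d2', 'x', 0.8), ('d1', 'c', 0.7)]
```

With N = 1 this reduces to the baseline of one passage per document. Larger N lets a strong document contribute several candidates.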
Evaluation
Document ranking
Note: All results based on TREC-2004

QE Approach              MAP
Target Concatenation     0.3223
Subtraction + WordNet    0.2381
Slide10
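For reference, mean average precision (MAP), the document-ranking metric, can be computed as in the sketch below. It uses the standard definition and assumes each ranked list contains no duplicate document IDs. It is not the evaluation script behind the reported numbers, and the function names are illustrative.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Average of precision@k over the ranks k at which a relevant
    document appears, divided by the total number of relevant documents."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_doc_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Relevant documents at ranks 1 and 3: (1/1 + 2/3) / 2
print(average_precision(["d1", "d2", "d3"], {"d1", "d3"}))
```

Because the sum is divided by the number of relevant documents rather than the number retrieved, relevant documents that never appear in the top 50 still lower the score.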
Evaluation
Passage Retrieval

QE Approach              Type       MRR
Target Concatenation     Strict     0.195439095783
Target Concatenation     Lenient    0.392501775644
Subtraction + WordNet    Strict     0.180698539579
Subtraction + WordNet    Lenient    0.341009813194
Slide11
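The strict and lenient MRR figures can be illustrated with the sketch below. It assumes the usual TREC QA convention, which the slides do not spell out: a lenient match only requires the answer pattern to occur in the passage, while a strict match also requires the passage to come from a document judged to support the answer. The function names and sample data are hypothetical.

```python
import re


def reciprocal_rank(passages, answer_pattern, supporting_docs=None):
    """passages: ranked list of (doc_id, text) pairs.
    Lenient scoring when supporting_docs is None; strict scoring when a
    collection of supporting document IDs is given."""
    regex = re.compile(answer_pattern, re.IGNORECASE)
    for rank, (doc_id, text) in enumerate(passages, start=1):
        if not regex.search(text):
            continue
        if supporting_docs is not None and doc_id not in supporting_docs:
            continue
        return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(reciprocal_ranks):
    """Average the per-question reciprocal ranks."""
    values = list(reciprocal_ranks)
    return sum(values) / len(values) if values else 0.0


# Made-up ranked passages for a single question.
ranked = [("d9", "Opened in Paris in 1889."), ("d1", "It was completed in 1889.")]
print(reciprocal_rank(ranked, r"1889"))          # lenient: 1.0
print(reciprocal_rank(ranked, r"1889", {"d1"}))  # strict: 0.5
```

Under this reading, strict scores can never exceed lenient ones for the same ranking, which is consistent with every row of the results.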
Evaluation
Passage re-ranking: Top N passages

N     Type       MRR
1     Strict     0.117647058824
1     Lenient    0.313725490196
5     Strict     0.183088235294
5     Lenient    0.375408496732
10    Strict     0.190662931839
10    Lenient    0.386188920012
20    Strict     0.193482690336
20    Lenient    0.390536326633
25    Strict     0.194581567305
25    Lenient    0.391579287296
Slide12
Evaluation
Passage Re-ranking: Window Size

Window Size    Type       MRR
1000           Strict     0.195439095783
1000           Lenient    0.392501775644
500            Strict     0.209317276517
500            Lenient    0.383193743722
100            Strict     0.340829170969
100            Lenient    0.48166863823
Slide13
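To make the window-size comparison concrete, here is a minimal sketch of how a fixed-width window with a fixed increment could carve a token sequence into overlapping passages. It is an illustration of the general idea, not a description of how Indri builds passages internally. `sliding_windows`, `width`, and `inc` are illustrative names.

```python
def sliding_windows(tokens, width, inc):
    """Split tokens into windows of at most `width` tokens, starting a new
    window every `inc` tokens, until the end of the sequence is covered."""
    if width <= 0 or inc <= 0:
        raise ValueError("width and inc must be positive")
    if not tokens:
        return []
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + width])
        if start + width >= len(tokens):
            break
        start += inc
    return windows


print(sliding_windows(list(range(10)), 4, 2))
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

A smaller width yields more, tighter passages per document, so a passage that contains the answer carries less unrelated text.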
Conclusions
“Less is Better”… for the most part.
- Query expansion was not beneficial in improving passage retrieval.
- Smaller window size contributed to higher scores.
- Not the case for the top N passages, though
  - Fewer passages resulted in lower scores
  - Mainly because there were fewer passages to work with
Slide14
Issues and Future Improvements
- Run times
  - Poor run times for the “addition/subtraction” query expansion approach
- Too broad of a query
  - Reduce the number of hypernyms/synonyms
- Limited documents
  - Only took the top 50; could have taken more
  - Same with passages
Slide15
Issues and Future Improvements
- Query grammar
  - Change it to assist in passage re-ranking
  - Examples
    - #scorepassage length
    - different weights for different terms
Slide16
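One way the "different weights for different terms" idea could be expressed is with Indri's `#weight` operator, which takes alternating weights and terms. The helper below only builds the query string and is a sketch, not something the slides report having run. `weighted_query` is a hypothetical name and the weights in the example are arbitrary.

```python
def weighted_query(term_weights):
    """Build an Indri query of the form
    #weight( w1 term1 w2 term2 ... ) from (term, weight) pairs."""
    if not term_weights:
        raise ValueError("at least one (term, weight) pair is required")
    parts = [f"{weight:g} {term}" for term, weight in term_weights]
    return "#weight( " + " ".join(parts) + " )"


# Arbitrary weights: favor the original terms over an added synonym.
print(weighted_query([("eiffel", 2.0), ("tower", 2.0), ("structure", 0.5)]))
# #weight( 2 eiffel 2 tower 0.5 structure )
```

Down-weighting added synonyms and hypernyms relative to the original question terms is one possible way to keep an expanded query from becoming too broad.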
Readings
- Query Expansion/Reformulation
  - Kwok, Etzioni, and Weld, 2001
  - Lin, 2007
  - Fang, 2008
  - Aktolga et al., 2011
- Passage Retrieval
  - Tiedemann et al., 2008
  - Indri/Lemur documentation
Slide17
Explorations
- CELEX
  - English, Dutch, German lexical resource
  - Beneficial for adding derivational variants
- Sepia
  - MIT-developed semantic system
  - Semantic parsing for named entities
- Both not available online
- Query Expansion Techniques for Question Answering, by Matthew W. Bilotti