Question Answer System
Deliverable #2
Jonggun Park
Haotian He
Maria Antoniak
Ron Lockwood

System architecture
Two modules:
Indexing
Querying
query processing
passage retrieval
answer processing/ranking

Document Indexing/Retrieval
Apache Lucene
Two indices:
Full text (used for idf calculations)
Paragraphs (used for scoring results)

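To illustrate what the full-text index contributes, here is a minimal sketch of an idf table computed from document frequencies. It assumes a smoothed log formula and a small in-memory list of documents standing in for the Lucene index. The function name `idf_table` and the exact formula are illustrative assumptions, since the slides do not state which idf variant was used.

```python
import math
from collections import Counter


def idf_table(documents):
    """Map each term to log(N / (1 + df)) + 1, where df is its document frequency.

    The smoothing is an assumption for illustration, not the deck's formula.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document, regardless of repetitions.
        doc_freq.update(set(text.lower().split()))
    return {term: math.log(n_docs / (1 + df)) + 1.0 for term, df in doc_freq.items()}


# Toy collection standing in for the full-text index.
docs = ["the horse ran", "the giraffe ate", "the horse ate"]
idf = idf_table(docs)
```

Rare terms such as "giraffe" receive a higher weight than terms that occur in every document, such as "the".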
Query processing
Why did Chuck Norris uppercut a horse?
chuck, norris, uppercut, horse
Search

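The reduction from the question to the search terms can be sketched as lowercasing, tokenizing, and dropping stop words. This is a minimal sketch under assumptions: the stop-word set and the name `query_terms` are hypothetical, and the real system may tokenize differently.

```python
import re

# Illustrative stop-word set (an assumption, not the system's actual list).
STOP_WORDS = {"a", "an", "the", "why", "did", "what", "who", "is", "to"}


def query_terms(question):
    """Lowercase the question, keep alphanumeric tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


terms = query_terms("Why did Chuck Norris uppercut a horse?")
# terms == ["chuck", "norris", "uppercut", "horse"]
```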
Query processing
+ POS
+ NER
+ Chunking
+ Stemming

Chuck Norris uppercut a horse to make a giraffe

Answer Extraction/Processing
Initial solution is a redundancy-based strategy
With one big difference
Instead of using web queries for snippets
We are using results (top 100) from a Lucene query

Answer Extraction Details
Input to the Extraction Engine
Query word list
Stop-word list
Focus-word list (e.g. meters, liters, miles, etc.)
Passage list – the paragraph results of the query
N-gram generation and occurrence counting
Filtering out stop words and query words
Combining unigram counts with n-gram counts
Weighting candidates with idf scores
Verifying candidates in documents
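
The counting, filtering, combining, and weighting steps above can be sketched as follows. This is a sketch under stated assumptions, not the system's actual code: the function names, the rule of discarding any n-gram that contains a stop word or query word, the additive way unigram counts are combined, and the mean-idf weight are all illustrative choices. The focus-word list and the final verification step are omitted.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rank_candidates(passages, query_words, stop_words, idf, max_n=3, default_idf=1.0):
    # Step 1: generate n-grams (n = 1..max_n) and count occurrences across passages.
    counts = Counter()
    for passage in passages:
        tokens = passage.lower().split()
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))

    blocked = set(query_words) | set(stop_words)
    scored = []
    for gram, count in counts.items():
        # Step 2: filter out candidates containing a stop word or query word (assumed rule).
        if any(tok in blocked for tok in gram):
            continue
        # Step 3: combine the n-gram count with the unigram counts of its tokens.
        combined = count
        if len(gram) > 1:
            combined += sum(counts[(tok,)] for tok in gram)
        # Step 4: weight by the mean idf of the candidate's tokens (assumed weighting).
        weight = sum(idf.get(tok, default_idf) for tok in gram) / len(gram)
        scored.append((" ".join(gram), combined * weight))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy passages standing in for the paragraph results of a Lucene query.
passages = [
    "mount everest is 8848 meters high",
    "everest rises 8848 meters",
    "the peak is 8848 meters above sea level",
]
ranked = rank_candidates(
    passages,
    query_words={"how", "high", "mount", "everest"},
    stop_words={"is", "the", "above"},
    idf={},  # empty table: every token falls back to default_idf
)
```

On this toy input the bigram "8848 meters" ranks first, because it recurs in all three passages and also inherits the counts of both of its unigrams.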
Lin, J. 2007. An exploration of the principles underlying redundancy-based factoid question answering. Penn Plaza, Suite 701, New York, NY.

D2 Results
Strict = 0.01
Lenient = 0.064
Low results… but improvements are coming!

Future work
NER
Web boosting
Query/answer classification

Thank you!
Questions?