Peng Li, Jing Jiang, Yinglin Wang. Shanghai Jiao Tong University; Singapore Management University. ACL 2010, July 13, 2010.
1. Generating Templates of Entity Summaries with an Entity-Aspect Model and Pattern Mining
   Peng Li, Jing Jiang, Yinglin Wang
   Shanghai Jiao Tong University; Singapore Management University
2. An Example Entity Summary
3. An Example Entity Summary
   - … received … Nobel Prize …
   - His … contributions … include …
   - … published … works …
4. Entity Summaries in the Same Category
5. Entity Summaries in the Same Category
   Aspect          Representative Sentence Patterns
   Education       X received his PhD from ___ University
                   X studied ___ under ___
                   X earned his ___ in physics from University of ___
   Awards          X was awarded the medal in ___
                   X won the ___ award
                   X received the Nobel Prize in physics in ___
   Career          X was ___ director
                   X was the head of ___
                   X worked for ___
   Contributions   X made contributions to ___
                   X is best known for work on ___
                   X is noted for ___
   Our task is to automatically generate such entity summary templates.
6. Why Is It Useful?
   - Better organizes information units
     - Wikipedia infoboxes
   - Provides a structured template for humans to create new entity summaries
   - Facilitates automatic entity summary generation
7. Outline
   - Task and motivation
   - Related work
   - Our approach
     - Sentence clustering
     - Pattern mining
   - Evaluation
   - Conclusions
8. Related Work
   - Filatova et al. (2006), “Automatic creation of domain templates”
     - Patterns must contain a non-auxiliary verb
     - Patterns are not clustered into aspects
     - Pattern slots are identified through heuristics
   - Sauper & Barzilay (2009), “Automatically generating Wikipedia articles”
     - Long, comprehensive articles
   - LDA extensions
     - Chemudugunta et al. (2007)
     - Titov & McDonald (2008)
     - Daumé III & Marcu (2006), Haghighi & Vanderwende (2009)
10. Overview of Our Approach
    [Figure: entity summaries from one category (here, Physicist) go through sentence clustering and then pattern mining. The output is one numbered pattern list per aspect, e.g. "1. ENT won the ? award; 2. ENT …" and "1. ENT received his PhD from ? University; 2. ENT …".]
12. Sentence Clustering
    - To group sentences related to the same aspect together
    - To distinguish between aspect words and entity-specific words
    Example: "… graduated from the University of Chicago" (in the figure, individual words are marked as "aspect word" or "entity-specific word")
13. Motivating Example
    - … Venturi was a professor of physics at the University of Modena …
    - … He was a professor of physics at the University of Chicago until 1982 …
15. Motivating Example (continued)
    Word groups added to the two sentences:
    - Venturi: venturi, modena, …
    - Anderson: chicago, 1982, …
16. Motivating Example (continued)
    - background: physics, physicist, research, …
17. Motivating Example (continued)
    - affiliation: professor, institute, university, …
18. Motivating Example (continued)
    - award: award, won, prize, …
19. Motivating Example (continued)
    [Figure: the word groups are relabeled with model symbols: ψ1, ψ2 for documents D1, D2; φ1, φ2 for aspects A1, A2; φB for background.]
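20. Entity-Aspect Model
    [Plate diagram, partially revealed: D documents, each with ψd; A aspects, each with φa; background φB; observed word w; y ∈ {1, 2, 3} with π; z ∈ {1, …, A} with θ; plate Nd.]
21. Entity-Aspect Model
    [Plate diagram, complete: the same variables, with plates Sd and Nd,s and hyperparameters α, β, γ.]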
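The plate diagram does not survive in this transcript, so the following is only a minimal Python sketch of one plausible generative reading of it, not the authors' exact model. Assumptions: z is drawn once per sentence from θ, y is drawn once per word from π, and the three values of y stand for background, document, and aspect words. The priors α, β, γ are left out because the slide does not say which distribution each one governs. The function names and toy distributions are hypothetical.

```python
import random


def sample_index(probs, rng):
    """Draw an index from a discrete distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_word(dist, rng):
    """Draw a word from a {word: probability} dictionary."""
    vocab = list(dist)
    return rng.choices(vocab, weights=[dist[v] for v in vocab], k=1)[0]


def generate_document(theta, pi, phi_aspects, phi_background, psi_doc,
                      sent_lengths, rng):
    """Generate one document as a list of (aspect, [(word, label), ...]).

    Assumed reading of the diagram: one aspect z per sentence, one switch
    variable y per word (0 = background, 1 = document, 2 = aspect).
    """
    doc = []
    for n_words in sent_lengths:
        z = sample_index(theta, rng)  # aspect of this sentence
        words = []
        for _ in range(n_words):
            y = sample_index(pi, rng)  # which distribution emits the word
            dist = (phi_background, psi_doc, phi_aspects[z])[y]
            words.append((sample_word(dist, rng), "BDA"[y]))
        doc.append((z, words))
    return doc


# Toy distributions, invented for illustration only.
toy_doc = generate_document(
    theta=[0.5, 0.5],
    pi=[0.2, 0.3, 0.5],
    phi_aspects=[{"professor": 0.6, "university": 0.4},
                 {"award": 0.5, "prize": 0.5}],
    phi_background={"physics": 0.7, "research": 0.3},
    psi_doc={"venturi": 0.5, "modena": 0.5},
    sent_lengths=[4, 3],
    rng=random.Random(0),
)
```

Each sentence in `toy_doc` carries a single aspect index, while its words are labeled B, D, or A depending on which distribution produced them.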
22. Model Inference
    - Gibbs sampling
23. Clustered Sentences
    Venturi/D was/S a/S professor/A of/S physics/B at/S the/S University/A of/S Modena/D ./S
    He/S was/S a/S professor/A of/S physics/B at/S the/S University/A of/S Chicago/D until/S 1982/D ./S
    … …
    S: stop word; B: background word; A: aspect word; D: document word
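The token/label notation on this slide is easy to read programmatically. A minimal Python sketch, assuming whitespace-separated `token/LABEL` items exactly as shown above; the function name `parse_tagged` is hypothetical:

```python
# Label legend as given on the slide.
LABELS = {
    "S": "stop word",
    "B": "background word",
    "A": "aspect word",
    "D": "document word",
}


def parse_tagged(sentence):
    """Split 'token/LABEL' items into (token, label) pairs."""
    pairs = []
    for item in sentence.split():
        # rpartition splits on the last slash, so "./S" gives (".", "S").
        token, _, label = item.rpartition("/")
        if not token or label not in LABELS:
            raise ValueError(f"not a token/LABEL item: {item!r}")
        pairs.append((token, label))
    return pairs


tagged = parse_tagged(
    "Venturi/D was/S a/S professor/A of/S physics/B at/S "
    "the/S University/A of/S Modena/D ./S"
)
aspect_words = [tok for tok, label in tagged if label == "A"]
document_words = [tok for tok, label in tagged if label == "D"]
```

For the first sentence on the slide, `aspect_words` holds "professor" and "University", and `document_words` holds "Venturi" and "Modena".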
25. Pattern Mining
    - Use heuristics to locate subject entities
      - Title of the Wikipedia article
      - Top 3 frequent subject noun phrases in the article
    - Generate labeled dependency parse trees
    - Replace document words with “?”
    - Mine frequent subtree patterns
    - Prune patterns
      - Remove patterns with no subject entity
      - Remove patterns with no aspect word
    - Convert subtree patterns to sentence patterns
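The slot-replacement and pruning steps above can be sketched in Python. This is a deliberate simplification, not the method on the slide: it works on flat labeled token sequences rather than dependency subtrees, and it skips frequent-subtree mining entirely. The `ENT` and `?` symbols follow the overview slide; the function names and the set of subject mentions are hypothetical.

```python
def generalize(tagged, subject_mentions):
    """Replace subject mentions with ENT and other document words with '?'."""
    out = []
    for token, label in tagged:
        if token in subject_mentions:
            out.append(("ENT", label))
        elif label == "D":
            out.append(("?", label))
        else:
            out.append((token, label))
    return out


def keep_pattern(pattern):
    """Pruning rule: keep only patterns with a subject entity and an aspect word."""
    has_subject = any(token == "ENT" for token, _ in pattern)
    has_aspect = any(label == "A" for _, label in pattern)
    return has_subject and has_aspect


def to_sentence_pattern(pattern):
    """Join the tokens of a pattern into a readable sentence pattern."""
    return " ".join(token for token, _ in pattern)


# Labeled tokens copied from the "Clustered Sentences" slide.
sentence = [("Venturi", "D"), ("was", "S"), ("a", "S"), ("professor", "A"),
            ("of", "S"), ("physics", "B"), ("at", "S"), ("the", "S"),
            ("University", "A"), ("of", "S"), ("Modena", "D")]

full = generalize(sentence, subject_mentions={"Venturi", "He"})
fragment = full[5:]  # "physics at the University of ?" has no subject entity

kept = [to_sentence_pattern(p) for p in (full, fragment) if keep_pattern(p)]
# kept == ["ENT was a professor of physics at the University of ?"]
```

The fragment is dropped because it contains no subject entity, which mirrors the first pruning rule on the slide.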
26. A Labeled Dependency Parse Tree
27. Sample Aspects and Patterns
    Aspect   Sample Sentence Patterns
    1        X received his PhD from ___ University
             X studied ___ under ___
             X earned his ___ in physics from University of ___
    2        X was awarded the medal in ___
             X won the ___ award
             X received the Nobel Prize in physics in ___
    3        X was ___ director
             X was the head of ___
             X worked for ___
    4        X made contributions to ___
             X is best known for work on ___
             X is noted for ___
29. Data Set
    Wikipedia articles from 5 categories
    Category     Documents   Sentences   Avg Sent/Doc
    US Actress   407         1721        4
    Physicist    697         4238        6
    US CEO       179         1040        5
    US Company   375         2477        6
    Restaurant   152         1195        7
30. Quantitative Evaluation
    - Sentence patterns
      - Manually judged whether each sentence pattern is meaningful for the given entity category
      - Allows us to compute precision, recall, and F1
      - Baseline 1 (BL-1): pattern mining without the entity-aspect model
      - Baseline 2 (BL-2): verb-based pruning (Filatova et al. 2006) on top of BL-1
    - Aspect clusters
      - Manually grouped meaningful patterns into clusters
      - Allows us to compute purity
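The metrics named on this slide can be computed as in the Python sketch below. It uses the standard definitions of precision, recall, F1, and cluster purity. The slide does not say how the reference set for recall was built, so the `meaningful` set here is an assumption, and the pattern names and labels are invented for illustration.

```python
from collections import Counter


def precision_recall_f1(generated, meaningful):
    """Score generated patterns against the set judged meaningful."""
    generated, meaningful = set(generated), set(meaningful)
    hits = len(generated & meaningful)
    precision = hits / len(generated) if generated else 0.0
    recall = hits / len(meaningful) if meaningful else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def purity(predicted, gold):
    """Cluster purity: share of items that carry their cluster's majority gold label.

    predicted maps item -> predicted cluster id; gold maps item -> manual cluster.
    """
    gold_by_cluster = {}
    for item, cluster in predicted.items():
        gold_by_cluster.setdefault(cluster, []).append(gold[item])
    majority_total = sum(Counter(labels).most_common(1)[0][1]
                         for labels in gold_by_cluster.values())
    return majority_total / len(predicted)


# Hypothetical judgments: 2 of 4 generated patterns are meaningful,
# and one meaningful pattern ("p5") was never generated.
p, r, f1 = precision_recall_f1({"p1", "p2", "p3", "p4"}, {"p1", "p2", "p5"})

# Hypothetical clustering of four patterns into two predicted clusters.
score = purity(
    predicted={"p1": 1, "p2": 1, "p3": 2, "p4": 2},
    gold={"p1": "education", "p2": "education", "p3": "awards", "p4": "education"},
)
```

With these toy inputs, precision is 0.5, recall is 2/3, and purity is 0.75.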
31. Quality of Sentence Patterns
    - No consideration of aspect clusters.
    - High precision of BL-1 and BL-2: they use a higher frequency threshold to select frequent sentence patterns.
    - Low recall of BL-2: many meaningful sentence patterns do not contain a non-auxiliary verb.
    - BL-1 and BL-2 do not generate template slots.
32. Quality of Aspect Clusters
33. Sample Aspects and Their Representative Words
34. Conclusions
    - We proposed an LDA-based entity-aspect model to simultaneously cluster sentences and label words
    - We used pattern mining to identify sentence patterns with slots
    - We empirically evaluated our method and showed its effectiveness
35. Future Directions
    - Use linguistic knowledge to further prune sentence patterns and improve their readability
    - Automatic aspect labeling
    - Automatic entity summary generation
36. Thank You!