SPred: Large-scale Harvesting of Semantic Predicates
1. Tiziano Flati and Roberto Navigli. SPred: Large-scale Harvesting of Semantic Predicates. "Cup of"
2. "Over 2.25 billion cups of coffee are consumed in the world every day"
3. cup of *
5. Objective: cup of *
6. Challenge #1: discovering representative arguments
7. Challenge #2: inferring semantic classes (cup of *)
8. Lexical patterns: "X such as Y" [Hearst '92, Kozareva & Hovy '10, Wu & Weld '10]. Selectional preferences: EAT with MEAT, GAS, FISH, ICE CREAM [Resnik '96, Erk '07, Chambers & Jurafsky '10].
11. Challenge #1: discovering representative arguments. Challenge #2: inferring semantic classes. SPred.
12. Challenge #2: inferring semantic classes. SPred. Contribution #1: capturing concepts for long-tail arguments using a novel wikification procedure.
13. SPred. Contribution #1: capturing concepts for long-tail arguments using a novel wikification procedure. Contribution #2: inferring WordNet semantic classes from a distribution of Wikipedia pages.
14. METHODOLOGY
15. Harvesting arguments from Wikipedia; linking arguments to Wikipedia and WordNet; linking arguments from WordNet to semantic classes. Lexical predicates: cup of *, * was designed by, the biggest * in 1987, a very big *, … Semantic predicates: cup of [Beverage], [Structure] was designed by, the biggest [Event] in 1987, a very big [Phenomenon], …
16. Lexical predicate: cup of *
17. Cup of coffee. Lexical predicates: cup of *, * was designed by, the biggest * in 1987, a very big *, …
18. Filling argument: cup of coffee
19. Cup of coffee. Filling arguments: cup of (red wine, Italy); was designed by (artist, hotel); a very big (dress, bridge); …
20. Semantic predicate: cup of [Beverage]. Example output: [Liquid], [Milk], [Alcohol], [Coffee], [Irish coffee].
21. Cup of Beverage. Semantic predicates: cup of ([Beverage], [Country]); was designed by ([Artist], [Building]); a very big ([Clothing], [Platform]); …
22. Harvesting arguments from Wikipedia; linking arguments to Wikipedia and WordNet; linking arguments from WordNet to semantic classes. Lexical predicates: cup of *, * was designed by, the biggest * in 1987, a very big *, … Lexical predicates with a [CLASS]: cup of [Beverage], [Structure] was designed by, the biggest [Event] in 1987, a very big [Phenomenon], …
23. cup of *
24. cup of *: coffee, tea, Italy, milk, yeast, …
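The harvesting step on this slide (collecting whatever fills the * of a lexical predicate) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `harvest_arguments` is hypothetical, and the sketch assumes that an argument is a single word token, whereas the deck also shows multiword arguments such as "Earl Grey tea".

```python
import re
from collections import Counter


def harvest_arguments(predicate, sentences):
    """Count the words that fill '*' in a lexical predicate such as 'cup of *'.

    Assumption (not from the slides): an argument is exactly one word token.
    """
    if predicate.count("*") != 1:
        raise ValueError("predicate must contain exactly one '*'")
    left, right = predicate.split("*")
    # Build a regex from the fixed parts of the predicate around one captured word.
    pattern = re.compile(
        r"\b" + re.escape(left) + r"(\w+)" + re.escape(right) + r"\b",
        re.IGNORECASE,
    )
    counts = Counter()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    toy_sentences = [
        "He starts his days with a cup of tea.",
        "She ordered a cup of coffee.",
        "Another cup of coffee, please.",
    ]
    print(harvest_arguments("cup of *", toy_sentences).most_common())
```

A real harvester would need phrase boundaries (for example noun-phrase chunking) to capture multiword arguments; the single-token regex is only the simplest stand-in.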
25. Harvesting arguments from Wikipedia; linking arguments to Wikipedia and WordNet; linking arguments from WordNet to semantic classes. Lexical predicates with *: cup of *, * was designed by, the biggest * in 1987, a very big *, … Lexical predicates with a [CLASS]: cup of [Beverage], [Structure] was designed by, the biggest [Event] in 1987, a very big [Phenomenon], …
26. cup of Earl Grey tea
27. Research question #1: How to determine which Wikipedia page best corresponds to an argument? "… and drank over twenty cups of coffee each day…"
28. Wikipedians will occasionally link the arguments for us. For free! Example (William G. McGowan): "He was also a three-pack-a-day smoker and drank over twenty cups of coffee each day until his first heart attack. As leader of MCI, he labored for several years to gain the financing and …"
29. Problem #1: Not many arguments are linked. All instances of 'coffee': 113; linked: 4.
30. Problem #1: Not many arguments are linked. All instances of 'coffee': 113; linked: 4. How to link these instances?
31. 1st heuristic: One sense per page. If the argument text has been linked somewhere else in the article, use that link's page. Example (Health effects of caffeine): "the greatest benefits were observed in those who drank coffee for a long period in their lifetime. […] roughly 80 to 100 cups of coffee for an average adult taken within a limited time…" One occurrence is manually linked, so one sense per page applies.
32. 2nd heuristic: Trust the inventory. If there's only one page for that argument text, link to that page (1 sense only!). Example: "his days in the library with a cup of Earl Grey tea. The main character of the…"
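The two linking heuristics can be sketched as a small lookup routine. This is an illustrative sketch under stated assumptions: the function `link_argument` and both dictionaries are hypothetical stand-ins (one for the manual links found in the current article, one for the inventory of Wikipedia pages per argument text), and applying the heuristics in the order 1st, then 2nd is an assumption.

```python
def link_argument(argument, article_links, inventory):
    """Return a Wikipedia page title for an unlinked argument, or None.

    article_links: dict mapping anchor text -> page title, built from the
        manual links elsewhere in the same article (hypothetical structure).
    inventory: dict mapping argument text -> list of candidate page titles
        (hypothetical stand-in for the Wikipedia sense inventory).
    """
    key = argument.lower()

    # 1st heuristic, one sense per page: reuse a manual link from the same article.
    if key in article_links:
        return article_links[key]

    # 2nd heuristic, trust the inventory: link only if exactly one page exists.
    candidates = inventory.get(key, [])
    if len(candidates) == 1:
        return candidates[0]

    # Ambiguous or unknown arguments stay unlinked in this sketch.
    return None


if __name__ == "__main__":
    article_links = {"coffee": "Coffee"}
    inventory = {
        "earl grey tea": ["Earl Grey tea"],
        "water": ["Water", "Water (classical element)"],  # illustrative candidates
    }
    for arg in ("coffee", "Earl Grey tea", "water"):
        print(arg, "->", link_argument(arg, article_links, inventory))
```

Returning None for ambiguous text keeps the sketch conservative: neither heuristic fires unless the evidence points to a single page.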
33. Problem #2: Same argument linked to multiple pages. All instances of 'water': which page? [Figure residue: "4278", "linked", "linked", "100%"]
34. Research question #2: How to determine which WordNet concepts best represent Wikipedia pages? (cup of *)
35. BabelNet: a mapping from Wikipedia pages to concepts [Navigli & Ponzetto, 2012]. NEs and specialized concepts from Wikipedia; concepts from WordNet; concepts integrated from both resources.
36. Argument mapping. Coffee: "Coffee is a brewed beverage with a distinct aroma and flavor, prepared from the roasted seeds…"
37. Argument mapping. The vast majority of Wikipedia pages [4M+] do not have a corresponding concept in WordNet [117K+].
38. Argument mapping: hypernym extraction. Target lemma: Earl Grey tea. Definitional sentence: "Earl Grey tea is a tea with a distinctive flavour and aroma derived from the addition of oil extracted from the rind of the bergamot orange, a fragrant citrus fruit. Traditionally, the term "Earl Grey"…" Hypernym extracted by WCL [Navigli & Velardi, 2010]: tea. WCL + link: Tea ("Tea is an aromatic beverage commonly prepared by pouring hot or boiling water…").
39. Argument mapping: an example. "In literature, the main character in Haruki Murakami's Kafka on the Shore starts his days in the library with a cup of Earl Grey tea. The main character of the…" Trust the inventory: Earl Grey tea ("Earl Grey tea is a tea with a distinctive flavour and aroma derived from…"). WCL: Tea ("Tea is an aromatic beverage commonly prepared by pouring hot or boiling water…"). BabelNet. We can thus synergistically map to WordNet more than 500K pages!
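The mapping chain in this example (use the BabelNet mapping when a page has one, otherwise follow the linked hypernym that WCL extracts from the page's definitional sentence) can be sketched as below. Everything here is a toy stand-in: the function `map_page_to_concept`, both dictionaries, and the concept labels such as "tea.n.01" are hypothetical, and the sketch assumes a single hypernym hop, as in the Earl Grey tea example.

```python
def map_page_to_concept(page, babelnet, wcl_hypernym_page):
    """Map a Wikipedia page title to a WordNet concept label, or None.

    babelnet: dict mapping page title -> WordNet concept label
        (toy stand-in for the BabelNet page-to-concept mapping).
    wcl_hypernym_page: dict mapping page title -> page title of the linked
        hypernym found in its definitional sentence (toy stand-in for WCL).
    """
    # Direct case: the page already has a WordNet counterpart.
    if page in babelnet:
        return babelnet[page]

    # Fallback: one hop to the hypernym's page, then look that page up.
    hypernym = wcl_hypernym_page.get(page)
    if hypernym is not None:
        return babelnet.get(hypernym)

    return None


if __name__ == "__main__":
    babelnet = {"Tea": "tea.n.01", "Coffee": "coffee.n.01"}  # illustrative labels
    wcl_hypernym_page = {"Earl Grey tea": "Tea"}
    print(map_page_to_concept("Earl Grey tea", babelnet, wcl_hypernym_page))
```

The fallback is what lets long-tail pages without a WordNet counterpart still reach a WordNet concept through their hypernym.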
40. Harvesting arguments from Wikipedia; linking arguments to Wikipedia and WordNet; linking arguments from WordNet to semantic classes. Lexical predicates: cup of *, * was designed by, the biggest * in 1987, a very big *, … Semantic predicates: cup of [Beverage], [Structure] was designed by, the biggest [Event] in 1987, a very big [Phenomenon], …
41. Research question #3: How to generalize WordNet concepts associated with arguments?
42. Generalization to semantic classes. Core concepts: 3K+ most frequent concepts, freely downloadable. [Figure: core concepts of a synset]
43. Generalization to semantic classes. Core concepts: 3K+ most frequent concepts, freely downloadable. [Figure: core concepts of a synset and the semantic class of that synset]
44. Generalization to semantic classes. By repeating the same procedure for all the arguments of a lexical predicate we discover clusters of arguments for each semantic class. Example: "In literature, the main character in Haruki Murakami's Kafka on the Shore starts his days in the library with a cup of Earl Grey tea. The main character of the…" Trust the inventory: Earl Grey tea ("Earl Grey tea is a tea with a distinctive flavour and aroma derived from…"). WCL: Tea ("Tea is an aromatic beverage commonly prepared by pouring hot or boiling water…"). BabelNet. Semantic class.
45. cup of *: classes sorted by frequency! Argument clusters: earl grey tea, tea, …; water, seawater, …; coffee, cappuccino, …; wine, white wine, …
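Clustering arguments by semantic class and sorting the classes by frequency can be sketched as follows. The slides do not spell out how a concept is assigned its class, so this sketch assumes the class is the nearest core concept on the concept's hypernym path, and it measures frequency as the number of arguments per class. The names `semantic_class` and `rank_classes` and the toy hierarchy are hypothetical.

```python
from collections import defaultdict


def semantic_class(concept, hypernym_of, core_concepts):
    """Climb the hypernym chain until a core concept is reached (assumed rule)."""
    seen = set()
    while concept is not None and concept not in seen:
        if concept in core_concepts:
            return concept
        seen.add(concept)  # guard against cycles in toy data
        concept = hypernym_of.get(concept)
    return None


def rank_classes(argument_concepts, hypernym_of, core_concepts):
    """Group (argument, concept) pairs by class; largest cluster first."""
    clusters = defaultdict(list)
    for argument, concept in argument_concepts:
        cls = semantic_class(concept, hypernym_of, core_concepts)
        if cls is not None:
            clusters[cls].append(argument)
    return sorted(clusters.items(), key=lambda item: len(item[1]), reverse=True)


if __name__ == "__main__":
    hypernym_of = {  # toy hierarchy, not WordNet itself
        "earl grey tea": "tea", "tea": "beverage",
        "cappuccino": "coffee", "coffee": "beverage",
        "italy": "country",
    }
    core = {"beverage", "country"}
    args = [("Earl Grey tea", "earl grey tea"), ("tea", "tea"),
            ("cappuccino", "cappuccino"), ("Italy", "italy")]
    for cls, members in rank_classes(args, hypernym_of, core):
        print(cls, members)
```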
46. EVALUATION
47. 1st evaluation: semantic class ranking quality
48. Experimental setup. Dataset 1: 50 random lexical predicates from the Oxford Advanced Learner's Dictionary. Lexical predicate and argument pairs: provide * (minerals); give birth to * (child); publish * (review); build * (suspense); * collide (car); get stuck in * (traffic jam); reduce * (pollution); …
49. Precision @ K: P@K = (# correct among the top K semantic classes) / K. Example ranking by importance: [Wine], [Feeling], [Coffee], [Water], [Dairy product], [Country], …
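As a worked illustration of the measure on this slide, the sketch below computes P@K for a ranked list of semantic classes. The function name `precision_at_k` is hypothetical, and the set of "correct" classes in the usage example is made up for illustration; it is not an evaluation result from the talk.

```python
def precision_at_k(ranked_classes, correct_classes, k):
    """P@K = number of correct classes among the top K, divided by K."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_classes[:k]
    hits = sum(1 for cls in top_k if cls in correct_classes)
    return hits / k


if __name__ == "__main__":
    ranking = ["Wine", "Feeling", "Coffee", "Water", "Dairy product", "Country"]
    # Hypothetical judgments, for illustration only.
    judged_correct = {"Wine", "Coffee", "Water", "Dairy product"}
    for k in (2, 4):
        print(k, precision_at_k(ranking, judged_correct, k))
```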
50. Results for dataset 1. [Plot: Precision@K against K (semantic classes)]
51. Experimental setup. Dataset 2: 24 lexical patterns from Kozareva & Hovy 2010. Lexical predicates: work for *, * work for, fly to *, * fly to, go to *, * go to, * celebrate, * dress, …
52. Results for dataset 2. [Plot: Precision@K against K (semantic classes), with K&H]
53. 2nd evaluation: argument disambiguation quality
54. Experimental setup. Dataset: ~800 lexical predicates sampled from the Oxford Advanced Learner's Dictionary; 3,245 items manually annotated with the most suitable semantic class. Lexical predicate and argument pairs: provide * (minerals); give birth to * (child); publish * (review); build * (suspense); * collide (car); get stuck in * (traffic jam); reduce * (pollution); …
55. Results: performance
56. SPred: a novel approach to large-scale harvesting of semantic predicates. Contributions: novel heuristics for linking arguments; high-performance argument classifier; freely available dataset of semantic predicates.
57. http://lcl.uniroma1.it/spred/ (~1500 predicates)
58. Thanks
59. Tiziano Flati, Linguistic Computing Laboratory, http://lcl.uniroma1.it. Joint work with Roberto Navigli.
60. Argument classification. Given a (new) argument for the lexical predicate, select the semantic class that maximizes the probability mixture of two components: one based on the semantic class ranking from Wikipedia, and one based on the context of occurrences of the argument.
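The selection rule on this slide can be sketched as an argmax over a mixture of two distributions. The slide's formula did not survive extraction, so the linear interpolation below, the weight `alpha`, and the function name `classify_argument` are all assumptions made for illustration, and the probabilities in the usage example are invented.

```python
def classify_argument(classes, p_ranking, p_context, alpha=0.5):
    """Pick the class maximizing alpha * p_ranking + (1 - alpha) * p_context.

    p_ranking: dict class -> probability from the Wikipedia-based class ranking.
    p_context: dict class -> probability from the argument's contexts of occurrence.
    alpha: hypothetical mixing weight in [0, 1]; not given in the slides.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    def mixture(cls):
        # Classes missing from a distribution contribute zero probability.
        return alpha * p_ranking.get(cls, 0.0) + (1.0 - alpha) * p_context.get(cls, 0.0)

    return max(classes, key=mixture)


if __name__ == "__main__":
    classes = ["Beverage", "Country"]
    p_ranking = {"Beverage": 0.7, "Country": 0.3}   # invented numbers
    p_context = {"Beverage": 0.4, "Country": 0.6}   # invented numbers
    print(classify_argument(classes, p_ranking, p_context, alpha=0.5))
```

With a small `alpha` the context-based component dominates, so the same argument can be assigned a different class than the ranking alone would suggest.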
61. Argument linking statistics