Slide1
Open IE to KBP Relations in 3 Hours
Stephen Soderland, John Gilmer, Rob Bart, Oren Etzioni, Daniel S. Weld
Turing Center, University of Washington
11/18/2013
TAC-KBP Workshop
Slide9
Open IE

“Steve Jobs, the co-founder of Apple, died of cancer in his Palo Alto home.”
Arg1, Rel, Arg2
(Steve Jobs, died of, cancer)
(Steve Jobs, died in, his Palo Alto home)
(Steve Jobs, is co-founder of, Apple)

“Hamas denied responsibility for the attacks, which threaten to derail ongoing peace talks.”
Arg1, Rel, Arg2
(Hamas, denied responsibility for, the attacks)
(the attacks, threatened to derail, ongoing peace talks)

“Ribosomes, which are complexes made of ribosomal RNA and protein, are the cellular components that carry out protein synthesis.”
Arg1, Rel, Arg2
(Ribosomes, are complexes made of, ribosomal RNA and protein)
(Ribosomes, are, the cellular components)
(Ribosomes, carry out, protein synthesis)
Slide10
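The (Arg1, Rel, Arg2) tuples on the Open IE slide can be held in a very small data structure. The following is a minimal sketch, not the output format of the Open IE extractor itself; the names `Extraction` and `relations_for` are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Extraction(NamedTuple):
    """One Open IE tuple: two argument phrases joined by a relation phrase."""
    arg1: str
    rel: str
    arg2: str


# Tuples for the Steve Jobs example sentence, copied from the slide.
jobs_extractions = [
    Extraction("Steve Jobs", "died of", "cancer"),
    Extraction("Steve Jobs", "died in", "his Palo Alto home"),
    Extraction("Steve Jobs", "is co-founder of", "Apple"),
]


def relations_for(entity: str, extractions: List[Extraction]) -> List[str]:
    """Return the relation phrase of every tuple whose Arg1 equals `entity`."""
    return [e.rel for e in extractions if e.arg1 == entity]


print(relations_for("Steve Jobs", jobs_extractions))
# prints ['died of', 'died in', 'is co-founder of']
```

Because the relation phrase is free text rather than a fixed label, nothing in this structure ties it to an ontology of relations, which is the property the next slide lists as both an advantage and a disadvantage.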
Advantages of Open IE
- Robust
- Massively scalable
- Works out of the box
- Finds whatever relations are expressed in the text
- Not tied to an ontology of relations

Disadvantages
- Finds whatever relations are expressed in the text
- Not tied to an ontology of relations

Challenge
- Map Open IE to an ontology of relations
- Minimum of user effort

github/knowitall/openie
Slide11
per:cause_of_death:
(Steve Jobs, died of, cancer)
(Steve Jobs, died from, cancer)
(Steve Jobs, passed away from, cancer)
(Steve Jobs, succumbed to, cancer)
(cancer, killed, Steve Jobs)
…
(cancer, claimed the life of, Steve Jobs)
(Steve Jobs, lost his battle to, cancer)
(Steve Jobs, was a victim of, cancer)
(Steve Jobs, could not beat, cancer)
(Steve Jobs, could not have prevented, his death from cancer)
(Steve Jobs, joins the ranks of, cancer fatalities)
…
Head: high frequency
Long tail: low frequency
Slide12
Outline
- Rules to map to target relations
  - Rule language
  - Semantic taggers
- KBP system
  - Architecture
  - 3 hour rule set vs. 12 hour rule set
- Results and discussion
- Future work
Slide13
Desiderata for Target Relation Mapping
- Works even if no annotated training
- User may have limited skill in NLP and ML
- Rules are understandable to user
- High precision and good generalization

Approach:
- Manually created rules based on Open IE tuples
- Simple rule language
- Rules combine lexical and semantic type constraints
- Extensible semantic types based on keyword tagger
Slide14
Rule language

(Smith, was appointed, Acting Director of Acme Corporation)
entity = Smith; slotfill = Acme Corporation

Terms in Rule       Example
Target relation:    per:employee_or_member_of
Query entity in:    Arg1
Slotfill in:        Arg2
Slotfill type:      Organization
Arg1 terms:         -
Relation terms:     appointed
Arg2 terms:         <JobTitle> of
Functional?         no
Slide15
Rule language

(Smith, was appointed, Acting Director of Acme Corporation)
per:employee_or_member_of (Smith, Acme Corporation)

Terms in Rule       Example
Target relation:    per:employee_or_member_of
Query entity in:    Arg1
Slotfill in:        Arg2
Slotfill type:      Organization
Arg1 terms:         -
Relation terms:     appointed
Arg2 terms:         <JobTitle> of
Functional?         no
Slide16
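The rule on the Rule language slides reads naturally as a small matching procedure. The sketch below hard-codes that one rule to show how the lexical constraint ("appointed") and the semantic type constraint (`<JobTitle>` followed by "of") combine. It is an illustration under stated assumptions, not the system's rule applier: the function name `apply_employee_rule` and the toy `JOB_TITLES` set are hypothetical, and the Organization type check on the slotfill is left out.

```python
import re
from typing import Optional, Tuple

# Toy keyword list standing in for the <JobTitle> semantic type (hypothetical).
JOB_TITLES = {"acting director", "director", "president"}


def apply_employee_rule(
    tup: Tuple[str, str, str]
) -> Optional[Tuple[str, str, str]]:
    """Apply the slide's per:employee_or_member_of rule to one (Arg1, Rel, Arg2) tuple."""
    arg1, rel, arg2 = tup
    # Relation terms: the relation phrase must contain the word "appointed".
    if "appointed" not in rel.lower().split():
        return None
    # Arg2 terms: "<JobTitle> of <slotfill>".
    match = re.match(r"(?P<title>.+?) of (?P<fill>.+)$", arg2, flags=re.IGNORECASE)
    if match is None or match.group("title").lower() not in JOB_TITLES:
        return None
    # Query entity is Arg1; the slotfill is what follows "of" in Arg2.
    # A fuller version would also require the slotfill to be typed Organization.
    return ("per:employee_or_member_of", arg1, match.group("fill"))


print(apply_employee_rule(
    ("Smith", "was appointed", "Acting Director of Acme Corporation")
))
# prints ('per:employee_or_member_of', 'Smith', 'Acme Corporation')
```

A tuple such as ("Smith", "was appointed", "critic of Acme Corporation") returns `None` here because "critic" is not in the toy job-title list, which shows how the semantic type constraint limits overgeneralization.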
Semantic Tagging

General types
- Person, Organization, Location, Date
- NER tagger
- WordNet

User-specified types
- Keyword tagger
- User creates file of terms for the semantic type
- Tagger takes file as input
- Used lists from CMU’s NELL for KBP

github/knowitall/taggers
Slide17
Semantic Types from CMU’s NELL
- 4K Job titles: academic coordinator … zonal underwriting manager
- 182 Head job titles: acting chief director … vice-director
- 47 Religions: Adventist … Zoroastrianism
- 114 Nationalities: Akkadian … Zambian
- 5K Cities: Aachen … Zwolle
- 536 State-provinces: Ad Dali … Zlitan
- 241 Countries: Afghanistan … Zimbabwe
Slide18
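The keyword tagger described on the Semantic Tagging slide takes a user-written file of terms and marks where those terms occur in text. A minimal sketch of that idea follows. It is not the API of the taggers project: `load_terms`, `tag_keywords`, and the longest-match-first strategy are assumptions made for illustration, and the four terms are the job-title examples visible on the NELL slide.

```python
from typing import Iterable, List, Set, Tuple


def load_terms(lines: Iterable[str]) -> Set[str]:
    """Read one term per line (for example from an open file), lowercased."""
    return {line.strip().lower() for line in lines if line.strip()}


def tag_keywords(
    text: str, type_name: str, terms: Set[str], max_len: int = 4
) -> List[Tuple[int, int, str]]:
    """Return (start, end, type_name) token spans, preferring the longest match."""
    tokens = text.lower().split()
    spans = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at token i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]) in terms:
                spans.append((i, i + n, type_name))
                i += n
                break
        else:
            i += 1  # no term starts here; move to the next token
    return spans


JOB_TITLE_TERMS = load_terms([
    "academic coordinator",
    "zonal underwriting manager",
    "acting chief director",
    "vice-director",
])

print(tag_keywords("Smith was named academic coordinator", "JobTitle", JOB_TITLE_TERMS))
# prints [(3, 5, 'JobTitle')]
```

Extending the tagger to a new semantic type then only requires another term file, which matches the "extensible semantic types" goal in the desiderata.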
Outline
- Rules to map to target relations
  - Rule language
  - Semantic taggers
- KBP system
  - Architecture
  - 3 hour rule set vs. 12 hour rule set
  - Co-reference
- Results and discussion
- Future work
Slide19
KBP Architecture
[Architecture diagram; recoverable label: 200M tuples]
Slide20
What We Did Not Handle
- Entity disambiguation needed for KBP precision
  - Good extraction for “Paul Gray”, but wrong Paul Gray
- Mostly ignored this in our system
  - Find any tuple that matched entity string
  - Detect ambiguous entities if linked to multiple KB entries
  - Discard all results for ambiguous entities
Slide21
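The fallback described on the "What We Did Not Handle" slide (detect an entity string that links to multiple KB entries, then discard all of its results) can be sketched in a few lines. The function name `filter_ambiguous`, the data layout, and the KB identifiers below are hypothetical placeholders, not values from the system.

```python
from typing import Dict, List, Tuple

Slotfill = Tuple[str, str]  # (target relation, slotfill value)


def filter_ambiguous(
    results: Dict[str, List[Slotfill]],
    kb_links: Dict[str, List[str]],
) -> Dict[str, List[Slotfill]]:
    """Keep results only for entity strings linked to at most one KB entry."""
    return {
        entity: fills
        for entity, fills in results.items()
        if len(kb_links.get(entity, [])) <= 1
    }


results = {
    "Paul Gray": [("per:title", "bassist"), ("per:title", "president")],
    "Steve Jobs": [("per:cause_of_death", "cancer")],
}
# Hypothetical KB identifiers: the string "Paul Gray" matches two entries.
kb_links = {"Paul Gray": ["KB-1", "KB-2"], "Steve Jobs": ["KB-3"]}

print(filter_ambiguous(results, kb_links))
# prints {'Steve Jobs': [('per:cause_of_death', 'cancer')]}
```

This trades recall for precision: every slotfill for an ambiguous name is dropped, including the ones that belong to the intended person.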
Creating Rule Sets

3 Hour Rule Set
- Avg 3 rules per relation
- Light editing of NELL keyword lists
- per:cause_of_death = “died of”, “died from”, “died as a result of”, “died due to”

12 Hour Rule Set (over two week period)
- Avg 16 rules per relation
- Refined rules, testing on 2012 KBP answer key
- Further editing of NELL keyword lists
- per:cause_of_death = “die of”, “dies of”, “dying of”, … “succumbed to”, “succumbs to”, …
Slide22
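The two per:cause_of_death phrase lists on the Creating Rule Sets slide show why the longer rule set reaches further into the long tail of relation phrases. The sketch below compares them on one phrase. It assumes a rule set can be approximated as a mapping from target relation to relation phrases, which ignores the argument and type constraints of the real rules; `map_relation` is a hypothetical helper, and the 12 hour list contains only the phrases visible on the slide, with its elided entries left out.

```python
from typing import Dict, List

THREE_HOUR: Dict[str, List[str]] = {
    "per:cause_of_death": [
        "died of", "died from", "died as a result of", "died due to",
    ],
}

# Only the phrases shown on the slide; entries elided there are omitted here.
TWELVE_HOUR_SHOWN: Dict[str, List[str]] = {
    "per:cause_of_death": [
        "die of", "dies of", "dying of", "succumbed to", "succumbs to",
    ],
}


def map_relation(rel_phrase: str, rule_set: Dict[str, List[str]]) -> List[str]:
    """Return every target relation whose phrase list contains `rel_phrase`."""
    return [target for target, phrases in rule_set.items() if rel_phrase in phrases]


print(map_relation("succumbed to", THREE_HOUR))         # prints []
print(map_relation("succumbed to", TWELVE_HOUR_SHOWN))  # prints ['per:cause_of_death']
```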
Outline
- Rules to map to target relations
  - Rule language
  - Semantic taggers
- KBP system
  - Architecture
  - 3 hour rule set vs. 12 hour rule set
  - Co-reference
- Results and discussion
- Future work
Slide23
KBP Results
Extractor Precision: per:title(Paul Gray, bassist), per:title(Paul Gray, president)
KBP Precision: per:title(Paul Gray, bassist), per:title(Paul Gray, president)
35% recall boost from 12 hours
Slide24
Error Analysis
- 31% “Looked right to me”
  - “Tantawi was the grand sheik” => per:title(Tantawi, sheik)
  - “ETA's political wing Batasuna” => org:subsidiary(ETA, Batasuna)
- 23% Overgeneralized rules
  - “Ginzburg was an outspoken critic” => per:title(Ginzburg, critic)
  - “Meredith led the NFL in scoring” => per:employee_or_member_of(Meredith, NFL)
- 19% Rules matched on non-head terms
  - “Kahn’s younger sister married Shankar” => per:spouse(Kahn, Shankar)
- 15% Open IE errors
- 12% Coref errors
Slide25
Ceiling for Recall from Open IE
- 42% Extracts all information for KBP relation
- 16% Extractor truncates an argument
  - Omits appositive or parenthetical
  - “Sheikh Tantawi, the top Egyptian cleric who died on Wednesday…” (the top Egyptian cleric, died on, Wednesday)
- 10% Extractor misses “relational noun”
  - “Tantawi, the Grand Imam of Al-Azhar”
- 10% No extraction of relevant part of sentence
  - Syntactic complexity
- 4% Extraction error
- 18% Other
68% (callout on the slide; equals 42% + 16% + 10%)
Slide26
Future Work
- Increase recall of Open IE
- Increase precision of rule applier
- General method not tied to KBP task
- Plug in any ontology of relations
- Results not tied to query entity
- Release as open-source software
Slide27
Conclusion

Novel approach for KBP Slot Filling
- Run Open IE extractor on corpus
- Semantic taggers based on user-written keyword lists
- User-written rules to map target relations to Open IE

Results
- High extraction precision 0.80
- Moderate recall 0.10 (comparable to all but top sites)

Low human effort
- Requires no NLP or ML experience
- Only 3 hours effort gives high precision
Slide28
Thank you
github/knowitall/openie
github/knowitall/taggers