Slide1
Gerhard Weikum, Max Planck Institute for Informatics
http://www.mpi-inf.mpg.de/~weikum/
From Information to Knowledge: Harvesting Entities, Relationships, and Temporal Facts from Web Sources
Slide2
Acknowledgements
Slide3
Goal: Turn Web into Knowledge Base
comprehensive DB of human knowledge
everything that Wikipedia knows
everything machine-readable
capturing entities, classes, relationships
Source: DB & IR methods for knowledge discovery. Communications of the ACM 52(4), 2009
Slide4
Approach: Harvesting Facts from Web

Politician | Political Party
Angela Merkel | CDU
Karl-Theodor zu Guttenberg | CDU
Christoph Hartmann | FDP
…

Company | CEO
Google | Eric Schmidt
…

Movie | ReportedRevenue
Avatar | $2,718,444,933
The Reader | $108,709,522
…

PoliticalParty | Spokesperson
CDU | Philipp Wachholz
Die Grünen | Claudia Roth
…

Actor | Award
Christoph Waltz | Oscar
Sandra Bullock | Oscar
Sandra Bullock | Golden Raspberry
…

Politician | Position
Angela Merkel | Chancellor, Germany
Karl-Theodor zu Guttenberg | Minister of Defense, Germany
Christoph Hartmann | Minister of Economy, Saarland
…

Company | AcquiredCompany
Google | YouTube
Yahoo | Overture
Facebook | FriendFeed
Software AG | IDS Scheer
…

YAGO-NAGA, IWP, Cyc, TextRunner, ReadTheWeb, WikiTax2WordNet, SUMO
Slide5
Knowledge for Intelligence
entity recognition & disambiguation
understanding natural language & speech
knowledge services & reasoning for semantic apps (e.g. deep QA)
semantic search: precise answers to advanced queries (by scientists, students, journalists, analysts, etc.)
FIFA 2010 finalists who played in a Champions League final?
Politicians who are also scientists?
Enzymes that inhibit HIV?
Influenza drugs for teens with high blood pressure?
German football coach when Bastian Schweinsteiger was born?
Relationships between Manfred Pinkal, Edsger Dijkstra, Michael Dell, and Renee Zellweger?
...
Slide6
Outline...
Automatic KB Construction
Growing & Maintaining the KB
Temporal Knowledge
What and Why
Wrap-up
Slide7
What is Knowledge (in a KB)?
facts / assertions: bornIn (BastianSchweinsteiger, Kolbermoor), hasWon (BastianSchweinsteiger, BronzeFIFAWorldCup2010), playedInFinal (BastianSchweinsteiger, ChampionsLeague2010), …
taxonomic: instanceOf (BastianSchweinsteiger, footballPlayer), subclassOf (footballPlayer, athlete), …
lexical / terminology: means ("Big Apple", NewYorkCity), means ("Apple", AppleComputerCorporation), means ("MS", Microsoft), means ("MS", MultipleSclerosis), …
common-sense properties: apples are green, red, juicy, sweet, sour … but not fast, smart …; balls are round, smooth, slippery … but not square, funny …
common-sense axioms:
∀x: human(x) ⇒ male(x) ∨ female(x)
∀x: (male(x) ⇒ ¬female(x)) ∧ (female(x) ⇒ ¬male(x))
∀x: animal(x) ⇒ (hasLegs(x) ⇒ isEven(numberOfLegs(x)))
…
procedural: how to fix/install/prepare/remove …
epistemic / beliefs: believes (Ptolemy, shape(Earth, disc)), believes (Copernicus, shape(Earth, sphere)), …
Slide8
Tapping on Wikipedia Categories
Slide9
KBs: Example YAGO (Suchanek et al.: WWW'07)
http://www.mpi-inf.mpg.de/yago-naga/
[Figure: excerpt of the YAGO graph. Class nodes: Entity, Person, Location, Organization, Scientist, Physicist, Biologist, Politician, City, Country, State. Entity nodes: Max_Planck, Erwin_Planck, Angela Merkel, Kiel, Schleswig-Holstein, Germany, Nobel Prize, Max_Planck Society. Edge labels: subclass, instanceOf, bornOn, diedOn, bornIn, hasWon, FatherOf, citizenOf, locatedIn, means (with weights such as 0.9 and 0.1). Literals: "Max Planck", "Max Karl Ernst Ludwig Planck", "Angela Merkel", "Angela Dorothea Merkel", Apr 23, 1858, Oct 4, 1947, Oct 23, 1944.]
Accuracy 95%
2 million entities, 200 000 classes
40 million RDF triples (facts) (entity1-relation-entity2, subject-predicate-object)
Slide10
KBs: Example YAGO (F. Suchanek et al.: WWW'07)
http://www.mpi-inf.mpg.de/yago-naga/
Slide11
KBs: Example DBpedia (Auer, Bizer, et al.: ISWC'07)
http://www.dbpedia.org
3 million entities, 1 billion facts (RDF triples)
1.5 million entities mapped to hand-crafted taxonomy of 259 classes with 1200 properties
Slide12
Outline...
Automatic KB Construction
Growing & Maintaining the KB
Temporal Knowledge
What and Why
Wrap-up
Slide13
French Marriage Problem
facts in KB:
married (Hillary, Bill)
married (Carla, Nicolas)
married (Angelina, Brad)
new facts or fact candidates:
married (Cecilia, Nicolas)
married (Carla, Benjamin)
married (Carla, Mick)
married (Michelle, Barack)
married (Yoko, John)
married (Kate, Leonardo)
married (Carla, Sofie)
married (Larry, Google)
for recall: pattern-based harvesting
for precision: consistency reasoning
Slide14
Pattern-Based Harvesting
Facts & Fact Candidates:
(Hillary, Bill)
(Carla, Nicolas)
(Angelina, Brad)
(Victoria, David)
(Yoko, John)
(Carla, Benjamin)
(Larry, Google)
(Kate, Pete)
Patterns:
X and her husband Y
X and Y on their honeymoon
X and Y and their children
X has been dating with Y
X loves Y
…
good for recall
noisy, drifting
not robust enough for high precision
(Hearst 92, Brin 98, Agichtein 00, Etzioni 04, …)
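
The harvesting step on this slide can be illustrated with a minimal sketch. It assumes that patterns are plain regular expressions over capitalized single-word names; the pattern list, the sample sentences, and the function name `harvest` are illustrative assumptions and not part of any system cited here.

```python
import re

# Hypothetical surface patterns, modeled on the slide's examples.
PATTERNS = [
    r"(?P<x>[A-Z]\w+) and her husband (?P<y>[A-Z]\w+)",
    r"(?P<x>[A-Z]\w+) and (?P<y>[A-Z]\w+) on their honeymoon",
    r"(?P<x>[A-Z]\w+) loves (?P<y>[A-Z]\w+)",
]

def harvest(sentences):
    """Return a dict mapping each candidate pair (x, y) to its match count."""
    counts = {}
    for sentence in sentences:
        for pattern in PATTERNS:
            for match in re.finditer(pattern, sentence):
                pair = (match.group("x"), match.group("y"))
                counts[pair] = counts.get(pair, 0) + 1
    return counts

# Made-up input sentences.
sentences = [
    "Hillary and her husband Bill attended the gala.",
    "Carla and Nicolas on their honeymoon were photographed.",
    "Larry loves Google.",  # noisy match: Google is not a person
]
candidates = harvest(sentences)
```

The third sentence shows why patterns alone give recall but not precision: the pair (Larry, Google) is harvested just like the correct ones.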
Slide15
Reasoning about Fact Candidates
Use consistency constraints to prune false candidates
ground atoms:
spouse(Hillary,Bill), spouse(Carla,Nicolas), spouse(Cecilia,Nicolas), spouse(Carla,Ben), spouse(Carla,Mick), spouse(Carla,Sofie)
f(Hillary), f(Carla), f(Cecilia), f(Sofie)
m(Bill), m(Nicolas), m(Ben), m(Mick)
FOL rules (restricted):
spouse(x,y) ∧ diff(y,z) ⇒ ¬spouse(x,z)
spouse(x,y) ∧ diff(w,x) ⇒ ¬spouse(w,y)
spouse(x,y) ⇒ f(x)
spouse(x,y) ⇒ m(y)
spouse(x,y) ⇒ (f(x) ∧ m(y)) ∨ (m(x) ∧ f(y))
Rules reveal inconsistencies
Find consistent subset(s) of atoms ("possible world(s)", "the truth")
Rules can be weighted (e.g. by fraction of ground atoms that satisfy a rule) → uncertain / probabilistic data → compute prob. distr. of subset of atoms being the truth
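
As a rough illustration of how such rules expose inconsistencies, the sketch below checks the slide's ground atoms against the two typing rules and the two functional rules. The function names are hypothetical, and the code only detects violations; choosing which atoms to drop is the reasoning step discussed on the following slides.

```python
from itertools import combinations

# Ground atoms from the slide.
candidates = [("Hillary", "Bill"), ("Carla", "Nicolas"), ("Cecilia", "Nicolas"),
              ("Carla", "Ben"), ("Carla", "Mick"), ("Carla", "Sofie")]
female = {"Hillary", "Carla", "Cecilia", "Sofie"}
male = {"Bill", "Nicolas", "Ben", "Mick"}

def type_violations(cands):
    """Candidates that break spouse(x,y) => f(x) or spouse(x,y) => m(y)."""
    return [(x, y) for x, y in cands if x not in female or y not in male]

def functional_conflicts(cands):
    """Pairs of candidates that cannot both hold under the two functional rules."""
    conflicts = []
    for a, b in combinations(cands, 2):
        same_x = a[0] == b[0] and a[1] != b[1]  # one x with two different spouses
        same_y = a[1] == b[1] and a[0] != b[0]  # one y with two different spouses
        if same_x or same_y:
            conflicts.append((a, b))
    return conflicts
```

On this input, `type_violations` flags only spouse(Carla,Sofie), while `functional_conflicts` returns every pair of mutually exclusive candidates, which is the raw material for finding a consistent subset.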
Slide16
Markov Logic Networks (MLN's) (M. Richardson / P. Domingos 2006)
Map logical constraints & fact candidates into probabilistic graph model: Markov Random Field (MRF)
rules:
s(x,y) ∧ diff(y,z) ⇒ ¬s(x,z)
s(x,y) ∧ diff(w,x) ⇒ ¬s(w,y)
s(x,y) ⇒ m(y)
s(x,y) ⇒ f(x)
f(x) ⇒ ¬m(x)
m(x) ⇒ ¬f(x)
fact candidates:
s(Carla,Nicolas), s(Cecilia,Nicolas), s(Carla,Ben), s(Carla,Sofie), …
Grounding:
¬s(Ca,Nic) ∨ ¬s(Ce,Nic)
¬s(Ca,Nic) ∨ ¬s(Ca,Ben)
¬s(Ca,Nic) ∨ ¬s(Ca,So)
¬s(Ca,Ben) ∨ ¬s(Ca,So)
¬s(Ca,Nic) ∨ m(Nic)
¬s(Ce,Nic) ∨ m(Nic)
¬s(Ca,Ben) ∨ m(Ben)
¬s(Ca,So) ∨ m(So)
Literal → Boolean Var
Literal → binary RV
Slide17
Markov Logic Networks (MLN's) (M. Richardson / P. Domingos 2006)
Map logical constraints & fact candidates into probabilistic graph model: Markov Random Field (MRF)
rules:
s(x,y) ∧ diff(y,z) ⇒ ¬s(x,z)
s(x,y) ∧ diff(w,x) ⇒ ¬s(w,y)
s(x,y) ⇒ m(y)
s(x,y) ⇒ f(x)
f(x) ⇒ ¬m(x)
m(x) ⇒ ¬f(x)
fact candidates:
s(Carla,Nicolas), s(Cecilia,Nicolas), s(Carla,Ben), s(Carla,Sofie), …
[Figure: MRF over the nodes m(Ben), m(Nic), s(Ca,Nic), s(Ce,Nic), s(Ca,Ben), s(Ca,So), m(So)]
RVs coupled by MRF edge if they appear in same clause
MRF assumption: P[X_i | X_1 .. X_n] = P[X_i | N(X_i)]
joint distribution has product form over all cliques
Variety of algorithms for joint inference: Gibbs sampling, other MCMC, belief propagation, randomized MaxSat, …
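
To make the product-form distribution concrete, here is a minimal sketch that enumerates all possible worlds over three of the candidate atoms and scores each world by the exponentiated sum of the weights of its satisfied ground clauses. All clause weights are illustrative assumptions, and exhaustive enumeration stands in for the sampling and propagation algorithms named above; it is only feasible for a handful of atoms.

```python
import math
from itertools import product

atoms = ["s(Ca,Nic)", "s(Ce,Nic)", "s(Ca,Ben)"]

# Weighted ground clauses: (weight, test on a world).
# The weights are made up for illustration, not taken from the slides.
clauses = [
    (2.0, lambda w: not (w["s(Ca,Nic)"] and w["s(Ce,Nic)"])),  # Nic has one spouse
    (2.0, lambda w: not (w["s(Ca,Nic)"] and w["s(Ca,Ben)"])),  # Ca has one spouse
    (1.5, lambda w: w["s(Ca,Nic)"]),  # evidence for each candidate
    (0.5, lambda w: w["s(Ce,Nic)"]),
    (0.5, lambda w: w["s(Ca,Ben)"]),
]

def score(world):
    """Unnormalized log-linear score: exp(sum of weights of satisfied clauses)."""
    return math.exp(sum(weight for weight, clause in clauses if clause(world)))

# Every truth assignment to the atoms is one possible world.
worlds = [dict(zip(atoms, values))
          for values in product([False, True], repeat=len(atoms))]
z = sum(score(w) for w in worlds)  # partition function

def marginal(atom):
    """Probability that the atom is true, summed over all worlds."""
    return sum(score(w) for w in worlds if w[atom]) / z

best_world = max(worlds, key=score)  # most probable world
```

With these weights the most probable world keeps s(Ca,Nic) and rejects the two competing candidates, and the marginals rank the candidates by how well they fit both the evidence and the constraints.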
Slide18
Related Alternative Probabilistic Models
Constrained Conditional Models [D. Roth et al. 2007]:
log-linear classifiers with constraint-violation penalty
mapped into Integer Linear Programs
Factor Graphs with Imperative Variable Coordination [A. McCallum et al. 2008]:
RV's share "factors" (joint feature functions)
generalizes MRF, BN, CRF, …
inference via advanced MCMC
flexible coupling & constraining of RV's
[Figure: factor graph over the nodes m(Ben), m(Nic), s(Ca,Nic), s(Ce,Nic), s(Ca,Ben), s(Ca,So), m(So)]
software tools:
alchemy.cs.washington.edu
code.google.com/p/factorie/
research.microsoft.com/en-us/um/cambridge/projects/infernet/
Slide19
Reasoning for KB Growth: Direct Route
facts in KB:
married (Hillary, Bill)
married (Carla, Nicolas)
married (Angelina, Brad)
new fact candidates:
married (Cecilia, Nicolas)
married (Carla, Benjamin)
married (Carla, Mick)
married (Carla, Sofie)
married (Larry, Google)
+ patterns:
X and her husband Y
X and Y and their children
X has been dating with Y
X loves Y
Direct approach:
1. facts are true; fact candidates & patterns → hypotheses; grounded constraints → clauses with hypotheses as vars
2. type signatures of relations greatly reduce #clauses
3. cast into Weighted Max-Sat with weights from pattern stats → customized approximation algorithm
unifies: fact cand consistency, pattern goodness, entity disambig.
(F. Suchanek et al.: WWW'09)
www.mpi-inf.mpg.de/yago-naga/sofie/
Slide20
Facts & Patterns Consistency with SOFIE
constraints to connect facts, fact candidates, patterns
(F. Suchanek et al.: WWW'09, N. Nakashole et al.: WebDB'10)
www.mpi-inf.mpg.de/yago-naga/sofie/
functional dependencies:
spouse(X,Y): X → Y, Y → X
relation properties:
asymmetry, transitivity, acyclicity, …
type constraints, inclusion dependencies:
spouse ⊆ Person × Person
capitalOfCountry ⊆ cityOfCountry
domain-specific constraints:
bornInYear(x) + 10years ≤ graduatedInYear(x)
hasAdvisor(x,y) ∧ graduatedInYear(x,t) ∧ graduatedInYear(y,s) ⇒ s < t
pattern-fact duality:
occurs(p,x,y) ∧ expresses(p,R) ∧ type(x)=dom(R) ∧ type(y)=rng(R) ⇒ R(x,y)
occurs(p,x,y) ∧ R(x,y) ∧ type(x)=dom(R) ∧ type(y)=rng(R) ⇒ expresses(p,R)
name(-in-context)-to-entity mapping:
means(n,e1) means(n,e2) …
Slide21
Entity Disambiguation Revisited
entity level:
occurs ("divorced from", Madonna, Guy Ritchie) ∧ expresses ("divorced from", wasMarriedTo) ⇒ wasMarriedTo (Madonna, Guy Ritchie)
actually is (word/phrase level):
occurs ("divorced from", "Madonna", "Guy Ritchie") ∧ means ("Madonna", Madonna Louise Ciccone) ∧ expresses ("divorced from", wasMarriedTo) ⇒ wasMarriedTo (Madonna Louise Ciccone, Guy Ritchie) [0.7]
occurs ("divorced from", "Madonna", "Guy Ritchie") ∧ means ("Madonna", Madonna (Edvard Munch)) ∧ expresses ("divorced from", wasMarriedTo) ⇒ wasMarriedTo (Madonna (Edvard Munch), Guy Ritchie) [0.3]
use context-similarity as disambiguation prior → set clause weights accordingly → reduced to normal case
Slide22
Experimental Results
SOFIE (F. Suchanek et al.: WWW'09):
input: biographies of 400 US senators, 3500 HTML files
output: birth/death date & place, politicianOf (state)
run-time: 7 h parsing, 6 h hypotheses, 2 h Max-Sat
precision: 90-95 % (except for death place)
recall: ca. 750 extracted facts (300 politicianOf facts)
PROSPERA (N. Nakashole et al.: WebDB'10):
input: 87 000 Wikipedia articles and Web homepages of scientists
output: hasAdvisor, graduatedAt, hasCollaborator, facultyAt, wonAward
run-time: 1 h total (largely parallelized)
precision: 85-95 %
recall: ca. 4000 extracted facts (400 hasAdvisor facts)
Now running experiments on ClueWeb'09 corpus (500 million English Web pages) with Hadoop cluster of 10x16 cores and 10x48 GB
Slide23
Outline...
Automatic KB Construction
Growing & Maintaining the KB
Temporal Knowledge
What and Why
Wrap-up
Slide24
Temporal Knowledge
Which facts for given relations hold at what time point or during which time intervals?
marriedTo (Madonna, Guy) [ 22Dec2000, Dec2008 ]
capitalOf (Berlin, Germany) [ 1990, now ]
capitalOf (Bonn, Germany) [ 1949, 1989 ]
hasWonPrize (JimGray, TuringAward) [ 1998 ]
graduatedAt (HectorGarcia-Molina, Stanford) [ 1979 ]
graduatedAt (SusanDavidson, Princeton) [ Oct 1982 ]
hasAdvisor (SusanDavidson, HectorGarcia-Molina) [ Oct 1982, forever ]
How can we query & reason on entity-relationship facts in a "time-travel" manner - with uncertain/incomplete KB?
US president when Barack Obama was born?
students of Hector Garcia-Molina while he was at Princeton?
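
A "time-travel" lookup over interval-annotated facts might be sketched as follows. The record type `TFact`, the year-level granularity, and the sentinel `NOW` are assumptions made for illustration; the facts are the ones listed on this slide, and uncertainty is ignored here.

```python
from dataclasses import dataclass

NOW = 9999  # sentinel year standing in for "now" / "forever"

@dataclass(frozen=True)
class TFact:
    relation: str
    subject: str
    obj: str
    begin: int  # first year of validity (inclusive)
    end: int    # last year of validity (inclusive)

kb = [
    TFact("capitalOf", "Bonn", "Germany", 1949, 1989),
    TFact("capitalOf", "Berlin", "Germany", 1990, NOW),
    TFact("hasWonPrize", "JimGray", "TuringAward", 1998, 1998),
]

def holds_at(kb, relation, obj, year):
    """Subjects s such that relation(s, obj) is valid in the given year."""
    return [f.subject for f in kb
            if f.relation == relation and f.obj == obj
            and f.begin <= year <= f.end]
```

For example, `holds_at(kb, "capitalOf", "Germany", 1975)` selects Bonn, while the same query for 1995 selects Berlin. A query such as "US president when Barack Obama was born" would chain two lookups of this kind.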
Slide25
French Marriage Problem
facts in KB:
1: married (Hillary, Bill)
2: married (Carla, Nicolas)
3: married (Angelina, Brad)
new fact candidates:
4: married (Cecilia, Nicolas)
5: married (Carla, Benjamin)
6: married (Carla, Mick)
7: divorced (Madonna, Guy)
8: domPartner (Angelina, Brad)
temporal meta-facts:
validFrom (2, 2008)
validFrom (4, 1996)
validUntil (4, 2007)
validFrom (5, 2010)
validFrom (6, 2006)
validFrom (7, 2008)
Slide26
Challenge: Temporal Knowledge
for all people in Wikipedia (300 000) gather all spouses, incl. divorced & widowed, and corresponding time periods!
>95% accuracy, >95% coverage, in one night
consistency constraints are potentially helpful:
functional dependencies: husband, time → wife
inclusion dependencies: marriedPerson ⊆ adultPerson
age/time/gender restrictions: birthdate + Δ < marriage < divorce
recall: gather temporal scopes for base facts
precision: reason on mutual consistency
Slide27
Difficult Dating
Slide28
(Even More Difficult) Implicit Dating
explicit dates vs. implicit dates relative to other dates
Slide29
(Even More Difficult) Relative Dating
vague dates, relative dates
narrative text, relative order
Slide30
Framework for T-Fact Extraction
(Theobald et al.: MUD'10, Wang et al.: EDBT'10; Zhang et al.: WebDB'08)
1) represent temporal scopes of facts in the presence of incompleteness and uncertainty
2) gather & filter candidates for t-facts: extract base facts R(e1, e2) first; then focus on sentences with e1, e2 and date or temporal phrase
3) aggregate & reconcile evidence from observations
4) reason on joint constraints about facts and time scopes
Slide31
1) Representing T-Fact Evidence
event-style and state-style facts, meta-facts to capture temporal scopes:
1: married(Madonna, Sean), 2: married(Madonna, Guy),
validSince (1, 16-Aug-1985), validUntil (1, 14-Sep-1989),
validSince (2, 22-Dec-2000), validUntil (2, 15-Dec-2008)
3: wonAward(Sean, AcademyAwardForBestActor), validOn (3, 29-Feb-2004)
different resolutions, later refinement:
After 4 years of happy marriage, Madonna and Sean got divorced in September 1989.
1: married(Madonna, Sean),
earliestSince (1, 1-Jan-1985), latestSince (1, 31-Dec-1985),
earliestUntil (1, 1-Sep-1989), latestUntil (1, 30-Sep-1989)
uncertain & inconsistent evidence → confidence distribution
[Figure: confidence distributions over the years 1984 to 1990: a distribution with µ = 1987, σ² = 1, and a histogram with bucket values 0.7, 0.4, 0.1]
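
The four bounds in the Madonna and Sean example can be derived mechanically once the end month and the duration are known. The sketch below is one possible reading: it assumes that "after 4 years" pins the start only to a whole calendar year and that the end month is known exactly. The function name `scope_from_end_month` is hypothetical.

```python
import calendar
from datetime import date

def scope_from_end_month(end_year, end_month, duration_years):
    """Bounds for a state that ended in a given month after about duration_years."""
    last_day = calendar.monthrange(end_year, end_month)[1]  # days in that month
    start_year = end_year - duration_years
    return {
        "earliestSince": date(start_year, 1, 1),
        "latestSince": date(start_year, 12, 31),
        "earliestUntil": date(end_year, end_month, 1),
        "latestUntil": date(end_year, end_month, last_day),
    }

# "After 4 years of happy marriage, ... got divorced in September 1989."
scope = scope_from_end_month(1989, 9, 4)
```

Later evidence with finer resolution, such as an exact wedding date, would then narrow the gap between the earliest and latest bounds.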
Slide32
2) Gather & Filter T-Fact Candidates
Choice of sources:
news-style | biography-style
date in header | many dates in text
relative temp expr's | explicit dates, narrative
simple language | elaborated language
many pronouns | pronouns for main entity
Naive approach: use deep NLP (dependency parser) on every sentence, then use classifier (or structured-output learner) to detect t-facts → too expensive
Bruni met recently divorced president Sarkozy in November 2007 at a dinner party.
She has said she is easily "bored with monogamy" …
A romance is said to have started a few weeks ago between her and Biolay.
Slide33
2) Gather & Filter: Multi-Stage Approach
stage 1: sentences with e1 and e2 from R
match noun phrases against YAGO means relation
use disambiguation prior for entity mentions
stage 2: sentences that contain a temporal expression
use TARSQI tool to extract relative t-expressions and map them to absolute dates or durations
stage 3: sentences where the t-expression refers to R(e1,e2)
run dependency parser: check shortest path connecting e1, e2, verb, t-expr
alternatively, consider only sentences with two noun groups & short surface distances of e1, e2, t-expr
Jim married Sue, but later left her and began an affair with Jane in 2005.
Slide34
3) Aggregate & Reconcile T-Fact Evidence
Ideal input:
Madonna and Sean were married from 16-Aug-85 until 12-Sep-89.
Madonna and Sean married on August 16, 1985.
Madonna and Sean got divorced in September 1989.
Imprecise input:
Madonna and Sean were married from 1985 through 1989.
Madonna and Sean were married four years in the late nineties.
Madonna and Sean got divorced in fall 1989.
Noisy input:
Madonna and Sean plan their wedding in summer 1985.
Madonna and Sean just returned from their honeymoon (in Jan 1986).
Madonna and Sean will be divorced by the end of the year (1989).
The marriage of Madonna and Sean will not survive this year (1987).
[Figure: evidence plotted over time]
Slide35
3) Aggregate & Reconcile T-Fact Evidence
Real input:
…
Madonna and Sean were chased during their honeymoon … (Jan 19, 1986)
Madonna and her husband Sean opened the exhibition … (March 7, 1986)
Madonna and her husband Sean were seen at … (April 1, 1986)
Madonna and Sean met other couples at … (June 22, 1986)
Madonna and Sean plan to have children … (July 4, 1986)
Madonna and Sean would consider adopting a child … (July 14, 1986)
Sean and his wife Madonna purchase another castle in … (November 5, 1986)
...
Madonna and Sean think about getting divorced … (April 21, 1989)
The marriage of Madonna and Sean is in deep crisis … (May 11, 1989)
…
[Figure: evidence plotted over time]
Slide37
3) Aggregate & Reconcile: Solution
[Figure: time evidence grouped into an event histogram (begin), a state histogram (during), and an event histogram (end)]
Classifier for t-fact observations: begin vs. during vs. end
Build separate histogram for each class (and each t-fact)
Combine histograms & derive high-confidence time scope
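
The histogram step can be sketched in a few lines. The observations below are invented, the classifier is assumed to have already labeled them, and taking the most frequent begin and end years is only one simple way to combine the histograms; the slides do not specify the actual combination rule.

```python
from collections import Counter

# (label, year) observations for one t-fact, as a classifier might emit them.
# The data is made up for illustration.
observations = [
    ("begin", 1985), ("begin", 1985), ("begin", 1986),
    ("during", 1986), ("during", 1986), ("during", 1987), ("during", 1988),
    ("end", 1989), ("end", 1989), ("end", 1987),
]

def histograms(obs):
    """Build one year histogram per class (begin / during / end)."""
    hists = {"begin": Counter(), "during": Counter(), "end": Counter()}
    for label, year in obs:
        hists[label][year] += 1
    return hists

def time_scope(hists):
    """Take the most frequent begin year and end year as the time scope."""
    begin = hists["begin"].most_common(1)[0][0]
    end = hists["end"].most_common(1)[0][0]
    return begin, end

hists = histograms(observations)
scope = time_scope(hists)
```

The "during" histogram is not used by this simple rule, but it could serve as a sanity check: most "during" observations should fall inside the derived scope.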
Slide38
4) Joint Reasoning on Facts and T-Facts
Combine & reconcile t-scopes across different facts
constraint: marriedTo (m) is an injective function at any given point in time
∀X, Y, Z, T1, T2: m(X,Y) ∧ m(X,Z) ∧ validTime(m(X,Y),T1) ∧ validTime(m(X,Z),T2) ⇒ ¬overlaps(T1, T2)
after grounding:
m(Carla, Nicolas) ∧ m(Cecilia, Nicolas) ⇒ ¬overlaps([2008,2010], [1996,2007])
m(Carla, Nicolas) ∧ m(Carla, Benjamin) ⇒ ¬overlaps([2008,2010], [2009,2011])
with overlaps evaluated:
¬m(Ca,Nic) ∨ ¬m(Ce,Nic) ∨ ¬false
¬m(Ca,Nic) ∨ ¬m(Ca,Ben) ∨ ¬true
Slide39
4) Joint Reasoning on Facts and T-Facts
[Figure: timeline showing the t-scopes of m(Ca, Ben), m(Ca, Nic), m(Ce, Nic), m(Ca, Mi), m(Ce, Mi)]
Conflict graph:
m(Ca, Ben) [2009,2011]
m(Ca, Nic) [2008,2010]
m(Ce, Nic) [1996,2007]
m(Ca, Mi) [2004,2008]
m(Ce, Mi) [1998,2005]
Find maximal independent set: subset of nodes w/o adjacent pairs, with (evidence-)weighted nodes
Slide40
4) Joint Reasoning on Facts and T-Facts
Conflict graph:
m(Ca, Ben) [2009,2011]
m(Ca, Nic) [2008,2010]
m(Ce, Nic) [1996,2007]
m(Ca, Mi) [2004,2008]
m(Ce, Mi) [1998,2005]
Find maximal independent set: subset of nodes w/o adjacent pairs, with (evidence-)weighted nodes
node weights shown in the figure: 100, 20, 80, 30, 10
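
The conflict-graph step can be illustrated with a small exhaustive search for the heaviest independent set. Two things in this sketch are assumptions: the weights 100, 20, 80, 30, 10 are assigned to the facts in the order they are listed (the extracted slide text does not say which weight belongs to which node), and intervals are treated as closed at year granularity. Exhaustive search is exponential and only serves as an illustration for five nodes.

```python
from itertools import combinations

# (wife, husband, begin, end, weight); the weight-to-fact assignment is assumed.
facts = [
    ("Ca", "Ben", 2009, 2011, 100),
    ("Ca", "Nic", 2008, 2010, 20),
    ("Ce", "Nic", 1996, 2007, 80),
    ("Ca", "Mi", 2004, 2008, 30),
    ("Ce", "Mi", 1998, 2005, 10),
]

def overlaps(a, b):
    """True if the closed year intervals of the two facts share a year."""
    return a[2] <= b[3] and b[2] <= a[3]

def in_conflict(a, b):
    """Two marriages conflict if they share a partner and overlap in time."""
    return (a[0] == b[0] or a[1] == b[1]) and overlaps(a, b)

def best_independent_set(facts):
    """Try every subset and keep the heaviest one without conflicting pairs."""
    best, best_weight = [], 0
    for size in range(1, len(facts) + 1):
        for subset in combinations(facts, size):
            if any(in_conflict(a, b) for a, b in combinations(subset, 2)):
                continue
            weight = sum(f[4] for f in subset)
            if weight > best_weight:
                best, best_weight = list(subset), weight
    return best, best_weight

chosen, total = best_independent_set(facts)
```

Under these assumptions the search keeps m(Ca, Ben), m(Ce, Nic), and m(Ca, Mi), and drops m(Ca, Nic) and m(Ce, Mi), which each conflict with a heavier neighbor.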
Slide41
4) Joint Reasoning on Facts and T-Facts
[Figure: timeline showing the t-scopes of m(Ca, Ben), m(Ca, Nic), m(Ce, Nic), m(Ca, Mi), m(Ce, Mi)]
alternative approach: split t-scopes and reason on consistency of t-fact partitions
Slide42
Preliminary Results
automatic extraction of t-facts about football/soccer from Wikipedia and news articles
query answering by reasoning on t-facts
playsForTeam(X,Z)@T1 ∧ playsForTeam(Y,Z)@T2 ∧ overlaps (T1,T2) ⇒ teammates(X,Y)
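
The teammates rule can be evaluated directly over interval-annotated playsForTeam facts. The sketch below uses placeholder player and team names with invented years, not real career data, and assumes closed intervals at year granularity.

```python
from itertools import combinations

# (player, team, begin_year, end_year); placeholder data for illustration only.
plays_for = [
    ("PlayerA", "TeamX", 2002, 2008),
    ("PlayerB", "TeamX", 2007, 2011),
    ("PlayerC", "TeamX", 2009, 2012),
    ("PlayerD", "TeamY", 2003, 2010),
]

def teammates(facts):
    """Pairs of distinct players whose stints at the same team overlap in time."""
    pairs = set()
    for (p1, t1, b1, e1), (p2, t2, b2, e2) in combinations(facts, 2):
        if p1 != p2 and t1 == t2 and b1 <= e2 and b2 <= e1:
            pairs.add(frozenset((p1, p2)))
    return pairs

result = teammates(plays_for)
```

PlayerA and PlayerC both played for TeamX but never at the same time, so only the pairs (PlayerA, PlayerB) and (PlayerB, PlayerC) are derived.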
Slide43
Outline...
Automatic KB Construction
Growing & Maintaining the KB
Temporal Knowledge
What and Why
Wrap-up
Slide44
KB Building: Where Do We Stand?
Knowledge Bases on:
Entities & Classes
strong success story, some problems left:
large taxonomies of classes with individual entities
long tail calls for new methods
entity disambiguation remains grand challenge
Relationships
good progress, but many challenges left:
recall & precision by patterns & reasoning
efficiency & scalability
soft rules, hard constraints, richer logics, …
open-domain discovery of new relation types
Temporal Knowledge
widely open (fertile) research ground:
uncertain / incomplete temporal scopes of facts
joint reasoning on base-facts and time-scopes
Slide45
Overall Take-Home
Historic opportunity: revive Cyc vision, make it real & large-scale!
KB as enabler of macroscopic "machine reading"
challenging & risky, but high pay-off
Explore & exploit synergies between semantic, statistical, & social Web methods: statistical evidence + logical consistency!
Many interesting research topics for CS (+ CoLi):
efficiency & scalability
constraints & reasoning on uncertain data
NLP for temporal statements
statistical ranking for semantic search
knowledge-base life-cycle: growth & maintenance
Slide46
Thank You!