Gerhard Weikum, Max Planck Institute

Uploaded On 2020-06-25


Presentation Transcript

Slide1

Gerhard Weikum, Max Planck Institute for Informatics
http://www.mpi-inf.mpg.de/~weikum/

From Information to Knowledge: Harvesting Entities, Relationships, and Temporal Facts from Web Sources

Slide2

Acknowledgements

Slide3

Goal: Turn Web into Knowledge Base

comprehensive DB of human knowledge
everything that Wikipedia knows
everything machine-readable
capturing entities, classes, relationships

Source: DB & IR methods for knowledge discovery. Communications of the ACM 52(4), 2009

Slide4

Approach: Harvesting Facts from Web

Politician                   Political Party
Angela Merkel                CDU
Karl-Theodor zu Guttenberg   CDU
Christoph Hartmann           FDP

Company                      CEO
Google                       Eric Schmidt

Movie                        ReportedRevenue
Avatar                       $2,718,444,933
The Reader                   $108,709,522

PoliticalParty               Spokesperson
CDU                          Philipp Wachholz
Die Grünen                   Claudia Roth

Actor                        Award
Christoph Waltz              Oscar
Sandra Bullock               Oscar
Sandra Bullock               Golden Raspberry

Politician                   Position
Angela Merkel                Chancellor Germany
Karl-Theodor zu Guttenberg   Minister of Defense Germany
Christoph Hartmann           Minister of Economy Saarland

Company                      AcquiredCompany
Google                       YouTube
Yahoo                        Overture
Facebook                     FriendFeed
Software AG                  IDS Scheer

YAGO-NAGA, IWP, Cyc, TextRunner, ReadTheWeb, WikiTax2WordNet, SUMO

Slide5

Knowledge for Intelligence

entity recognition & disambiguation
understanding natural language & speech
knowledge services & reasoning for semantic apps (e.g. deep QA)
semantic search: precise answers to advanced queries (by scientists, students, journalists, analysts, etc.)

FIFA 2010 finalists who played in a Champions League final?
Politicians who are also scientists?
Enzymes that inhibit HIV?
Influenza drugs for teens with high blood pressure?
...
German football coach when Bastian Schweinsteiger was born?
Relationships between Manfred Pinkal, Edsger Dijkstra, Michael Dell, and Renee Zellweger?

Slide6

Outline...

What and Why
Automatic KB Construction
Growing & Maintaining the KB
Temporal Knowledge
Wrap-up

Slide7

What is Knowledge (in a KB)?

facts / assertions:
bornIn (BastianSchweinsteiger, Kolbermoor), hasWon (BastianSchweinsteiger, BronzeFIFAWorldCup2010), playedInFinal (BastianSchweinsteiger, ChampionsLeague2010), …

taxonomic:
instanceOf (BastianSchweinsteiger, footballPlayer), subclassOf (footballPlayer, athlete), …

lexical / terminology:
means (“Big Apple”, NewYorkCity), means (“Apple”, AppleComputerCorporation), means (“MS”, Microsoft), means (“MS”, MultipleSclerosis) …

common-sense properties:
apples are green, red, juicy, sweet, sour … - but not fast, smart …
balls are round, smooth, slippery … - but not square, funny …

common-sense axioms:
∀x: human(x) ⇒ male(x) ∨ female(x)
∀x: (male(x) ⇒ ¬female(x)) ∧ (female(x) ⇒ ¬male(x))
∀x: animal(x) ⇒ (hasLegs(x) ⇒ isEven(numberOfLegs(x))) …

procedural:
how to fix/install/prepare/remove …

epistemic / beliefs:
believes (Ptolemy, shape(Earth, disc)), believes (Copernicus, shape(Earth, sphere)) …
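The factual and taxonomic assertions on this slide can be held as subject-predicate-object triples. The following is a minimal sketch, assuming a tiny in-memory set of triples; the helper name `classes_of` is hypothetical and not part of YAGO or any other system named in the talk. It derives all classes of an entity by following instanceOf once and subclassOf transitively.

```python
# Illustrative toy triple store; not the data model of any real KB.
FACTS = {
    ("BastianSchweinsteiger", "bornIn", "Kolbermoor"),
    ("BastianSchweinsteiger", "instanceOf", "footballPlayer"),
    ("footballPlayer", "subclassOf", "athlete"),
}


def classes_of(entity, facts):
    """Return the direct classes of an entity plus all their superclasses."""
    found = {o for s, p, o in facts if s == entity and p == "instanceOf"}
    frontier = list(found)
    while frontier:
        cls = frontier.pop()
        for s, p, o in facts:
            if s == cls and p == "subclassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


print(sorted(classes_of("BastianSchweinsteiger", FACTS)))
```

The lexical, common-sense, procedural, and epistemic kinds of knowledge listed on the slide would need richer representations than plain triples and are left out of the sketch.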

Slide8

Tapping on Wikipedia Categories

Slide9

KB's: Example YAGO (Suchanek et al.: WWW'07)
http://www.mpi-inf.mpg.de/yago-naga/

[Figure: excerpt of the YAGO graph. Classes: Entity, Person, Scientist, Physicist, Biologist, Politician, Location, City, Country, State, Organization. Entities: Max_Planck, Erwin_Planck, Angela Merkel, Kiel, Schleswig-Holstein, Germany, Nobel Prize, Max_Planck Society. Edge labels: subclass, instanceOf, bornOn, bornIn, diedOn, hasWon, FatherOf, citizenOf, locatedIn, means. Literals: Apr 23, 1858; Oct 4, 1947; Oct 23, 1944; “Max Planck” (means, with weights 0.9 and 0.1); “Max Karl Ernst Ludwig Planck”; “Angela Merkel”; “Angela Dorothea Merkel”.]

Accuracy 95%
2 million entities, 200 000 classes
40 million RDF triples (facts)
(entity1-relation-entity2, subject-predicate-object)

Slide10

KB's: Example YAGO (F. Suchanek et al.: WWW'07)
http://www.mpi-inf.mpg.de/yago-naga/

Slide11

KB's: Example DBpedia (Auer, Bizer, et al.: ISWC'07)

3 million entities, 1 billion facts (RDF triples)
1.5 million entities mapped to hand-crafted taxonomy of 259 classes with 1200 properties
http://www.dbpedia.org

Slide12

Outline...

What and Why
Automatic KB Construction
Growing & Maintaining the KB
Temporal Knowledge
Wrap-up

Slide13

French Marriage Problem

facts in KB:
married (Hillary, Bill)
married (Carla, Nicolas)
married (Angelina, Brad)

new facts or fact candidates:
married (Cecilia, Nicolas)
married (Carla, Benjamin)
married (Carla, Mick)
married (Michelle, Barack)
married (Yoko, John)
married (Kate, Leonardo)
married (Carla, Sofie)
married (Larry, Google)

for recall: pattern-based harvesting
for precision: consistency reasoning

Slide14

Pattern-Based Harvesting
(Hearst 92, Brin 98, Agichtein 00, Etzioni 04, …)

Facts:
(Hillary, Bill)
(Carla, Nicolas)

Patterns:
X and her husband Y
X and Y on their honeymoon
X and Y and their children
X has been dating with Y
X loves Y

Facts & Fact Candidates:
(Angelina, Brad)
(Hillary, Bill)
(Victoria, David)
(Carla, Nicolas)
(Yoko, John)
(Carla, Benjamin)
(Larry, Google)
(Kate, Pete)

good for recall
noisy, drifting
not robust enough for high precision
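The loop between facts and patterns on this slide can be sketched in a few lines. This is a minimal illustration, assuming a made-up four-sentence corpus and hypothetical helper names (`learn_patterns`, `apply_patterns`); it is not the method of any of the cited systems. Seed facts yield textual patterns, and the patterns yield new fact candidates.

```python
import re

SEED_FACTS = {("Hillary", "Bill"), ("Carla", "Nicolas")}

# Made-up sentences, for illustration only.
CORPUS = [
    "Hillary and her husband Bill attended the dinner.",
    "Carla and her husband Nicolas arrived late.",
    "Angelina and her husband Brad were seen in Paris.",
    "Larry loves Google.",
]


def learn_patterns(corpus, seeds):
    """Collect the text between the two names of every seed pair found in a sentence."""
    patterns = set()
    for sentence in corpus:
        for x, y in seeds:
            match = re.search(re.escape(x) + r"(.+?)" + re.escape(y), sentence)
            if match:
                patterns.add(match.group(1))
    return patterns


def apply_patterns(corpus, patterns):
    """Return every (X, Y) pair whose sentence matches 'X <pattern> Y'."""
    candidates = set()
    for sentence in corpus:
        for infix in patterns:
            match = re.search(r"(\w+)" + re.escape(infix) + r"(\w+)", sentence)
            if match:
                candidates.add((match.group(1), match.group(2)))
    return candidates


patterns = learn_patterns(CORPUS, SEED_FACTS)
new_candidates = apply_patterns(CORPUS, patterns) - SEED_FACTS
print(patterns, new_candidates)
```

Adding ("Larry", "Google") as a seed would make " loves " a pattern, which is the kind of noise and drift the slide warns about.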

Slide15

Reasoning about Fact Candidates

Use consistency constraints to prune false candidates

ground atoms:
spouse(Hillary,Bill), spouse(Carla,Nicolas), spouse(Cecilia,Nicolas), spouse(Carla,Ben), spouse(Carla,Mick), spouse(Carla,Sofie)
f(Hillary), f(Carla), f(Cecilia), f(Sofie), m(Bill), m(Nicolas), m(Ben), m(Mick)

FOL rules (restricted):
spouse(x,y) ∧ diff(y,z) ⇒ ¬spouse(x,z)
spouse(x,y) ∧ diff(w,x) ⇒ ¬spouse(w,y)
spouse(x,y) ⇒ f(x)
spouse(x,y) ⇒ m(y)
spouse(x,y) ⇒ (f(x) ∧ m(y)) ∨ (m(x) ∧ f(y))

Rules reveal inconsistencies
Find consistent subset(s) of atoms (“possible world(s)”, “the truth”)

Rules can be weighted (e.g. by fraction of ground atoms that satisfy a rule)
uncertain / probabilistic data
compute prob. distr. of subset of atoms being the truth
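How the rules reveal inconsistencies can be shown by grounding them over the candidates on this slide. The sketch below is an assumption-laden illustration with hypothetical function names. It checks only the two uniqueness rules and the two gender rules, and it only reports violations; it does not choose a consistent subset.

```python
from itertools import combinations

CANDIDATES = [
    ("Hillary", "Bill"), ("Carla", "Nicolas"), ("Cecilia", "Nicolas"),
    ("Carla", "Ben"), ("Carla", "Mick"), ("Carla", "Sofie"),
]
FEMALE = {"Hillary", "Carla", "Cecilia", "Sofie"}
MALE = {"Bill", "Nicolas", "Ben", "Mick"}


def conflicts(candidates):
    """Pairs of spouse candidates that cannot both hold under the uniqueness rules."""
    found = []
    for (x1, y1), (x2, y2) in combinations(candidates, 2):
        same_first = x1 == x2 and y1 != y2    # spouse(x,y), diff(y,z): not spouse(x,z)
        same_second = y1 == y2 and x1 != x2   # spouse(x,y), diff(w,x): not spouse(w,y)
        if same_first or same_second:
            found.append(((x1, y1), (x2, y2)))
    return found


def violates_gender(candidate):
    """True if spouse(x,y) breaks spouse(x,y) => f(x) or spouse(x,y) => m(y)."""
    x, y = candidate
    return x not in FEMALE or y not in MALE


print(len(conflicts(CANDIDATES)))
print([c for c in CANDIDATES if violates_gender(c)])
```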

Slide16

Markov Logic Networks (MLN's) (M. Richardson / P. Domingos 2006)

Map logical constraints & fact candidates into probabilistic graph model: Markov Random Field (MRF)

s(x,y) ∧ diff(y,z) ⇒ ¬s(x,z)
s(x,y) ∧ diff(w,x) ⇒ ¬s(w,y)
s(x,y) ⇒ f(x)
s(x,y) ⇒ m(y)
f(x) ⇒ ¬m(x)
m(x) ⇒ ¬f(x)

s(Carla,Nicolas), s(Cecilia,Nicolas), s(Carla,Ben), s(Carla,Sofie) …

Grounding: Literal → Boolean Var, Literal → binary RV

s(Ca,Nic) ⇒ ¬s(Ce,Nic)
s(Ca,Nic) ⇒ ¬s(Ca,Ben)
s(Ca,Nic) ⇒ ¬s(Ca,So)
s(Ca,Ben) ⇒ ¬s(Ca,So)
s(Ca,Nic) ⇒ m(Nic)
s(Ce,Nic) ⇒ m(Nic)
s(Ca,Ben) ⇒ m(Ben)
s(Ca,So) ⇒ m(So)

Slide17

Markov Logic Networks (MLN's) (M. Richardson / P. Domingos 2006)

Map logical constraints & fact candidates into probabilistic graph model: Markov Random Field (MRF)

s(x,y) ∧ diff(y,z) ⇒ ¬s(x,z)
s(x,y) ∧ diff(w,x) ⇒ ¬s(w,y)
s(x,y) ⇒ f(x)
s(x,y) ⇒ m(y)
f(x) ⇒ ¬m(x)
m(x) ⇒ ¬f(x)

s(Carla,Nicolas), s(Cecilia,Nicolas), s(Carla,Ben), s(Carla,Sofie) …

[Figure: MRF over the random variables m(Ben), m(Nic), s(Ca,Nic), s(Ce,Nic), s(Ca,Ben), s(Ca,So), m(So)]

RVs coupled by MRF edge if they appear in same clause
MRF assumption: P[X_i | X_1 .. X_n] = P[X_i | N(X_i)]
joint distribution has product form over all cliques

Variety of algorithms for joint inference: Gibbs sampling, other MCMC, belief propagation, randomized MaxSat, …
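The product form of the joint distribution can be made concrete on a two-variable example. The sketch below assumes made-up clause weights and scores each possible world by exp(sum of the weights of its satisfied ground clauses), in the log-linear form commonly used for Markov logic. It enumerates all worlds exhaustively, which only works for toy sizes; the inference algorithms named on the slide (Gibbs sampling, belief propagation, MaxSat) are not implemented here.

```python
import math
from itertools import product

ATOMS = ["s(Ca,Nic)", "s(Ca,Ben)"]

# (weight, clause) pairs; a clause maps a world (dict atom -> bool) to True/False.
# All weights are invented for this illustration.
CLAUSES = [
    (2.0, lambda w: not (w["s(Ca,Nic)"] and w["s(Ca,Ben)"])),  # s(Ca,Nic) => not s(Ca,Ben)
    (1.5, lambda w: w["s(Ca,Nic)"]),                           # evidence for s(Ca,Nic)
    (0.5, lambda w: w["s(Ca,Ben)"]),                           # weaker evidence for s(Ca,Ben)
]


def world_distribution():
    """Enumerate all truth assignments and normalise their exponentiated scores."""
    worlds = [dict(zip(ATOMS, values))
              for values in product([False, True], repeat=len(ATOMS))]
    scores = [math.exp(sum(wt for wt, clause in CLAUSES if clause(w))) for w in worlds]
    z = sum(scores)  # partition function
    return [(w, s / z) for w, s in zip(worlds, scores)]


def marginal(atom, dist):
    """Probability that the atom is true, summed over all worlds."""
    return sum(p for w, p in dist if w[atom])


dist = world_distribution()
for atom in ATOMS:
    print(atom, round(marginal(atom, dist), 3))
```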

Slide18

Related Alternative Probabilistic Models

Constrained Conditional Models [D. Roth et al. 2007]
log-linear classifiers with constraint-violation penalty
mapped into Integer Linear Programs

Factor Graphs with Imperative Variable Coordination [A. McCallum et al. 2008]
RVs share “factors” (joint feature functions)
generalizes MRF, BN, CRF, …
inference via advanced MCMC
flexible coupling & constraining of RVs

[Figure: factor graph over m(Ben), m(Nic), s(Ca,Nic), s(Ce,Nic), s(Ca,Ben), s(Ca,So), m(So)]

software tools:
alchemy.cs.washington.edu
code.google.com/p/factorie/
research.microsoft.com/en-us/um/cambridge/projects/infernet/

Slide19

Reasoning for KB Growth: Direct Route
(F. Suchanek et al.: WWW'09)
www.mpi-inf.mpg.de/yago-naga/sofie/

facts in KB:
married (Hillary, Bill)
married (Carla, Nicolas)
married (Angelina, Brad)

+ new fact candidates:
married (Cecilia, Nicolas)
married (Carla, Benjamin)
married (Carla, Mick)
married (Carla, Sofie)
married (Larry, Google)

+ patterns:
X and her husband Y
X and Y and their children
X has been dating with Y
X loves Y

Direct approach:
1. facts are true; fact candidates & patterns → hypotheses; grounded constraints → clauses with hypotheses as vars
2. type signatures of relations greatly reduce #clauses
3. cast into Weighted Max-Sat with weights from pattern stats; customized approximation algorithm

unifies: fact cand consistency, pattern goodness, entity disambig.
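Step 3 of the direct approach can be illustrated on three hypotheses. The following sketch assumes invented weights and solves the Weighted Max-Sat instance by brute-force enumeration. SOFIE itself is described as using a customized approximation algorithm, which this example does not reproduce. A large weight stands in for clauses that should never be violated.

```python
from itertools import product

VARS = ["m(Carla,Nicolas)", "m(Cecilia,Nicolas)", "m(Carla,Benjamin)"]

HARD = 100.0  # stand-in weight for clauses that should never be violated
# Each clause is (weight, [(variable, required truth value), ...]); it is
# satisfied if at least one literal matches. The soft weights are invented.
CLAUSES = [
    (HARD, [("m(Carla,Nicolas)", True)]),                                  # fact already in KB
    (HARD, [("m(Carla,Nicolas)", False), ("m(Cecilia,Nicolas)", False)]),  # no two wives
    (HARD, [("m(Carla,Nicolas)", False), ("m(Carla,Benjamin)", False)]),   # no two husbands
    (0.8, [("m(Cecilia,Nicolas)", True)]),                                 # pattern evidence
    (0.3, [("m(Carla,Benjamin)", True)]),                                  # pattern evidence
]


def satisfied(literals, assignment):
    return any(assignment[var] == value for var, value in literals)


def weighted_max_sat(variables, clauses):
    """Return the assignment with the highest total weight of satisfied clauses."""
    best, best_weight = None, float("-inf")
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        weight = sum(w for w, literals in clauses if satisfied(literals, assignment))
        if weight > best_weight:
            best, best_weight = assignment, weight
    return best, best_weight


print(weighted_max_sat(VARS, CLAUSES))
```

Without temporal scopes, both new candidates are rejected because each contradicts the existing fact; the temporal part of the talk revisits exactly this limitation.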

Slide20

Facts & Patterns Consistency with SOFIE
(F. Suchanek et al.: WWW'09, N. Nakashole et al.: WebDB'10)
www.mpi-inf.mpg.de/yago-naga/sofie/

constraints to connect facts, fact candidates, patterns

pattern-fact duality:
occurs(p,x,y) ∧ expresses(p,R) ∧ type(x)=dom(R) ∧ type(y)=rng(R) ⇒ R(x,y)
occurs(p,x,y) ∧ R(x,y) ∧ type(x)=dom(R) ∧ type(y)=rng(R) ⇒ expresses(p,R)

name(-in-context)-to-entity mapping:
¬means(n,e1) ∨ ¬means(n,e2) ∨ …

functional dependencies:
spouse(X,Y): X → Y, Y → X

relation properties:
asymmetry, transitivity, acyclicity, …

type constraints, inclusion dependencies:
spouse ⊆ Person × Person
capitalOfCountry ⊆ cityOfCountry

domain-specific constraints:
bornInYear(x) + 10years ≤ graduatedInYear(x)
hasAdvisor(x,y) ∧ graduatedInYear(x,t) ∧ graduatedInYear(y,s) ⇒ s < t

Slide21

Entity Disambiguation Revisited

occurs (“divorced from”, Madonna, Guy Ritchie)
∧ expresses (“divorced from”, wasMarriedTo)
⇒ wasMarriedTo (Madonna, Guy Ritchie)

actually is:

occurs (“divorced from”, “Madonna”, “Guy Ritchie”)
∧ means (“Madonna”, Madonna Louise Ciccone)
∧ expresses (“divorced from”, wasMarriedTo)
⇒ wasMarriedTo (Madonna Louise Ciccone, Guy Ritchie) [0.7]

occurs (“divorced from”, “Madonna”, “Guy Ritchie”)
∧ means (“Madonna”, Madonna (Edvard Munch))
∧ expresses (“divorced from”, wasMarriedTo)
⇒ wasMarriedTo (Madonna (Edvard Munch), Guy Ritchie) [0.3]

entity level
word/phrase level

use context-similarity as disambiguation prior
set clause weights accordingly → reduced to normal case

Slide22

Experimental Results

SOFIE (F. Suchanek et al.: WWW'09)
input: biographies of 400 US senators, 3500 HTML files
output: birth/death date & place, politicianOf (state)
run-time: 7 h parsing, 6 h hypotheses, 2 h Max-Sat
precision: 90-95 % (except for death place)
recall: ca. 750 extracted facts (300 politicianOf facts)

PROSPERA (N. Nakashole et al.: WebDB'10)
input: 87 000 Wikipedia articles and Web homepages of scientists
output: hasAdvisor, graduatedAt, hasCollaborator, facultyAt, wonAward
run-time: 1 h total (largely parallelized)
precision: 85-95 %
recall: ca. 4000 extracted facts (400 hasAdvisor facts)

Now running experiments on ClueWeb'09 corpus (500 million English Web pages) with Hadoop cluster of 10x16 cores and 10x48 GB

Slide23

Outline...

What and Why
Automatic KB Construction
Growing & Maintaining the KB
Temporal Knowledge
Wrap-up

Slide24

Temporal Knowledge

Which facts for given relations hold at what time point or during which time intervals?

marriedTo (Madonna, Guy) [ 22Dec2000, Dec2008 ]
capitalOf (Berlin, Germany) [ 1990, now ]
capitalOf (Bonn, Germany) [ 1949, 1989 ]
hasWonPrize (JimGray, TuringAward) [ 1998 ]
graduatedAt (HectorGarcia-Molina, Stanford) [ 1979 ]
graduatedAt (SusanDavidson, Princeton) [ Oct 1982 ]
hasAdvisor (SusanDavidson, HectorGarcia-Molina) [ Oct 1982, forever ]

How can we query & reason on entity-relationship facts in a “time-travel” manner - with uncertain/incomplete KB?

US president when Barack Obama was born?
students of Hector Garcia-Molina while he was at Princeton?
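A “time-travel” lookup over facts with validity intervals can be sketched as follows. The example assumes year resolution, a sentinel value for open-ended scopes, and hypothetical names (`TFact`, `holds_at`); it ignores the uncertainty and incompleteness that the slide identifies as the real difficulty.

```python
from dataclasses import dataclass

OPEN_END = 9999  # stand-in for "now" / "forever"


@dataclass(frozen=True)
class TFact:
    relation: str
    subject: str
    obj: str
    begin: int  # first year in which the fact holds
    end: int    # last year in which the fact holds


T_FACTS = [
    TFact("capitalOf", "Bonn", "Germany", 1949, 1989),
    TFact("capitalOf", "Berlin", "Germany", 1990, OPEN_END),
    TFact("hasAdvisor", "SusanDavidson", "HectorGarcia-Molina", 1982, OPEN_END),
]


def holds_at(facts, relation, obj, year):
    """Subjects s such that relation(s, obj) holds in the given year."""
    return [f.subject for f in facts
            if f.relation == relation and f.obj == obj and f.begin <= year <= f.end]


print(holds_at(T_FACTS, "capitalOf", "Germany", 1975))
print(holds_at(T_FACTS, "capitalOf", "Germany", 2000))
```

A question such as "US president when Barack Obama was born?" would first look up the birth year and then call a function like `holds_at` with that year.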

Slide25

French Marriage Problem

facts in KB:
1: married (Hillary, Bill)
2: married (Carla, Nicolas)
3: married (Angelina, Brad)

new fact candidates:
4: married (Cecilia, Nicolas)
5: married (Carla, Benjamin)
6: married (Carla, Mick)
7: divorced (Madonna, Guy)
8: domPartner (Angelina, Brad)

validFrom (2, 2008)
validFrom (4, 1996), validUntil (4, 2007)
validFrom (5, 2010)
validFrom (6, 2006)
validFrom (7, 2008)

[Figure: timeline with month axis JAN to DEC]

Slide26

Challenge: Temporal Knowledge

for all people in Wikipedia (300 000) gather all spouses, incl. divorced & widowed, and corresponding time periods!
>95% accuracy, >95% coverage, in one night

consistency constraints are potentially helpful:
functional dependencies: husband, time → wife
inclusion dependencies: marriedPerson ⊆ adultPerson
age/time/gender restrictions: birthdate + Δ < marriage < divorce

recall: gather temporal scopes for base facts
precision: reason on mutual consistency

Slide27

Difficult Dating

Slide28

(Even More Difficult) Implicit Dating

explicit dates vs. implicit dates relative to other dates

Slide29

(Even More Difficult) Relative Dating

vague dates
relative dates
narrative text
relative order

Slide30

Framework for T-Fact Extraction
(Theobald et al.: MUD'10, Wang et al.: EDBT'10; Zhang et al.: WebDB'08)

1) represent temporal scopes of facts in the presence of incompleteness and uncertainty
2) gather & filter candidates for t-facts: extract base facts R(e1, e2) first; then focus on sentences with e1, e2 and date or temporal phrase
3) aggregate & reconcile evidence from observations
4) reason on joint constraints about facts and time scopes

Slide31

1) Representing T-Fact Evidence

event-style and state-style facts
meta-facts to capture temporal scopes:
1: married(Madonna, Sean), 2: married(Madonna, Guy),
validSince (1, 16-Aug-1985), validUntil (1, 14-Sep-1989),
validSince (2, 22-Dec-2000), validUntil (2, 15-Dec-2008)
3: wonAward(Sean, AcademyAwardForBestActor), validOn (3, 29-Feb-2004)

different resolutions, later refinement:
After 4 years of happy marriage, Madonna and Sean got divorced in September 1989.
1: married(Madonna, Sean),
earliestSince (1, 1-Jan-1985), latestSince (1, 31-Dec-1985),
earliestUntil (1, 1-Sep-1989), latestUntil (1, 30-Sep-1989)

uncertain & inconsistent evidence: confidence distribution

[Figure: confidence distributions over the years 1984 to 1990: a bell curve with µ=1987, σ²=1, and a histogram with confidence values 0.7, 0.4, 0.1 over 1984, 1985, 1989, 1990]
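The four bounds (earliestSince, latestSince, earliestUntil, latestUntil) lend themselves to stepwise refinement: coarse evidence gives wide bounds, and later, more precise evidence narrows them. A minimal sketch, assuming a hypothetical `TemporalScope` class and a simple interval-intersection rule that the slides do not spell out:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TemporalScope:
    earliest_since: date
    latest_since: date
    earliest_until: date
    latest_until: date

    def refine(self, other):
        """Intersect both sets of bounds; return None if the evidence contradicts."""
        merged = TemporalScope(
            max(self.earliest_since, other.earliest_since),
            min(self.latest_since, other.latest_since),
            max(self.earliest_until, other.earliest_until),
            min(self.latest_until, other.latest_until),
        )
        if (merged.earliest_since > merged.latest_since
                or merged.earliest_until > merged.latest_until):
            return None
        return merged


# Bounds derived from "After 4 years of happy marriage, ... divorced in September 1989."
coarse = TemporalScope(date(1985, 1, 1), date(1985, 12, 31),
                       date(1989, 9, 1), date(1989, 9, 30))
# Day-level evidence: validSince 16-Aug-1985, validUntil 14-Sep-1989.
exact = TemporalScope(date(1985, 8, 16), date(1985, 8, 16),
                      date(1989, 9, 14), date(1989, 9, 14))

print(coarse.refine(exact))
```

Returning `None` on contradiction is a simplification; the slide's confidence distributions suggest that conflicting evidence would be weighed rather than discarded.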

Slide32

2) Gather & Filter T-Fact Candidates

Choice of sources:
news-style                 biography-style
date in header             many dates in text
relative temp expr's       explicit dates, narrative
simple language            elaborated language
many pronouns              pronouns for main entity

Naive approach:
use deep NLP (dependency parser) on every sentence
then use classifier (or structured-output learner) to detect t-facts
too expensive

Example:
Bruni met recently divorced president Sarkozy in November 2007 at a dinner party. She has said she is easily “bored with monogamy” … A romance is said to have started a few weeks ago between her and Biolay.

Slide33

2) Gather & Filter: Multi-Stage Approach

stage 1: sentences with e1 and e2 from R
match noun phrases against YAGO means relation
use disambiguation prior for entity mentions

stage 2: sentences that contain a temporal expression
use TARSQI tool to extract relative t-expressions and map them to absolute dates or durations

stage 3: sentences where the t-expression refers to R(e1,e2)
run dependency parser: check shortest path connecting e1, e2, verb, t-expr
alternatively, consider only sentences with two noun groups & short surface distances of e1, e2, t-expr

Example:
Jim married Sue, but later left her and began an affair with Jane in 2005.
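The three stages can be mimicked with cheap string operations. The sketch below is a rough stand-in under stated assumptions: entity matching is plain substring search rather than a lookup in the YAGO means relation, the temporal tagger is a four-digit-year regex rather than TARSQI, and stage 3 uses only the surface-distance heuristic with an arbitrary threshold of 40 characters instead of a dependency parse.

```python
import re

YEAR = re.compile(r"\b(1[89]\d\d|20\d\d)\b")  # crude stand-in for a temporal tagger

SENTENCES = [
    "Jim married Sue in 1999.",
    "Jim married Sue, but later left her and began an affair with Jane in 2005.",
    "Jim and Sue met at a party.",
    "Jane moved in 2005.",
]


def stage1(sentences, e1, e2):
    """Keep sentences that mention both entities of the base fact R(e1, e2)."""
    return [s for s in sentences if e1 in s and e2 in s]


def stage2(sentences):
    """Keep sentences that contain a temporal expression (here: a four-digit year)."""
    return [s for s in sentences if YEAR.search(s)]


def stage3(sentences, e1, e2, max_gap=40):
    """Keep sentences whose year lies within max_gap characters of both entities.

    Expects the output of stage2, so every sentence contains a year.
    """
    kept = []
    for s in sentences:
        t = YEAR.search(s).start()
        if abs(t - s.index(e1)) <= max_gap and abs(t - s.index(e2)) <= max_gap:
            kept.append(s)
    return kept


print(stage3(stage2(stage1(SENTENCES, "Jim", "Sue")), "Jim", "Sue"))
```

With these settings the slide's example sentence is dropped for married(Jim, Sue), because "2005" sits far from "Jim"; a real system would need the dependency path to see that the date belongs to the affair.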

Slide34

3) Aggregate & Reconcile T-Fact Evidence

Ideal input:
Madonna and Sean were married from 16-Aug-85 until 12-Sep-89.
Madonna and Sean married on August 16, 1985.
Madonna and Sean got divorced in September 1989.

Imprecise input:
Madonna and Sean were married from 1985 through 1989.
Madonna and Sean were married four years in the late nineties.
Madonna and Sean got divorced in fall 1989.

Noisy input:
Madonna and Sean plan their wedding in summer 1985.
Madonna and Sean just returned from their honeymoon (in Jan 1986).
Madonna and Sean will be divorced by the end of the year (1989).
The marriage of Madonna and Sean will not survive this year (1987).

[Figure: evidence plotted over time]

Slide35

3) Aggregate & Reconcile T-Fact Evidence

Real input:
…
Madonna and Sean were chased during their honeymoon (Jan 19, 1986)
Madonna and her husband Sean opened the exhibition … (March 7, 1986)
Madonna and her husband Sean were seen at … (April 1, 1986)
Madonna and Sean met other couples at (June 22, 1986)
Madonna and Sean plan to have children (July 4, 1986)
Madonna and Sean would consider adopting a child … (July 14, 1986)
Sean and his wife Madonna purchase another castle in … (November 5, 1986)
...
Madonna and Sean think about getting divorced … (April 21, 1989)
The marriage of Madonna and Sean is in deep crisis … (May 11, 1989)
…

[Figure: evidence plotted over time]


Slide37

3) Aggregate & Reconcile: Solution

Classifier for t-fact observations: begin vs. during vs. end
Build separate histogram for each class (and each t-fact)
Combine histograms & derive high-confidence time scope

[Figure: time evidence split into an event histogram (begin), a state histogram (during), and an event histogram (end)]
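The histogram idea can be sketched with counters. The example below assumes invented, already-classified observations at year resolution and one very simple way of combining the histograms (take the most frequent begin and end year, then check how much of the "during" evidence falls inside). The slides do not specify the actual combination method, so this is one possible reading only.

```python
from collections import Counter, defaultdict

# Invented (label, year) observations for married(Madonna, Sean); in the
# approach described above the labels would come from a classifier.
OBSERVATIONS = [
    ("begin", 1985), ("begin", 1985), ("begin", 1986),
    ("during", 1986), ("during", 1986), ("during", 1987), ("during", 1989),
    ("end", 1989), ("end", 1989), ("end", 1987),
]


def build_histograms(observations):
    """One year histogram per observation class."""
    histograms = defaultdict(Counter)
    for label, year in observations:
        histograms[label][year] += 1
    return histograms


def derive_scope(histograms):
    """Most frequent begin and end year, plus the share of 'during' evidence inside."""
    begin = histograms["begin"].most_common(1)[0][0]
    end = histograms["end"].most_common(1)[0][0]
    during = histograms["during"]
    inside = sum(n for year, n in during.items() if begin <= year <= end)
    support = inside / sum(during.values()) if during else 0.0
    return begin, end, support


print(derive_scope(build_histograms(OBSERVATIONS)))
```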

Slide38

4) Joint Reasoning on Facts and T-Facts

Combine & reconcile t-scopes across different facts

constraint: marriedTo (m) is an injective function at any given point
∀ X, Y, Z, T1, T2: m(X,Y) ∧ m(X,Z) ∧ validTime(m(X,Y),T1) ∧ validTime(m(X,Z),T2) ⇒ ¬overlaps (T1, T2)

after grounding:
m(Carla, Nicolas) ∧ m(Cecilia, Nicolas) ⇒ ¬overlaps ([2008,2010], [1996,2007])
m(Carla, Nicolas) ∧ m(Carla, Benjamin) ⇒ ¬overlaps ([2008,2010], [2009,2011])

which evaluates to:
m(Ca,Nic) ∧ m(Ce,Nic) ⇒ ¬false
m(Ca,Nic) ∧ m(Ca,Ben) ⇒ ¬true

Slide39

4) Joint Reasoning on Facts and T-Facts

Conflict graph:
m(Ca, Ben) [2009,2011]
m(Ca, Nic) [2008,2010]
m(Ce, Nic) [1996,2007]
m(Ca, Mi) [2004,2008]
m(Ce, Mi) [1998,2005]

Find maximal independent set: subset of nodes w/o adjacent pairs, with (evidence-)weighted nodes

[Figure: timeline showing the t-scopes of m(Ca, Ben), m(Ca, Nic), m(Ce, Nic), m(Ca, Mi), m(Ce, Mi)]

Slide40

4) Joint Reasoning on Facts and T-Facts

Conflict graph:
m(Ca, Ben) [2009,2011]
m(Ca, Nic) [2008,2010]
m(Ce, Nic) [1996,2007]
m(Ca, Mi) [2004,2008]
m(Ce, Mi) [1998,2005]

Find maximal independent set: subset of nodes w/o adjacent pairs, with (evidence-)weighted nodes

node weights shown in the figure: 100, 20, 80, 30, 10

[Figure: timeline showing the t-scopes of m(Ca, Ben), m(Ca, Nic), m(Ce, Nic), m(Ca, Mi), m(Ce, Mi)]
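The conflict graph and the independent-set step can be worked through by brute force. In the sketch below, two marriage facts conflict if they share a person in the same role and their closed year intervals overlap. The assignment of the weights 100, 20, 80, 30, 10 to particular nodes is an assumption (they are taken in the order the nodes are listed), since the transcript does not show which weight belongs to which node. Exhaustive search is used only because the graph has five nodes.

```python
from itertools import combinations

# node -> (t-scope, weight); the weight-to-node mapping is assumed, see above.
NODES = {
    ("Ca", "Ben"): ((2009, 2011), 100),
    ("Ca", "Nic"): ((2008, 2010), 20),
    ("Ce", "Nic"): ((1996, 2007), 80),
    ("Ca", "Mi"): ((2004, 2008), 30),
    ("Ce", "Mi"): ((1998, 2005), 10),
}


def in_conflict(a, b):
    """Edge of the conflict graph: shared person and overlapping t-scopes."""
    (a_scope, _), (b_scope, _) = NODES[a], NODES[b]
    shares_person = a[0] == b[0] or a[1] == b[1]
    overlap = a_scope[0] <= b_scope[1] and b_scope[0] <= a_scope[1]
    return shares_person and overlap


def max_weight_independent_set():
    """Try every subset of nodes and keep the heaviest one without conflicts."""
    best, best_weight = (), 0
    names = list(NODES)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if any(in_conflict(a, b) for a, b in combinations(subset, 2)):
                continue
            weight = sum(NODES[n][1] for n in subset)
            if weight > best_weight:
                best, best_weight = subset, weight
    return set(best), best_weight


print(max_weight_independent_set())
```

Under these assumed weights, m(Ca, Nic) is dropped even though it is the fact from the KB; a real system would have to encode the trust in existing KB facts in the weights or as hard constraints.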

Slide41

4) Joint Reasoning on Facts and T-Facts

alternative approach: split t-scopes and reason on consistency of t-fact partitions

[Figure: timeline showing the t-scopes of m(Ca, Ben), m(Ca, Nic), m(Ce, Nic), m(Ca, Mi), m(Ce, Mi)]

Slide42

Preliminary Results

automatic extraction of t-facts about football/soccer from Wikipedia and news articles
query answering by reasoning on t-facts

playsForTeam(X,Z)@T1 ∧ playsForTeam(Y,Z)@T2 ∧ overlaps (T1,T2) ⇒ teammates(X,Y)
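The teammates rule can be applied directly to a list of t-facts. A minimal sketch, assuming hypothetical player and team names, year resolution, and closed intervals:

```python
from itertools import combinations

# Hypothetical t-facts: (player, team, first_year, last_year).
PLAYS_FOR = [
    ("PlayerA", "TeamX", 2002, 2008),
    ("PlayerB", "TeamX", 2007, 2010),
    ("PlayerC", "TeamX", 2009, 2012),
]


def teammates(plays_for):
    """Derive teammates(X, Y) for players on the same team with overlapping stints."""
    pairs = set()
    for (x, team_x, bx, ex), (y, team_y, by, ey) in combinations(plays_for, 2):
        if team_x == team_y and bx <= ey and by <= ex:
            pairs.add(frozenset((x, y)))
    return pairs


print(teammates(PLAYS_FOR))
```

PlayerA and PlayerC played for the same team but never at the same time, so no teammates fact is derived for them.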

Slide43

Outline...

What and Why
Automatic KB Construction
Growing & Maintaining the KB
Temporal Knowledge
Wrap-up

Slide44

KB Building: Where Do We Stand?

Knowledge Bases on:

Entities & Classes
strong success story, some problems left:
large taxonomies of classes with individual entities
long tail calls for new methods
entity disambiguation remains grand challenge

Relationships
good progress, but many challenges left:
recall & precision by patterns & reasoning
efficiency & scalability
soft rules, hard constraints, richer logics, …
open-domain discovery of new relation types

Temporal Knowledge
widely open (fertile) research ground:
uncertain / incomplete temporal scopes of facts
joint reasoning on base-facts and time-scopes

Slide45

Overall Take-Home ...

Historic opportunity: revive Cyc vision, make it real & large-scale!
KB as enabler of macroscopic machine reading
challenging & risky, but high pay-off

Explore & exploit synergies between semantic, statistical, & social Web methods:
statistical evidence + logical consistency!

Many interesting research topics for CS (+ CoLi):
efficiency & scalability
constraints & reasoning on uncertain data
NLP for temporal statements
statistical ranking for semantic search
knowledge-base life-cycle: growth & maintenance

Slide46

Thank You !