Gerhard Weikum

Gerhard Weikum - PowerPoint Presentation

Uploaded On 2017-07-07



Presentation Transcript

Slide1

Gerhard Weikum
Max Planck Institute for Informatics & Saarland University
http://www.mpi-inf.mpg.de/~weikum/

Semantic Search: from Names and Phrases to Entities and Relations

Slide2

Acknowledgements

Slide3

Big Picture: Opportunities Now!

KB Population
Info Extraction
Semantic Authoring
Entity Linkage
Web of Data
Web of Users & Contents
Very Large Knowledge Bases
Semantic Docs
Disambiguation

Slide4

Big Picture: Opportunities Now!

KB Population
Info Extraction
Semantic Authoring
Entity Linkage
Web of Data
Web of Users & Contents
Very Large Knowledge Bases
Semantic Docs
Disambiguation

This talk: How Do We Search this World of Knowledge, Data, and Text (and cope with ambiguity)

For Knowledge Harvesting, see talks at College de France and at VLDB School in Kunming

Slide5

http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.png

Web of Data: RDF, Tables, Microdata

YAGO, Cyc, TextRunner/ReVerb, WikiTaxonomy/WikiNet, SUMO, ConceptNet 5, BabelNet, ReadTheWeb

30 billion SPO triples (RDF) and growing

Slide6

http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.png

Web of Data: RDF, Tables, Microdata

YAGO

30 billion SPO triples (RDF) and growing

10M entities in 350K classes, 120M facts for 100 relations, 100 languages, 95% accuracy

4M entities in 250 classes, 500M facts for 6000 properties, live updates

25M entities in 2000 topics, 100M facts for 4000 properties, powers Google knowledge graph

Ennio_Morricone type composer
Ennio_Morricone type GrammyAwardWinner
composer subclassOf musician
Ennio_Morricone bornIn Rome
Rome locatedIn Italy
Ennio_Morricone created Ecstasy_of_Gold
Ennio_Morricone wroteMusicFor The_Good,_the_Bad_,and_the_Ugly
Sergio_Leone directed The_Good,_the_Bad_,and_the_Ugly
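
The facts above are subject-predicate-object (SPO) triples. As a minimal sketch of how such triples can be stored and matched against a conjunctive triple-pattern query, the Python below keeps them in a plain list and joins the patterns one at a time. The helper names `match` and `query` are hypothetical and not part of YAGO or any RDF library; a real system would use an indexed triple store.

```python
# Illustrative only: a tiny in-memory triple list, not a real RDF store.
TRIPLES = [
    ("Ennio_Morricone", "type", "composer"),
    ("Ennio_Morricone", "type", "GrammyAwardWinner"),
    ("composer", "subclassOf", "musician"),
    ("Ennio_Morricone", "bornIn", "Rome"),
    ("Rome", "locatedIn", "Italy"),
    ("Ennio_Morricone", "created", "Ecstasy_of_Gold"),
    ("Ennio_Morricone", "wroteMusicFor", "The_Good,_the_Bad_,and_the_Ugly"),
    ("Sergio_Leone", "directed", "The_Good,_the_Bad_,and_the_Ugly"),
]


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None on mismatch."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):            # variable: bind it or check the existing binding
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                     # constant: must match exactly
            return None
    return new


def query(patterns, triples=TRIPLES):
    """Evaluate a conjunction of triple patterns by nested-loop joins."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings


# Composers born in a place that is located in Italy.
results = query([
    ("?x", "type", "composer"),
    ("?x", "bornIn", "?y"),
    ("?y", "locatedIn", "Italy"),
])
```

The nested-loop join is quadratic and only meant to show the semantics of variable binding; it does not follow subclassOf edges.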

Slide7

Linked RDF Triples on the Web

[Figure: RDF graph linking dbpedia.org/resource/Ennio_Morricone and dbpedia.org/resource/Rome to other sources. Node labels: rdf.freebase.com/ns/en.rome, data.nytimes.com/51688803696189142301, geonames.org/3169070/roma (Coord N 41° 54' 10'' E 12° 29' 2''), yago/wordnet:Actor109765278, yago/wikicategory:ItalianComposer, yago/wordnet:Artist109812338, imdb.com/name/nm0910607/, imdb.com/title/tt0361748/. Edge labels: owl:sameAs, dbpprop:citizenOf, rdf:type, rdfs:subclassOf, prop:actedIn, prop:composedMusicFor.]

500 million links

Slide8

Embedding (RDF) Microdata in HTML Pages

May 2, 2011

Maestro Morricone will perform on the stage of the Smetana Hall to conduct the Czech National Symphony Orchestra and Choir. The concert will feature both Classical compositions and soundtracks such as the Ecstasy of Gold. In programme two concerts for July 14th and 15th.

<html … May 2, 2011
<div typeof=event:music>
<span id="Maestro_Morricone">Maestro Morricone
<a rel="sameAs" resource="dbpedia/Ennio_Morricone "/></span>
…<span property = "event:location">Smetana Hall </span>…
<span property="rdf:type" resource="yago:performance">The concert </span> will feature …
<span property="event:date" content="14-07-2011"></span>July 1
</div>

Supported by RDFa and microformats like schema.org

Slide9

Outline

Opportunities Now
Semantic Search Today
Entity Name Disambiguation
Question Answering
Disambiguation Reloaded
Wrap-Up

Slide10

Semantic Search Today (1)

Slide11

Semantic Search Today (1)

Slide12

Semantic Search Today (1)

Slide13

Semantic Search Today (1)

Slide14

Semantic Search Today (1)

Slide15

Semantic Search Today (2)

Select ?x Where {
?x type composer [western movie] .
?x wasBornIn ?y . ?y locatedIn Europe . }

Slide16

Semantic Search Today (2)

Select ?x Where {
?x type composer .
?x participatedIn ?y . ?y type western_film . }

Slide17

Semantic Search Today (3)

Slide18

Semantic Search Today (3)

Slide19

Semantic Search Today (3)

Slide20

Semantic Search Today (4)

Slide21

Semantic Search Today (4)

Key problem in semantic search: diversity and ambiguity of names and phrases!

Slide22

Outline

Opportunities Now
Semantic Search Today
Entity Name Disambiguation
Question Answering
Disambiguation Reloaded
Wrap-Up

Slide23

Three Different NLP Problems

Harry fought with you know who. He defeats the dark lord.

Three NLP tasks:
1) named-entity detection: segment & label by HMM or CRF (e.g. Stanford NER tagger)
2) co-reference resolution: link to preceding NP (trained classifier over linguistic features)
3) named-entity disambiguation: map each mention (name) to canonical entity (entry in KB)

Candidate entities: Harry Potter, Dirty Harry, Lord Voldemort, The Who (band), Prince Harry of England

Slide24

Named Entity Disambiguation

D5 Overview May 30, 2011

Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.

Mentions (surface names) are mapped to Entities (meanings) in the KB:

Sergio means Sergio_Leone
Sergio means Serge_Gainsbourg
Ennio means Ennio_Antonelli
Ennio means Ennio_Morricone
Eli means Eli_(bible)
Eli means ExtremeLightInfrastructure
Eli means Eli_Wallach
Ecstasy means Ecstasy_(drug)
Ecstasy means Ecstasy_of_Gold
trilogy means Star_Wars_Trilogy
trilogy means Lord_of_the_Rings
trilogy means Dollars_Trilogy
… … …

KB entities shown: Eli (bible), Eli Wallach, Dollars Trilogy, Lord of the Rings, Star Wars Trilogy, Benny Andersson, Benny Goodman, Ecstasy of Gold, Ecstasy (drug)

?

Slide25

Mention-Entity Graph

Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.

Candidate entities: Dollars Trilogy, Lord of the Rings, Star Wars, Ecstasy of Gold, Ecstasy (drug), Eli (bible), Eli Wallach

KB+Stats: weighted undirected graph with two types of nodes

Popularity(m,e): freq(e|m), length(e), #links(e)
Similarity(m,e): cos/Dice/KL(context(m), context(e)); bag-of-words or language model: words, bigrams, phrases

Slide26

Mention-Entity Graph

Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.

Candidate entities: Dollars Trilogy, Lord of the Rings, Star Wars, Ecstasy of Gold, Ecstasy (drug), Eli (bible), Eli Wallach

KB+Stats: weighted undirected graph with two types of nodes

Popularity(m,e): freq(e|m), length(e), #links(e)
Similarity(m,e): cos/Dice/KL(context(m), context(e))

joint mapping

Slide27

Mention-Entity Graph

Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.

Candidate entities: Dollars Trilogy, Lord of the Rings, Star Wars, Ecstasy of Gold, Ecstasy (drug), Eli (bible), Eli Wallach

KB+Stats: weighted undirected graph with two types of nodes

Popularity(m,e): freq(m,e|m), length(e), #links(e)
Similarity(m,e): cos/Dice/KL(context(m), context(e))
Coherence(e,e'): dist(types), overlap(links), overlap(anchor words)

Slide28

Mention-Entity Graph

KB+Stats: weighted undirected graph with two types of nodes

Popularity(m,e): freq(m,e|m), length(e), #links(e)
Similarity(m,e): cos/Dice/KL(context(m), context(e))
Coherence(e,e'): dist(types), overlap(links), overlap(anchor words)

Types attached to candidate entities (for dist(types)):
American Jews, film actors, artists, Academy Award winners
Metallica songs, Ennio Morricone songs, artifacts, soundtrack music
spaghetti westerns, film trilogies, movies, artifacts

Candidate entities: Dollars Trilogy, Lord of the Rings, Star Wars, Ecstasy of Gold, Ecstasy (drug), Eli (bible), Eli Wallach

Slide29

Mention-Entity Graph

KB+Stats: weighted undirected graph with two types of nodes

Popularity(m,e): freq(m,e|m), length(e), #links(e)
Similarity(m,e): cos/Dice/KL(context(m), context(e))
Coherence(e,e'): dist(types), overlap(links), overlap(anchor words)

Links attached to candidate entities (for overlap(links)):
http://.../wiki/Dollars_Trilogy, http://.../wiki/The_Good,_the_Bad,_the_Ugly, http://.../wiki/Clint_Eastwood, http://.../wiki/Honorary_Academy_Award
http://.../wiki/The_Good,_the_Bad,_the_Ugly, http://.../wiki/Metallica, http://.../wiki/Bellagio_(casino), http://.../wiki/Ennio_Morricone
http://.../wiki/Sergio_Leone, http://.../wiki/The_Good,_the_Bad,_the_Ugly, http://.../wiki/For_a_Few_Dollars_More, http://.../wiki/Ennio_Morricone

Candidate entities: Dollars Trilogy, Lord of the Rings, Star Wars, Ecstasy of Gold, Ecstasy (drug), Eli (bible), Eli Wallach

Slide30

Mention-Entity Graph

KB+Stats: weighted undirected graph with two types of nodes

Popularity(m,e): freq(m,e|m), length(e), #links(e)
Similarity(m,e): cos/Dice/KL(context(m), context(e))
Coherence(e,e'): dist(types), overlap(links), overlap(anchor words)

Anchor words attached to candidate entities (for overlap(anchor words)):
Metallica on Morricone tribute; Bellagio water fountain show; Yo-Yo Ma; Ennio Morricone composition
The Magnificent Seven; The Good, the Bad, and the Ugly; Clint Eastwood; University of Texas at Austin
For a Few Dollars More; The Good, the Bad, and the Ugly; Man with No Name trilogy; soundtrack by Ennio Morricone

Candidate entities: Dollars Trilogy, Lord of the Rings, Star Wars, Ecstasy of Gold, Ecstasy (drug), Eli (bible), Eli Wallach

Slide31

Joint Mapping

Build mention-entity graph or joint-inference factor graph from knowledge and statistics in KB

Compute high-likelihood mapping (ML or MAP) or dense subgraph such that: each m is connected to exactly one e (or at most one e)

[Figure: mention-entity graph with edge weights 90, 30, 5, 100, 100, 50, 20, 50, 90, 80, 90, 30, 10, 10, 20, 30, 30]

Slide32

Coherence Graph Algorithm

Compute dense subgraph to maximize min weighted degree among entity nodes such that: each m is connected to exactly one e (or at most one e)

Greedy approximation: iteratively remove weakest entity and its edges

Keep alternative solutions, then use local/randomized search

[J. Hoffart et al.: EMNLP'11]

[Figure: mention-entity graph with edge weights 90, 30, 5, 100, 100, 50, 50, 90, 80, 90, 30, 10, 20, 10, 20, 30, 30 and further annotations 140, 180, 50, 470, 145, 230]
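
The greedy approximation can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the algorithm from the cited paper: it repeatedly drops the entity with the smallest weighted degree, never removes the last remaining candidate of a mention, and omits the "keep alternative solutions" and local/randomized search steps. The function name, the weight dictionaries, and all weight values are hypothetical.

```python
def greedy_disambiguate(me_weights, ee_weights):
    """me_weights: {(mention, entity): w}; ee_weights: {frozenset({e1, e2}): w}."""
    candidates = {}
    for mention, entity in me_weights:
        candidates.setdefault(mention, set()).add(entity)
    alive = set().union(*candidates.values())

    def weighted_degree(entity):
        # Sum of mention-entity edges plus coherence edges to entities still alive.
        degree = sum(w for (_, e), w in me_weights.items() if e == entity)
        degree += sum(w for pair, w in ee_weights.items()
                      if entity in pair and pair <= alive)
        return degree

    while True:
        # An entity is removable only if each of its mentions keeps another candidate.
        removable = [e for e in alive
                     if all(len(cands & alive) > 1
                            for cands in candidates.values() if e in cands)]
        if not removable:
            break
        weakest = min(removable, key=lambda e: (weighted_degree(e), e))
        alive.remove(weakest)

    # If a mention still has several candidates, fall back to the strongest edge.
    return {m: max(cands & alive, key=lambda e: me_weights[(m, e)])
            for m, cands in candidates.items()}


# Hypothetical weights for three mentions of the running example.
ME = {
    ("Eli", "Eli_(bible)"): 50, ("Eli", "Eli_Wallach"): 30,
    ("Ecstasy", "Ecstasy_(drug)"): 60, ("Ecstasy", "Ecstasy_of_Gold"): 20,
    ("trilogy", "Star_Wars_Trilogy"): 50, ("trilogy", "Dollars_Trilogy"): 30,
}
EE = {
    frozenset({"Eli_Wallach", "Ecstasy_of_Gold"}): 80,
    frozenset({"Eli_Wallach", "Dollars_Trilogy"}): 90,
    frozenset({"Ecstasy_of_Gold", "Dollars_Trilogy"}): 90,
}
mapping = greedy_disambiguate(ME, EE)
```

With these made-up weights the locally more popular candidates lose because they have no coherence edges to the other candidates, which is the effect the joint mapping is meant to capture.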

Slide33

Mention-Entity Popularity Weights

Collect hyperlink anchor-text / link-target pairs from:
Wikipedia redirects
Wikipedia links between articles
Interwiki links between Wikipedia editions
Web links pointing to Wikipedia articles
…

Build statistics to estimate P[entity | name]

Need dictionary with entities' names:
full names: Arnold Alois Schwarzenegger, Los Angeles, Microsoft Corp.
short names: Arnold, Arnie, Mr. Schwarzenegger, New York, Microsoft, …
nicknames & aliases: Terminator, City of Angels, Evil Empire, …
acronyms: LA, UCLA, MS, MSFT
role names: the Austrian action hero, Californian governor, CEO of MS, …
… plus gender info (useful for resolving pronouns in context): Bill and Melinda met at MS. They fell in love and he kissed her.

[Milne/Witten 2008, Spitkovsky/Chang 2012]
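
As a minimal sketch of the statistics step, P[entity | name] can be estimated by counting how often each anchor text points to each link target and normalizing per name. The function name `build_prior` and the sample pairs are made up for illustration; real counts would come from the sources listed above.

```python
from collections import Counter, defaultdict


def build_prior(anchor_target_pairs):
    """Estimate P[entity | name] as the relative frequency of link targets per anchor text."""
    counts = defaultdict(Counter)
    for name, entity in anchor_target_pairs:
        counts[name][entity] += 1
    return {
        name: {entity: c / sum(targets.values()) for entity, c in targets.items()}
        for name, targets in counts.items()
    }


# Hypothetical anchor-text / link-target pairs.
pairs = [
    ("LA", "Los_Angeles"), ("LA", "Los_Angeles"), ("LA", "Los_Angeles"),
    ("LA", "Louisiana"),
    ("Arnie", "Arnold_Schwarzenegger"),
]
prior = build_prior(pairs)
```

Here `prior["LA"]` assigns 0.75 to Los_Angeles and 0.25 to Louisiana. This sketch does no smoothing and no name normalization (case folding, stripping titles), both of which a real dictionary would likely need.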

Slide34

Mention-Entity Similarity Edges

Precompute characteristic keyphrases q for each entity e: anchor texts or noun phrases in e page with high PMI, e.g. "Metallica tribute to Ennio Morricone"

Match keyphrase q of candidate e in context of mention m, taking into account the extent of partial matches and the weight of matched words, e.g. in: The Ecstasy piece was covered by Metallica on the Morricone tribute album.

Compute overall similarity of context(m) and candidate e
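
The scoring formulas on this slide did not survive extraction, so the following is only a rough sketch of partial keyphrase matching under simplifying assumptions: each keyphrase contributes the weighted fraction of its words that occur anywhere in the mention context, and the contributions are summed. The function names, the default word weight of 1.0, and the keyphrase for the drug sense are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyphrase_similarity(context, keyphrases, word_weight=None):
    """Sum, over keyphrases, of the weighted share of keyphrase words found in the context."""
    word_weight = word_weight or {}
    context_words = set(tokenize(context))
    score = 0.0
    for phrase in keyphrases:
        words = tokenize(phrase)
        total = sum(word_weight.get(w, 1.0) for w in words)
        matched = sum(word_weight.get(w, 1.0) for w in words if w in context_words)
        if total > 0:
            score += matched / total   # partial matches count proportionally
    return score


context = "The Ecstasy piece was covered by Metallica on the Morricone tribute album."
sim_gold = keyphrase_similarity(context, ["Metallica tribute to Ennio Morricone"])
sim_drug = keyphrase_similarity(context, ["psychoactive drug"])
```

Three of the five keyphrase words (metallica, tribute, morricone) occur in the context, so the Ecstasy_of_Gold candidate scores higher than the drug sense. Passing a `word_weight` dictionary (for instance derived from PMI or IDF) would down-weight uninformative words such as "to".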

Slide35

Entity-Entity Coherence Edges

Precompute overlap of incoming links for entities e1 and e2

Alternatively compute overlap of anchor texts for e1 and e2, or overlap of keyphrases, or similarity of bag-of-words, or …

Optionally combine with type distance of e1 and e2 (e.g., Jaccard index for type instances)

For special types of e1 and e2 (locations, people, etc.) use spatial or temporal distance
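
One simple way to turn link overlap into an edge weight is the Jaccard index of the two in-link sets. The exact overlap measure used on the slide is not recoverable from this transcript, so the sketch below is an assumption, and the in-link sets are illustrative rather than real Wikipedia data.

```python
def inlink_overlap(inlinks1, inlinks2):
    """Jaccard index of two sets of incoming links (0.0 if both are empty)."""
    union = inlinks1 | inlinks2
    return len(inlinks1 & inlinks2) / len(union) if union else 0.0


# Hypothetical in-link sets for three candidate entities.
INLINKS = {
    "Ecstasy_of_Gold": {"The_Good,_the_Bad,_the_Ugly", "Metallica",
                        "Bellagio_(casino)", "Ennio_Morricone"},
    "Dollars_Trilogy": {"Sergio_Leone", "The_Good,_the_Bad,_the_Ugly",
                        "For_a_Few_Dollars_More", "Ennio_Morricone"},
    "Ecstasy_(drug)": {"Psychoactive_drug", "Rave"},
}
coh_related = inlink_overlap(INLINKS["Ecstasy_of_Gold"], INLINKS["Dollars_Trilogy"])
coh_unrelated = inlink_overlap(INLINKS["Ecstasy_of_Gold"], INLINKS["Ecstasy_(drug)"])
```

The two film-related entities share two of six distinct in-links, while the drug sense shares none. The same function applies unchanged to sets of anchor words, keyphrases, or type instances.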

Slide36

AIDA: Accurate Online Disambiguation
http://www.mpi-inf.mpg.de/yago-naga/aida/

Slide37

AIDA: Accurate Online Disambiguation
http://www.mpi-inf.mpg.de/yago-naga/aida/

Slide38

AIDA: Very Difficult Example
http://www.mpi-inf.mpg.de/yago-naga/aida/

Slide39

AIDA: Very Difficult Example
http://www.mpi-inf.mpg.de/yago-naga/aida/

Slide40

AIDA: Accurate Online Disambiguation
http://www.mpi-inf.mpg.de/yago-naga/aida/

Slide41

AIDA: Accurate Online Disambiguation
http://www.mpi-inf.mpg.de/yago-naga/aida/

Slide42

Some NED Online Tools for

J. Hoffart et al.: EMNLP 2011, VLDB 2011
https://d5gate.ag5.mpi-sb.mpg.de/webaida/

P. Ferragina, U. Scaiella: CIKM 2010
http://tagme.di.unipi.it/

R. Isele, C. Bizer: VLDB 2012
http://spotlight.dbpedia.org/demo/index.html

Reuters Open Calais
http://viewer.opencalais.com/

S. Kulkarni, A. Singh, G. Ramakrishnan, S. Chakrabarti: KDD 2009
http://www.cse.iitb.ac.in/soumen/doc/CSAW/

D. Milne, I. Witten: CIKM 2008
http://wikipedia-miner.cms.waikato.ac.nz/demos/annotate/

perhaps more

some use Stanford NER tagger for detecting mentions
http://nlp.stanford.edu/software/CRF-NER.shtml

Slide43

NED: Experimental Evaluation

Benchmark: Extended CoNLL 2003 dataset: 1400 newswire articles originally annotated with mention markup (NER), now with NED mappings to Yago and Freebase

difficult texts:
… Australia beats India … → Australian_Cricket_Team
… White House talks to Kremlin … → President_of_the_USA
… EDS made a contract with … → HP_Enterprise_Services

Results:
Best: AIDA method with prior+sim+coh + robustness test
82% precision @100% recall, 87% mean average precision
Comparison to other methods, see paper

J. Hoffart et al.: Robust Disambiguation of Named Entities in Text, EMNLP 2011
http://www.mpi-inf.mpg.de/yago-naga/aida/

Slide44

Ongoing Research & Remaining Challenges

More efficient graph algorithms (multicore, etc.)

Short and difficult texts:
tweets, headlines, etc.
fictional texts: novels, song lyrics, etc.
incoherent texts

Disambiguation beyond entity names:
coreferences: pronouns, paraphrases, etc.
common nouns, verbal phrases (general WSD)

Leverage deep-parsing structures, leverage semantic types
Example: Page played Kashmir on his Gibson (dependency labels: subj, obj, mod)

Allow mentions of unknown entities, mapped to null

Structured Web data: tables and lists

Slide45

Variants of NED at Web Scale

How to run this on big batch of 1 million input texts?
→ partition inputs across distributed machines, organize dictionary appropriately, …
→ exploit cross-document contexts

How to handle Web-scale inputs (100 million pages) restricted to a set of interesting entities? (e.g. tracking politicians and companies)

Tools can map short text onto entities in a few seconds

Slide46

Outline

Opportunities Now
Semantic Search Today
Entity Name Disambiguation
Question Answering
Disambiguation Reloaded
Wrap-Up

Slide47

Deep Question Answering

99 cents got me a 4-pack of Ytterlig coasters from this Swedish chain
This town is known as "Sin City" & its downtown is "Glitter Gulch"
William Wilkinson's "An Account of the Principalities of Wallachia and Moldavia" inspired this author's most famous novel
As of 2010, this is the only former Yugoslav republic in the EU

YAGO
knowledge back-ends
question classification & decomposition

D. Ferrucci et al.: Building Watson. AI Magazine, Fall 2010.
IBM Journal of R&D 56(3/4), 2012: This is Watson.

Slide48

Semantic Keyword Search

Need to map (groups of) keywords onto entities & relationships based on name-entity similarities/probabilities

q: composer Rome scores westerns

[Ilyas et al. Sigmod'10]

[Figure: candidate meanings for the keywords: Media Composer video editor, Western Digital, Rome (Italy), goal in football, film music, composer (creator of music), Rome (NY), Lazio Roma, western movies, western world, Western (airline), AS Roma, Western (NY); candidate relationships: born in …, plays for, used in …, recorded at …]

Slide49

Natural Language Questions are Natural

Who composed scores for westerns and is from Rome?

translate question into SPARQL query:
dependency parsing to decompose question
mapping of question units onto entities, classes, relations

map results into tabular or visual presentation or speech

Slide50

From Questions to Queries

NL question: Who composed scores for westerns and is from Rome?

Dependency parsing exposes structure of question → "triploids" (sub-cues):
Who composed scores
scores for westerns
is from Rome

Slide51

From Triploids to Triples

Who composed scores for westerns and is from Rome?

Triploids:
Who is from Rome
Who composed scores
scores for westerns

Triples:
?x composed scores
?x bornIn Rome
scores contributesTo ?y
?y type westernMovie

?x type composer
?x composed ?s
?s contributesTo ?y
?s type music

Slide52

Pattern Dictionary for Relations

[N. Nakashole et al.: EMNLP 2012]

Problem: cope with language diversity & ambiguity
Example: composed …, wrote …, created …, …

WordNet-style dictionary/taxonomy for relational phrases based on SOL patterns (syntactic-lexical-ontological)

Relational phrases can be synonymous:
"graduated from" ⇔ "obtained degree in * from"
"and $PRP ADJ advisor" ⇔ "under the supervision of"

One relational phrase can subsume another:
"wife of" ⇒ "spouse of"

Relational phrases are typed:
<person> graduated from <university>
<singer> released <album>
<singer> covered <song>
<book> covered <event>

Slide53

PATTY: Pattern Taxonomy for Relations

[N. Nakashole et al.: EMNLP 2012, demo at VLDB 2012]

350 000 SOL patterns with 4 million instances
Derived from large data (Wikipedia, NYT, ClueWeb) by scalable sequence mining
accessible at: www.mpi-inf.mpg.de/yago-naga/patty

Slide54

Disambiguation Mapping for Triploids

Who composed scores for westerns and is from Rome?

Combinatorial Optimization by ILP (with type constraints etc.)

[Figure: graph with weighted edges (coherence, similarity, etc.) connecting triploid nodes q1, q2, q3, q4, phrase nodes (Who, composed, composed scores, scores for, westerns, is from, Rome) and candidate semantic items (e: Rome (Italy), e: Lazio Roma, c: person, c: musician, e: WHO, r: created, r: wroteComposition, r: wroteSoftware, c: soundtrack, r: soundtrackFor, r: shootsGoalFor, r: bornIn, r: actedIn, c: western movie, e: Western Digital)]

Slide55

Relaxing Overconstrained Queries

Select ?p Where {
?p composed ?s . ?s type music .
?s for ?m . ?m type movie .
?p bornIn Rome . }

Select ?p Where {
?p composed ?s . ?s type music .
?s for ?m . ?m type movie [western] .
?p bornIn Rome . }

Select ?p Where {
?p ?rel1 ?s [composed] . ?s type music .
?s ?rel2 ?m . ?m type movie [western] .
?p bornIn Rome . }

with extended SPARQL-FullText: SPOX quad patterns
(S. Elbassuoni et al.: CIKM'10, ESWC'11, SIGIR'12)

Slide56

Preliminary Results
(M. Yahya et al.: WWW'12, EMNLP'12)

http://www.mpi-inf.mpg.de/yago-naga/deanna/

Slide57

Outline

Opportunities Now
Semantic Search Today
Entity Name Disambiguation
Question Answering
Disambiguation Reloaded
Wrap-Up

Slide58

Disambiguation Mapping

Who composed scores for westerns and is from Rome?

[Figure: graph with weighted edges (coherence, similarity, etc.) connecting triploid nodes q1, q2, q3, q4, phrase nodes (Who, composed, composed scores, scores for, westerns, is from, Rome) and candidate semantic items (e:Rome (Italy), e:Lazio Roma, c:person, c:musician, e:WHO, r:created, r:wroteComposition, r:wroteSoftware, c:soundtrack, r:soundtrackFor, r:shootsGoalFor, r:bornIn, r:actedIn, c:western movie, e:Western Digital)]

Selection: X_i
Assignment: Y_ij
Joint Mapping: Z_kl

[M. Yahya et al.: EMNLP'12]

Slide59

Disambig. Mapping: Objective Function

Who composed scores for westerns and is from Rome?

[Figure: graph of triploid nodes q1 to q4, phrase nodes and candidate semantic items, with weighted edges (coherence, similarity, etc.) labelled w_ij and v_kl]

Selection: X_i
Assignment: Y_ij
Joint Mapping: Z_kl

\[ \text{maximize} \quad \sum_{i,j} w_{ij} Y_{ij} + \sum_{k,l} v_{kl} Z_{kl} + \dots \]

subject to:

\[ Y_{ij} \le X_i \quad \text{for all } i, j \]
\[ \sum_{j} Y_{ij} \le 1 \quad \text{for all } i \]
\[ Z_{kl} \le \sum_{i} Y_{ik} \quad \text{and} \quad Z_{kl} \le \sum_{i} Y_{il} \quad \text{for all } k, l \]
\[ X_i, Y_{ij}, Z_{kl} \in \{0, 1\} \]

Slide60

Disambig. Mapping: Constraints

Who composed scores for westerns and is from Rome?

[Figure: graph of triploid nodes q1 to q4, phrase nodes and candidate semantic items, with weighted edges (coherence, similarity, etc.) labelled w_ij and v_kl]

Selection: X_i
Selection: Q_hi
Assignment: Y_ij
Joint Mapping: Z_kl

\[ \text{maximize} \quad \sum_{i,j} w_{ij} Y_{ij} + \sum_{k,l} v_{kl} Z_{kl} + \dots \]

subject to:

\[ Q_{hi} = 1 \;\Rightarrow\; \sum_{g} Q_{hg} = 3 \quad \text{for all } h, i \]
\[ X_i + X_g \le 1 \quad \text{for all mutually exclusive } i, g \]
\[ Q_{hi} = 1 \;\Rightarrow\; \sum_{g,j} Q_{hg} Y_{gj} = 1 \quad \text{for relation nodes } j \]

Slide61

Disambig. Mapping: Type Constraints

Who composed scores for westerns and is from Rome?

[Figure: graph of triploid nodes q1 to q4, phrase nodes and candidate semantic items, with weighted edges (coherence, similarity, etc.) labelled w_ij and v_kl]

Selection: X_i
Selection: Q_hi
Assignment: Y_ij
Joint Mapping: Z_kl

\[ \text{maximize} \quad \sum_{i,j} w_{ij} Y_{ij} + \sum_{k,l} v_{kl} Z_{kl} + \dots \]

subject to:

\[ Y_{ij} = 1 \text{ and } j \text{ is relation node and } Z_{kj} = 1 \text{ and } Z_{jl} = 1 \;\Rightarrow\; \mathrm{domain}(j) \in \mathrm{types}(k) \text{ and } \mathrm{range}(j) \in \mathrm{types}(l) \]
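
To make the objective concrete, the sketch below enumerates all assignments by brute force instead of calling an ILP solver. It covers only the similarity term (weights w), the coherence term (weights v), and the "at most one semantic item per phrase" constraint; the triploid and type constraints are left out. The function name and every weight value are hypothetical.

```python
from itertools import product


def best_mapping(candidates, w, v):
    """candidates: {phrase: [item, ...]}; w: {(phrase, item): weight};
    v: {frozenset({item_k, item_l}): weight}. Returns (mapping, score)."""
    phrases = list(candidates)
    best, best_score = None, float("-inf")
    # None means the phrase stays unmapped, i.e. sum_j Y_ij <= 1.
    for choice in product(*[[None] + candidates[p] for p in phrases]):
        chosen = [(p, c) for p, c in zip(phrases, choice) if c is not None]
        items = {c for _, c in chosen}
        score = sum(w[pc] for pc in chosen)                         # sum of w_ij * Y_ij
        score += sum(val for pair, val in v.items() if pair <= items)  # sum of v_kl * Z_kl
        if score > best_score:
            best, best_score = dict(chosen), score
    return best, best_score


# Hypothetical candidates and weights for three phrases of the question.
candidates = {
    "composed": ["r:wroteComposition", "r:wroteSoftware"],
    "westerns": ["c:western movie", "e:Western Digital"],
    "Rome": ["e:Rome (Italy)", "e:Lazio Roma"],
}
w = {
    ("composed", "r:wroteComposition"): 0.5, ("composed", "r:wroteSoftware"): 0.5,
    ("westerns", "c:western movie"): 0.4, ("westerns", "e:Western Digital"): 0.6,
    ("Rome", "e:Rome (Italy)"): 0.6, ("Rome", "e:Lazio Roma"): 0.4,
}
v = {
    frozenset({"r:wroteComposition", "c:western movie"}): 0.8,
    frozenset({"r:wroteSoftware", "e:Western Digital"}): 0.3,
}
mapping, score = best_mapping(candidates, w, v)
```

Although "Western Digital" has the higher similarity weight for "westerns" in this made-up setup, the coherence edge between wroteComposition and western movie makes the joint optimum pick the film reading. Enumeration is exponential in the number of phrases, which is why the slides rely on an ILP optimizer instead.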

ILP optimizers like Gurobi solve this in 1 or 2 seconds

Slide62

Outline

Opportunities Now
Semantic Search Today
Entity Name Disambiguation
Question Answering
Disambiguation Reloaded
Wrap-Up

Slide63

Summary

Web of Data & Knowledge & Text (RDF + Phrases) Calls for Semantic Search by Entities, Classes & Relations

Diversity & Ambiguity of Names and Phrases Calls for Disambiguation Mapping

Strong Story for Entity Name Disambiguation
Ongoing Work on Relation Phrase Disambiguation

Cornerstone of Question Answering with Natural Language or Advanced Keywords

Great opportunity towards next-generation search
Challenging problems: robustness, scale, dynamics & transfer

Slide64

Take-Home Message

Solve: Who composed the Ecstasy and other pieces for westerns?

can solve semantic search with natural-language disambiguation