Distant Supervision for Knowledge Base Population

Uploaded On 2016-03-25

Slide1

Distant Supervision for Knowledge Base Population

Mihai Surdeanu, David McClosky, John Bauer, Julie Tibshirani, Angel Chang, Valentin Spitkovsky, Christopher Manning

Slide2

Definition and Approach

We took part in TAC KBP 2010 this year (both tasks)

Slot filling task: learning a pre-defined set of relations and attributes for target entities based on documents in a collection

“Warren Buffett began studying at the Wharton School of Finance at the University of Pennsylvania, but transferred to the University of Nebraska where he graduated.”

(per:schools_attended, Warren Buffett, University of Pennsylvania)

(per:schools_attended, Warren Buffett, University of Nebraska)

Distant supervision approach: generate training data automatically from Wikipedia infoboxes

Slide3
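The distant supervision idea stated at the end of the previous slide can be illustrated with a minimal sketch. It assumes a simple heuristic: any sentence that mentions both the entity and a known slot value is treated as a positive training example for that slot. The `Fact` type, the `label_sentences` function, and the second sentence are hypothetical and not taken from the presentation; negative (UNRELATED) candidates are left out for brevity.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """A (slot, entity, value) triple, e.g. derived from an infobox field."""
    slot: str
    entity: str
    value: str


def label_sentences(sentences, facts):
    """Return (sentence, entity, value, slot) tuples for every sentence that
    mentions both the entity and the slot value of a known fact.

    Assumption: plain case-insensitive substring matching stands in for
    whatever retrieval and matching the real system uses.
    """
    examples = []
    for sentence in sentences:
        lowered = sentence.lower()
        for fact in facts:
            if fact.entity.lower() in lowered and fact.value.lower() in lowered:
                examples.append((sentence, fact.entity, fact.value, fact.slot))
    return examples


facts = [
    Fact("per:schools_attended", "Warren Buffett", "University of Pennsylvania"),
    Fact("per:schools_attended", "Warren Buffett", "University of Nebraska"),
]
sentences = [
    "Warren Buffett began studying at the Wharton School of Finance at the "
    "University of Pennsylvania, but transferred to the University of Nebraska "
    "where he graduated.",
    "Warren Buffett was born in Omaha.",  # illustrative sentence with no known slot value
]
examples = label_sentences(sentences, facts)
for _, entity, value, slot in examples:
    print(slot, entity, value)
```

Only the first sentence yields training examples here, one per matching fact. Substring matching of this kind is also one source of the noise that the Challenges slide refers to.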

Training

Infobox KB
Map infobox fields to KBP slots (one-to-many mapping)
IR: find relevant sentences (query: entity name + slot value)
Extract +/- slot candidates
Train multiclass classifier

Map KBP slots to fine-grained NE labels

Evaluation

KBP query: entity name
IR: find relevant sentences (query: entity name + trigger words)
Extract slot candidates
Classify candidates
Inference (greedy, local)
Extracted slots

Slide4

Results

Label                         Correct  Predict  Actual     P     R    F1
UNRELATED                      268085   289135  295590  92.7  90.7  91.7
org:city_of_headquarters         5835     9040    7514  64.5  77.7  70.5
org:country_of_headquarters      2851     4638    3725  61.5  76.5  68.2
org:founded                      3896     8199    6662  47.5  58.5  52.4
org:parents                      1158     2292    2525  50.5  45.9  48.1
org:top_members/employees        1282     3067    3596  41.8  35.7  38.5
per:city_of_birth                1799     3920    3252  45.9  55.3  50.2
per:country_of_birth             1984     4122    3204  48.1  61.9  54.2
per:date_of_birth                3938     5427    4362  72.6  90.3  80.5
per:member_of                    1771     3018    2887  58.7  61.3    60
per:title                        1714     3364    3054    51  56.1  53.4
Total                           37169    68822   62367    54  59.6  56.7

Training on 2/3 of infoboxes, evaluating on 1/3

Evaluating only on sentences that contain at least one valid slot
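The P, R, and F1 figures are consistent with the standard definitions computed from the Correct, Predict, and Actual counts. A short sketch, assuming those definitions (the function name is illustrative), reproduces the org:city_of_headquarters row:

```python
def precision_recall_f1(correct, predicted, actual):
    """Compute precision, recall, and F1 from raw counts.

    Assumes P = correct / predicted, R = correct / actual,
    and F1 = harmonic mean of P and R.
    """
    precision = correct / predicted if predicted else 0.0
    recall = correct / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the org:city_of_headquarters row.
p, r, f1 = precision_recall_f1(correct=5835, predicted=9040, actual=7514)
print(f"P={100 * p:.1f} R={100 * r:.1f} F1={100 * f1:.1f}")
# prints: P=64.5 R=77.7 F1=70.5
```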

Top 10 most common slots

Total for all slots

Slide5

Challenges

Improve quality of data generated through distant supervision

Improve IR recall

Use relation-specific trigger words (or n-grams, dependency paths, etc.) to boost sentences likely to contain answers to the top

How to acquire these automatically?

Better classifiers for noisy text (e.g., web snippets)
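The trigger-word idea above can be sketched as a simple re-ranking step. This is only an illustration under stated assumptions: the trigger set for per:schools_attended is invented for the example, scoring is a raw count of trigger tokens, and `trigger_score` and `rank_sentences` are hypothetical names rather than parts of the described system.

```python
import re


def trigger_score(sentence, triggers):
    """Count how many tokens of the sentence are trigger words."""
    tokens = re.findall(r"\w+", sentence.lower())
    return sum(1 for token in tokens if token in triggers)


def rank_sentences(sentences, triggers):
    """Order sentences so that those with more trigger hits come first.

    sorted() is stable, so sentences with equal scores keep their
    original retrieval order.
    """
    return sorted(sentences, key=lambda s: trigger_score(s, triggers), reverse=True)


# Hypothetical trigger words for per:schools_attended.
triggers = {"studying", "graduated", "attended"}
retrieved = [
    "Warren Buffett lives in Omaha.",
    "Warren Buffett began studying at the Wharton School of Finance at the "
    "University of Pennsylvania, but transferred to the University of Nebraska "
    "where he graduated.",
]
ranked = rank_sentences(retrieved, triggers)
print(ranked[0][:40])
```

A hand-written trigger set like this one does not address the open question on the slide, namely how to acquire such triggers automatically.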