PRIS at Slot Filling in KBP 2012:

Uploaded On 2018-11-25

Presentation Transcript

Slide1

PRIS at Slot Filling in KBP 2012: An Enhanced Adaboost Pattern-Matching System

Yan Li

Beijing University of Posts and Telecommunications

buptliyan@gmail.com

Slide2

Outline

Introduction

Preprocessing

Entity Expansion

Pattern bootstrapping

Post-processing

Evaluation results

Conclusion

Slide3

Introduction: the framework

Slide4

Preprocessing

NLP (the Stanford CoreNLP toolkit)

POS tagger

NER

Date and time expression recognition

Dependency parser

Coreference resolution

Slide5

Preprocessing (cont'd)

Example:

Takeshi Watanabe, the first president of the ADB, died in his native Japan.

The categorization of slots

Slide6

PER

Domain    Slots
PER       alternate_names; spouses; children; parents; siblings; other_family
ORG       member_of; employee_of
LOC       country/state/city_of_birth/death/residence
DATE      date_of_birth/death
NUM       age
ORI       origin
REL       religion
SCHOOL    schools_attended
CAUSE     cause_of_death
TITLE     titles
CHARGE    charges

ORG

Domain    Slots
PER       alternate_names; members; shareholders; founded_by; top_members/employees
ORG       parents; members; member_of; shareholders; subsidiaries
LOC       member_of; country/state/city_of_headquarters
DATE      founded; dissolved
NUM       number_of_employees/members
URL       website
REL       political/religious_affiliation

Slide7
Slide8

Entity Expansion

Coreferent mentions and alternate names of an entity occur in the relevant documents.

Purpose: improving recall.

Scheme 1 (PER & ORG): coreference resolution

The coreference chain produced by Stanford CoreNLP.

Example:

Slide9

Entity Expansion (cont'd)

Scheme 2 (PER & ORG): identifying alternate names

Rule-based information extraction

Interpretative entities in parentheses

Example: Starr International Co., known as SICO, ……

Scheme 3 (ORG)

Removing the corporate suffixes in queries

Finding the acronyms or full expressions

Example: Norwegian University of Science and Technology (NTNU)

Slide10
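The rules of Scheme 2 and Scheme 3 can be illustrated with a minimal sketch. The suffix list, the regular expressions and the function names below are assumptions made for illustration; the slides do not give the actual rules used by the PRIS system.

```python
import re

# Hypothetical suffix list; the slides do not enumerate the suffixes removed.
CORPORATE_SUFFIXES = ("Co.", "Corp.", "Inc.", "Ltd.", "LLC")


def strip_corporate_suffix(name):
    """Scheme 3: drop one trailing corporate suffix from an ORG query name."""
    for suffix in CORPORATE_SUFFIXES:
        if name.endswith(" " + suffix):
            return name[: -len(suffix)].rstrip(" ,")
    return name


def alternate_names(entity, text):
    """Collect alternate-name candidates for `entity` that appear in `text`."""
    found = set()
    escaped = re.escape(entity)
    # "<entity> (ALIAS)": an interpretative entity in parentheses.
    for m in re.finditer(escaped + r"\s*\(([^)]+)\)", text):
        found.add(m.group(1).strip())
    # "<entity>, known as ALIAS": a simple trigger-phrase rule.
    alias = r"([A-Z][\w&-]*(?:\s+[A-Z][\w&-]*)*)"
    for m in re.finditer(escaped + r"\s*,\s*(?:also\s+)?known as\s+" + alias, text):
        found.add(m.group(1).strip())
    return found


print(strip_corporate_suffix("Starr International Co."))
print(alternate_names("Starr International Co.",
                      "Starr International Co., known as SICO, filed suit."))
print(alternate_names("Norwegian University of Science and Technology",
                      "the Norwegian University of Science and Technology (NTNU)"))
```

Note that NTNU cannot be derived from the initials of the English name, which is why this sketch reads the acronym from the parenthetical rather than generating it.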
Slide11

Pattern Bootstrapping: Workflow

Ralph Grishman and Bonan Min, “New York University KBP 2010 Slot-Filling System”, 2010.

Slide12

Pattern Bootstrapping: Seed Pairs

The KBP English Monolingual Slot Filling Evaluation Data in the past three years

92 PER entities

106 ORG entities

1,627 entity-value pairs

Slide13

Pattern Bootstrapping: Pattern Generation

Word sequence pattern

the middle context between an entity-value pair

Example:

PER:countries_of_residence <PER> native <LOC>

Dependency path pattern

the shortest dependency path which connects an entity-value pair

Example:

PER:title <PER> appos <TITLE>

PER:member_of <PER> appos president prep_of <ORG>

PER:country_of_death <PER> nsubj-1 died prep_in <LOC>

Slide14
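The two pattern types from the Pattern Generation slide can be sketched as follows. This is an illustrative sketch, not the PRIS implementation: the function names are hypothetical, the dependency edges are written by hand rather than produced by a parser, and words stand in for token indices as node identifiers.

```python
from collections import deque


def word_sequence_pattern(tokens, entity_span, value_span, entity_type, value_type):
    """Middle context between entity and value; spans are [start, end) token indices."""
    (e_start, e_end), (v_start, v_end) = entity_span, value_span
    if e_end <= v_start:
        middle = tokens[e_end:v_start]
        return " ".join([f"<{entity_type}>", *middle, f"<{value_type}>"])
    middle = tokens[v_end:e_start]
    return " ".join([f"<{value_type}>", *middle, f"<{entity_type}>"])


def dependency_path_pattern(edges, entity, value, entity_type, value_type):
    """Shortest path between two words; edges are (head, relation, dependent)."""
    graph = {}
    for head, rel, dep in edges:
        graph.setdefault(head, []).append((rel, dep))         # head to dependent
        graph.setdefault(dep, []).append((rel + "-1", head))  # inverse direction
    queue = deque([(entity, [])])
    seen = {entity}
    while queue:  # breadth-first search returns a shortest path
        node, path = queue.popleft()
        if node == value:
            # path alternates relation, word, ...; drop the final value word
            return " ".join([f"<{entity_type}>", *path[:-1], f"<{value_type}>"])
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [rel, nxt]))
    return None


# Hand-written collapsed dependencies for the example sentence (an assumption).
edges = [
    ("died", "nsubj", "Watanabe"),
    ("Watanabe", "appos", "president"),
    ("president", "prep_of", "ADB"),
    ("died", "prep_in", "Japan"),
]
tokens = "Takeshi Watanabe , the first president of the ADB , died in his native Japan .".split()

# "his" (token 12) is treated as a coreferent mention of the PER entity.
print(word_sequence_pattern(tokens, (12, 13), (14, 15), "PER", "LOC"))
print(dependency_path_pattern(edges, "Watanabe", "Japan", "PER", "LOC"))
print(dependency_path_pattern(edges, "Watanabe", "ADB", "PER", "ORG"))
print(dependency_path_pattern(edges, "Watanabe", "president", "PER", "TITLE"))
```

With these hand-written edges, the four calls reproduce the example patterns listed on the slide.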

Pattern Bootstrapping: Pattern Evaluation

For the purpose of improving precision

Pattern frequency

Trigger phrase

High-confidence patterns

New entity-value pairs

Iteration

Slide15
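The pattern evaluation loop (score patterns, keep the high-confidence ones, harvest new entity-value pairs, iterate) can be sketched as below. The scoring rule, the thresholds and all names are assumptions chosen for illustration; the slides do not specify the confidence measure, and the trigger-phrase check and the Adaboost component are omitted here.

```python
from collections import defaultdict


def bootstrap(matches, seed_pairs, min_freq=2, min_conf=0.5, max_iter=5):
    """matches: iterable of (pattern, entity, value) triples observed in a corpus."""
    by_pattern = defaultdict(set)
    for pattern, entity, value in matches:
        by_pattern[pattern].add((entity, value))

    known = set(seed_pairs)
    accepted = set()
    for _ in range(max_iter):
        new_patterns = set()
        for pattern, pairs in by_pattern.items():
            if pattern in accepted:
                continue
            freq = len(pairs & known)  # how often the pattern matches known pairs
            conf = freq / len(pairs)   # share of its matches that are known pairs
            if freq >= min_freq and conf >= min_conf:
                new_patterns.add(pattern)
        if not new_patterns:
            break  # no new high-confidence pattern: stop iterating
        accepted |= new_patterns
        for pattern in new_patterns:
            known |= by_pattern[pattern]  # harvest new entity-value pairs
    return accepted, known


# Toy data, invented for this sketch.
matches = [
    ("<PER> died in <LOC>", "A", "X"),
    ("<PER> died in <LOC>", "B", "Y"),
    ("<PER> died in <LOC>", "C", "Z"),
    ("<PER> passed away in <LOC>", "C", "Z"),
    ("<PER> passed away in <LOC>", "B", "Y"),
    ("<PER> visited <LOC>", "A", "Q"),
    ("<PER> visited <LOC>", "D", "R"),
]
accepted, known = bootstrap(matches, {("A", "X"), ("B", "Y")})
print(sorted(accepted))
print(sorted(known))
```

In this toy run the second pattern is accepted only in the second iteration, after the first pattern has contributed the pair ("C", "Z"), which is the effect the iteration step is meant to achieve.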
Slide16

Post-processing

For the purpose of improving precision

DATE

The SUTime module of CoreNLP

TIMEX2 normalization

PER: spouses, children and parents

Last name complement

Example: John Doe’s first wife, Ruth

“Ruth Doe” is better than “Ruth”.

Slide17
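The last name complement step can be sketched in a few lines. The function name is hypothetical, and the sketch assumes that the last whitespace-separated token of the query name is the family name and that the relative shares it, neither of which always holds.

```python
def complement_last_name(filler, entity_name):
    """Append the query person's last name to a single-token family-member filler."""
    filler_tokens = filler.split()
    entity_tokens = entity_name.split()
    # Only complete bare first names; leave multi-token fillers untouched.
    if len(filler_tokens) == 1 and len(entity_tokens) >= 2:
        return f"{filler} {entity_tokens[-1]}"
    return filler


print(complement_last_name("Ruth", "John Doe"))
print(complement_last_name("Ruth Doe", "John Doe"))
```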

Post-processing (cont'd)

Identifying countries, states/provinces and cities for LOC slots

A Wikipedia list containing all countries and states or provinces.

Adding modifiers into fillers of per:title

adjectival modifier: financial Minister

noun compound modifier: police chief

prepositional modifier: chief of military operations

Slide18
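Adding modifiers to a per:title filler can be sketched with dependency edges. The relation labels `amod`, `nn` and `prep_*` follow the collapsed Stanford dependency style used in the pattern examples; the function name, the hand-written edges and the assumption that edges are listed in surface order are illustrative only.

```python
def expand_title(head, edges):
    """Attach modifiers to a title head word; edges are (head, relation, dependent)."""
    pre, post = [], []
    for h, rel, dep in edges:
        if h != head:
            continue
        if rel in ("amod", "nn"):
            # adjectival and noun compound modifiers precede the head
            pre.append(dep)
        elif rel.startswith("prep_"):
            # prepositional modifiers follow the head; expand the object too
            preposition = rel[len("prep_"):]
            post.append(f"{preposition} {expand_title(dep, edges)}")
    return " ".join(pre + [head] + post)


print(expand_title("Minister", [("Minister", "amod", "financial")]))
print(expand_title("chief", [("chief", "nn", "police")]))
print(expand_title("chief", [("chief", "prep_of", "operations"),
                             ("operations", "amod", "military")]))
```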

Evaluation Results

PRIS

Summary Statistics    LDC          Top-1         Top-2         Median
Precision             0.9278607    0.6757322     0.48955223    0.11392405
Recall                0.7252106    0.41866493    0.21257292    0.0874919
F1                    0.8141142    0.5170068     0.2964302     0.0989736

Slide19
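The F1 row of the summary statistics is the harmonic mean of the precision and recall rows. A short check, using only the figures reported in the summary table (the helper name `f1_score` is ours):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the summary statistics table.
reported = {
    "LDC": (0.9278607, 0.7252106, 0.8141142),
    "Top-1": (0.6757322, 0.41866493, 0.5170068),
    "Top-2": (0.48955223, 0.21257292, 0.2964302),
    "Median": (0.11392405, 0.0874919, 0.0989736),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: computed F1 = {f1_score(p, r):.7f}, reported F1 = {f1}")
```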

Slot                              non-NIL correct   redundant   inexact   wrong   missing
Alternate names                   6                 0           0         0       23
Date of birth                     16                4           0         1       1
Date of death                     17                1           0         4       2
age                               22                0           0         2       2
Country of birth                  1                 0           0         0       1
State or province of birth        8                 0           2         3       2
City of birth                     13                1           0         5       2
Country of death                  1                 0           0         2       0
State or province of death        13                0           2         1       2
City of death                     17                0           0         4       1
Country of residence              10                2           2         7       3
State or province of residence    22                1           4         5       13
City of residence                 35                1           0         14      8
origin                            16                2           0         17      0
Cause of death                    18                0           0         1       13
Schools attended                  19                7           0         1       14
titles                            85                13          8         24      4
Member of                         26                2           4         17      10
Employee of                       7                 0           2         5       20
religion                          4                 0           0         1       3
spouses                           16                5           1         3       10
Children                          73                0           3         10      6
Parents                           21                4           0         1       4
Siblings                          20                0           1         8       3
Other family                      2                 0           0         0       7
Charges                           5                 0           0         4       2

Slide20

Slot                                 non-NIL correct   redundant   inexact   wrong   missing
Alternate names                      46                4           5         25      5
Political/religious affiliations     7                 1           0         6       3
Top members/employees                59                1           2         20      8
Number of employees/members          3                 0           0         0       8
Members                              0                 0           0         0       4
Member of                            0                 0           0         0       7
Subsidiaries                         7                 0           0         3       10
Parents                              4                 1           0         4       4
Founded by                           5                 0           0         3       5
Founded                              5                 0           0         1       3
Dissolved                            1                 0           0         0       2
Country of headquarters              3                 0           0         1       20
State or province of headquarters    1                 1           0         7       11
City of headquarters                 2                 0           0         3       10
Shareholders                         3                 0           1         8       0
Website                              7                 0           0         1       8

Slide21

Conclusion

For the slot filling task of KBP 2012, we designed an enhanced pattern-matching system consisting of preprocessing, entity expansion, pattern bootstrapping and post-processing.

Precision and recall are relatively good for some specific slots.

Performance on the remaining slots urgently needs improvement.

Slide22

Tips

Adequate preparation

A harmonious team

Active and disciplined environment

Be passionate, patient and hardworking

……

Slide23

Thank you!