Incorporating Instance Correlations in Distantly Supervised Relation Extraction

Presentation Transcript

Slide1

Incorporating Instance Correlations in Distantly Supervised Relation Extraction

Luhao Zhang¹, Linmei Hu¹, Chuan Shi¹*

¹ Beijing University of Posts and Telecommunications, China

Report: Luhao Zhang

JIST-2019

Slide2

CONTENTS

1 Background
2 ICRE
3 Experiments
4 Conclusions

Slide4

1 Background

Relation Extraction

Knowledge Graph
Shows strong power in many NLP tasks;
Far from complete.

Relation Extraction
Given a text sequence and an entity pair (h, t), we assign it a relation label that exists in the knowledge graph;
Supervised models demand a large amount of manually annotated data.

How do we meet the need for a considerable amount of annotated data?

Knowledge graph example:
(Barack Obama, born-in, Hawaii)
(Barack Obama, spouse-of, Michelle Obama)
(Barack Obama, political-party, Democratic Party)
(Barack Obama, president-of, United States)

Relation extraction example:
"Donald Trump takes office as the president of the United States" yields (Donald Trump, president-of, United States).

Slide5

1 Background

Distantly Supervised Relation Extraction

Assumption of Distant Supervision
If a relation between a pair of entities exists in the knowledge graph, all sentences mentioning those entities express that relation.

Problem
Distant supervision suffers from the wrong labeling problem.

Example: under this assumption, the Freebase triple (Barack_Obama, President, United_States) labels every sentence below as President, although not all of them express it:
"Barack Obama lifted the ban on travel to the United States."
"Barack Obama was the first African American to be the president of the United States."
"Barack Obama plans to produce film for Netflix, in the United States."
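The distant supervision assumption above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the toy knowledge graph `KG`, the helper `distant_label`, and the plain substring matching of entity mentions are all assumptions made for the example.

```python
# Hypothetical toy knowledge graph: (head entity, tail entity) -> relation.
KG = {("Barack Obama", "United States"): "President"}

SENTENCES = [
    "Barack Obama lifted the ban on travel to the United States.",
    "Barack Obama was the first African American to be the president of the United States.",
    "Barack Obama plans to produce film for Netflix, in the United States.",
]


def distant_label(sentences, kg):
    """Group sentences into bags keyed by entity pair and attach the KG relation.

    A sentence joins a bag whenever it mentions both entities (plain substring
    match, an assumption); whether it really expresses the relation is never
    checked, which is exactly where wrong labels come from.
    """
    bags = {}
    for (head, tail), relation in kg.items():
        matched = [s for s in sentences if head in s and tail in s]
        if matched:
            bags[(head, tail)] = {"relation": relation, "instances": matched}
    return bags


bags = distant_label(SENTENCES, KG)
bag = bags[("Barack Obama", "United States")]
print(bag["relation"], len(bag["instances"]))
```

All three sentences end up in the same bag with the label President, including the ones about the travel ban and Netflix.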

Slide6

1 Background

Conventional Models

Existing methods
Method 1: Typical attention models without external information;
Method 2: Attention models using KGs;
Method 3: Attention models using side information such as entity descriptions.

Slide7

1 Background

Motivation

The existing [instances] can provide significant background knowledge without other side information;
The background knowledge can be propagated by building the correlations among multiple instances.

Slide8

1 Background

Our Idea

Incorporating Instance Correlations in Distantly Supervised Relation Extraction (ICRE)

Construct a graph for each bag based on dependency trees;
Utilize a GCN that maps every node into an embedding vector;
Feed the learned node (word) embeddings into the relation classifier.

Slide10

2 ICRE

Our Proposed Model

Incorporating Instance Correlations in Distantly Supervised Relation Extraction (ICRE)

[Figure: model architecture with three components: Graph Construction, Graph Convolution, Relation Classifier.]

Slide11

2 ICRE

Graph Construction

We first obtain the dependency parse tree of every instance;
The graph G(V, E) of the bag is constructed by connecting the trees through their common words or the entity pair.

The vertex set V consists of the words;
[symbol not preserved] denotes the graph's feature matrix;
The adjacency matrix: [equation not preserved]
The degree matrix: [equation not preserved], where [equation not preserved]
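The construction step can be sketched as follows. This is a minimal sketch under stated assumptions: the dependency edges are written by hand rather than produced by a parser, words with the same surface form are merged into one vertex, the graph is undirected, and the degree matrix uses the standard definition (each diagonal entry is the row sum of the adjacency matrix). The helper name `build_bag_graph` is hypothetical.

```python
def build_bag_graph(instances):
    """Merge per-instance dependency edges into one graph for the bag.

    instances: list of edge lists, each edge a (head_word, dependent_word) pair.
    Words with the same surface form share one vertex, so instances that
    contain common words or the same entity pair become connected.
    """
    vocab = {}
    for edges in instances:
        for head, dep in edges:
            for word in (head, dep):
                vocab.setdefault(word, len(vocab))

    n = len(vocab)
    adjacency = [[0] * n for _ in range(n)]
    for edges in instances:
        for head, dep in edges:
            i, j = vocab[head], vocab[dep]
            adjacency[i][j] = adjacency[j][i] = 1  # undirected edge (assumption)

    # Diagonal degree matrix: D[i][i] is the number of neighbours of vertex i.
    degree = [[0] * n for _ in range(n)]
    for i in range(n):
        degree[i][i] = sum(adjacency[i])
    return vocab, adjacency, degree


# Hand-written, simplified dependency edges for two instances of one bag.
instance_1 = [("president", "Obama"), ("president", "was"),
              ("president", "United_States"), ("United_States", "of")]
instance_2 = [("lifted", "Obama"), ("lifted", "ban"),
              ("ban", "travel"), ("travel", "United_States")]

vocab, A, D = build_bag_graph([instance_1, instance_2])
obama, us = vocab["Obama"], vocab["United_States"]
print(len(vocab), D[obama][obama], D[us][us])
```

The two entity vertices are the points where the two dependency trees meet, so they are the paths along which information can later flow from one instance to the other.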

Slide12

2 ICRE

Graph Convolution Network

Normalize the adjacency matrix: [equation not preserved]
Compute the new node representations as: [equation not preserved]
Stack multiple graph convolution layers to extract the higher-order substructure features: [equation not preserved]

[Figure: two successive aggregation steps, Aggregate1 and Aggregate2.]

Using the dependency trees, we build the correlations between instances without losing semantic information;
Through higher-order convolution operations, the background knowledge implied in other instances is propagated.
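The slide's formulas are not preserved, so the sketch below assumes the widely used GCN formulation (symmetric normalization with self-loops, then a ReLU over the product of normalized adjacency, features, and weights) rather than the authors' exact equations. The identity weight matrix and the three-node path graph are toy choices for illustration only.

```python
import math


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def normalize_adjacency(adjacency):
    """Return D^-1/2 (A + I) D^-1/2, where D is the degree matrix of A + I."""
    n = len(adjacency)
    a_tilde = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
               for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_tilde]
    return [[inv_sqrt_deg[i] * a_tilde[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def gcn_layer(a_hat, features, weights):
    """One graph convolution layer: ReLU(A_hat @ H @ W)."""
    hidden = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


# Path graph 0 - 1 - 2: nodes 0 and 2 are two hops apart.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H0 = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]  # nodes 0 and 2 carry distinct features
W = [[1.0, 0.0], [0.0, 1.0]]               # identity weights, for illustration

A_hat = normalize_adjacency(A)
H1 = gcn_layer(A_hat, H0, W)  # one layer: each node sees its direct neighbours
H2 = gcn_layer(A_hat, H1, W)  # two layers: each node sees two-hop neighbours
print(H1[0], H2[0])
```

After one layer, node 0 still has no trace of node 2's feature; after the second layer it does. That is the sense in which stacking layers propagates information across instances that are only indirectly connected.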

Slide13

2 ICRE

Relation Classifier

We take the learned embeddings V as the representations of words to train our relation classifier.
Attention mechanism over instances: [equation not preserved]
Relation classifier: [equation not preserved]
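Since the attention and classifier equations are missing from the transcript, the following is only a sketch of one common design: dot-product attention over the instance vectors of a bag, followed by a softmax over relation scores. The function names, the query vector, and the toy numbers are assumptions, not the authors' definitions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bag_representation(instance_vectors, query):
    """Weight each instance by softmax(x_i . query) and sum the weighted vectors."""
    weights = softmax([dot(x, query) for x in instance_vectors])
    dim = len(instance_vectors[0])
    bag = [sum(w * x[d] for w, x in zip(weights, instance_vectors))
           for d in range(dim)]
    return bag, weights


def classify(bag, relation_vectors):
    """Softmax over dot-product scores between the bag and each relation vector."""
    return softmax([dot(bag, r) for r in relation_vectors])


# Toy bag: the first instance aligns with the query, the other two do not.
instance_vectors = [[2.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
query = [1.0, 0.0]                         # hypothetical relation query vector
relation_vectors = [[1.0, 0.0], [0.0, 1.0]]

bag, weights = bag_representation(instance_vectors, query)
probs = classify(bag, relation_vectors)
print(weights, probs)
```

The attention weights let the instance that matches the query dominate the bag vector, which is how this family of models reduces the influence of wrongly labeled sentences.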

Slide15

3 Experiments

Datasets & Baselines

Datasets: NYT, GIDS

Dataset  Data split  Sentences  Entities  Entity Pairs
NYT      Train       570,088    63,696    281,270
NYT      Test        172,448    16,706    96,678
GIDS     Train       11,297     9,874     6,498
GIDS     Test        5,663      5,226     3,247

Baselines: Mintz, MultiR, MIMLRE, CNN+ATT

Slide16

3 Experiments

Precision-Recall Curves on Both Datasets

ICRE achieves the best performance compared to all the baselines;
ICRE utilizes a GCN to encode the correlations between instances, so that the implied background knowledge is propagated in the constructed graph.

Slide17

3 Experiments

Evaluation on Two Datasets

ICRE also achieves the best performance compared to all the baselines.

P@N Evaluation on NYT Dataset

P@N(%)   100   200   300   Mean
Mintz    54.0  50.5  45.3  49.9
MultiR   75.0  65.0  62.0  67.3
MIMLRE   70.0  64.5  60.3  64.9
CNN+ATT  76.2  73.1  67.4  72.2
ICRE     78.4  75.2  68.5  74.0

[Figure: Results on GIDS Dataset]
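For readers unfamiliar with the metric, P@N is the precision among the N predictions the model is most confident about. The sketch below shows that computation on a toy ranked list (hypothetical data) and then checks that the Mean column of the NYT table is the average of P@100, P@200, and P@300, using only the numbers reported above.

```python
def precision_at_n(ranked_correct, n):
    """Precision (%) among the top-n predictions.

    ranked_correct: booleans, one per prediction, already sorted by
    descending model confidence; True means the prediction is correct.
    """
    top = ranked_correct[:n]
    return 100.0 * sum(top) / len(top)


# Toy ranked list, for illustration only.
toy_ranking = [True, True, False, True]
print(precision_at_n(toy_ranking, 2), precision_at_n(toy_ranking, 4))

# Rows from the table above: (P@100, P@200, P@300, reported Mean).
P_AT_N = {
    "Mintz": (54.0, 50.5, 45.3, 49.9),
    "MultiR": (75.0, 65.0, 62.0, 67.3),
    "MIMLRE": (70.0, 64.5, 60.3, 64.9),
    "CNN+ATT": (76.2, 73.1, 67.4, 72.2),
    "ICRE": (78.4, 75.2, 68.5, 74.0),
}


def mean_is_consistent(row):
    """True if the reported mean equals the average of the three P@N values."""
    *values, reported = row
    return round(sum(values) / len(values), 1) == reported


consistent = {name: mean_is_consistent(row) for name, row in P_AT_N.items()}
print(consistent)
```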

Slide18

3 Experiments

Effect of Instance Number

To verify the performance of our model on entity pairs with few instances, we randomly select one, two, and all instances for each entity pair;
ICRE achieves better performance than all baseline methods in all cases.
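The three evaluation settings can be sketched as a sampling step over the test bags. The helper `sample_instances`, the fixed seed, and the toy bags are assumptions for illustration; the slide does not say how the random selection was implemented.

```python
import random


def sample_instances(bags, k, seed=0):
    """Keep at most k randomly chosen instances per entity pair.

    bags: dict mapping an entity pair to its list of instances.
    k: 1 or 2 for the "one" and "two" settings, None to keep all instances.
    """
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    sampled = {}
    for pair, instances in bags.items():
        if k is None or len(instances) <= k:
            sampled[pair] = list(instances)
        else:
            sampled[pair] = rng.sample(instances, k)
    return sampled


# Toy bags with placeholder sentence identifiers.
toy_bags = {
    ("Barack Obama", "United States"): ["s1", "s2", "s3"],
    ("Barack Obama", "Hawaii"): ["s4"],
}

one = sample_instances(toy_bags, 1)
two = sample_instances(toy_bags, 2)
everything = sample_instances(toy_bags, None)
print({pair: len(v) for pair, v in two.items()})
```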

Slide20

4 Conclusions

We propose ICRE, a novel GCN-based model that incorporates instance correlations to improve relation extraction;
The node embeddings learned by the GCN are used as our new word embeddings, which may contain the background knowledge implied in other instances;
Extensive experiments demonstrate that our model outperforms the compared methods.

Slide21

Thank you! Q&A