Slide 1
Incorporating Instance Correlations in Distantly Supervised Relation Extraction
Luhao Zhang¹, Linmei Hu¹, Chuan Shi¹*
¹ Beijing University of Posts and Telecommunications, China
Report: Luhao Zhang
JIST-2019
Slide 2
CONTENTS
1 Background
2 ICRE
3 Experiments
4 Conclusions
Slide 4
1 Background
Relation Extraction
Knowledge Graph
Shows strong power in many NLP tasks;
Far from complete.
Relation Extraction
Given a text sequence and an entity pair (ℎ, 𝑡), we assign it a relation label that exists in the knowledge graph;
Supervised models demand large amounts of manually annotated data.
How do we meet the need for a considerable amount of annotated data?
[Figure: example knowledge graph around Barack Obama with the edges born-in (Hawaii), spouse-of (Michelle Obama), political-party (Democratic Party), and president-of (United States). Relation extraction on the sentence "Donald Trump takes office as the president of the United States" yields president-of between Donald Trump and United States.]
Slide 5
1 Background
Distantly Supervised Relation Extraction
Assumption of Distant Supervision
If a relation between a pair of entities exists in the knowledge graph, all sentences mentioning those entities express that relation.
Problem
Distant supervision suffers from the wrong labeling problem.
Freebase triple: (Barack_Obama, President, United_States)
Barack Obama lifted the ban on travel to the United States … ✔️
Barack Obama was the first African American to be the president of United States. ✔️
Barack Obama plans to produce film for Netflix, … in the United States. ❌
…
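The distant supervision assumption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the toy knowledge graph `KG`, the function `distant_label`, and the plain substring matching are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Toy knowledge graph: (head, tail) -> relation. Hypothetical data for illustration.
KG: Dict[Tuple[str, str], str] = {
    ("Barack Obama", "United States"): "President",
}


def distant_label(
    sentences: List[str], kg: Dict[Tuple[str, str], str]
) -> List[Tuple[str, str, str, str]]:
    """Label every sentence that mentions both entities of a KG pair
    with that pair's relation (the distant supervision assumption)."""
    labeled = []
    for sent in sentences:
        for (head, tail), rel in kg.items():
            # Naive substring matching stands in for real entity linking.
            if head in sent and tail in sent:
                labeled.append((sent, head, tail, rel))
    return labeled


sentences = [
    "Barack Obama lifted the ban on travel to the United States.",
    "Barack Obama was the first African American to be the president of United States.",
    "Barack Obama plans to produce film for Netflix in the United States.",
]
labeled = distant_label(sentences, KG)
```

All three sentences receive the label `President`, including the Netflix sentence, which does not express that relation. That is the wrong labeling problem in miniature.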
Slide 6
1 Background
Conventional Models
Existing methods
Method 1: Typical attention models without external information;
Method 2: Attention models using KGs;
Method 3: Attention models using side information such as entity descriptions.
Slide 7
1 Background
Motivation
The existing [instances] can provide significant background knowledge without other side information;
The background knowledge can be propagated by building the correlations among multiple instances.
Slide 8
1 Background
Our Idea
Incorporating Instance Correlations in Distantly Supervised Relation Extraction (ICRE)
Construct the graph for each bag based on dependency trees;
Utilize a GCN that maps every node into an embedding vector;
Feed the learned node (word) embeddings into the relation classifier.
Slide 10
2 ICRE
Our Proposed Model
Incorporating Instance Correlations in Distantly Supervised Relation Extraction (ICRE)
[Figure: model architecture with three components: ① Graph Construction, ② Graph Convolution, ③ Relation Classifier]
Slide 11
2 ICRE
Graph Construction
We first obtain the dependency parse tree of every instance;
The graph G(V, E) of the bag is constructed by connecting the instances through their common words or the entity pair.
The vertex set V consists of the words;
[symbol missing] denotes the graph's feature matrix;
The adjacency matrix [formula missing];
The degree matrix [formula missing], where [formula missing].
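A minimal sketch of the graph construction step, under several assumptions that the slide does not confirm: dependency edges are written by hand as `(head_word, dependent_word)` pairs rather than produced by a parser, a word shared by several instances is merged into a single node, edges are undirected, and the degree matrix is the usual diagonal matrix of row sums. The function name `build_bag_graph` is hypothetical.

```python
import numpy as np


def build_bag_graph(instances):
    """Merge the dependency trees of a bag into one graph.

    instances: list of edge lists; each edge is a (head_word, dependent_word) pair.
    Returns the word -> node index map, the adjacency matrix A, and the degree matrix D.
    """
    vocab = {}
    for edges in instances:
        for head, dep in edges:
            for word in (head, dep):
                # A word seen in several instances maps to the same node,
                # which is what links the instances together.
                vocab.setdefault(word, len(vocab))

    n = len(vocab)
    A = np.zeros((n, n))
    for edges in instances:
        for head, dep in edges:
            i, j = vocab[head], vocab[dep]
            A[i, j] = A[j, i] = 1.0  # undirected edge (assumption)

    D = np.diag(A.sum(axis=1))  # D_ii = sum_j A_ij
    return vocab, A, D


# Two toy instances of the same bag, sharing "Obama" and "United_States".
bag = [
    [("president", "Obama"), ("president", "United_States")],
    [("lifted", "Obama"), ("lifted", "ban"), ("ban", "United_States")],
]
vocab, A, D = build_bag_graph(bag)
```

Because the entity nodes are shared, the two trees become one connected graph with five nodes, so information from one instance can reach the other during convolution.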
Slide 12
2 ICRE
Graph Convolution Network
Normalize the adjacency matrix: [formula missing]
Compute the new node representations as: [formula missing; its two parts are labeled Aggregate1 and Aggregate2]
Stack multiple graph convolution layers to extract the higher-order substructure features: [formula missing]
We build the correlations between instances without losing the semantic information carried by the dependency trees;
Through higher-order convolution operations, the background knowledge implied in other instances is propagated.
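The slide's formulas are not available in this transcript, so the sketch below falls back on the widely used GCN formulation: symmetric normalization with self-loops, followed by stacked layers of the form ReLU(Â H W). Whether ICRE uses exactly this normalization and activation is an assumption; the function names and the random weights are illustrative only.

```python
import numpy as np


def normalize_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2}, where D~ is the degree matrix of A + I."""
    A_tilde = A + np.eye(A.shape[0])       # add self-loops
    deg = A_tilde.sum(axis=1)              # degrees are >= 1 thanks to the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(A, X, weights):
    """Apply one graph convolution layer per weight matrix in `weights`."""
    A_hat = normalize_adjacency(A)
    H = X
    for W in weights:
        # Aggregate neighbor features, project them, then apply ReLU.
        H = np.maximum(A_hat @ H @ W, 0.0)
    return H


rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)     # path graph: node 0 - node 1 - node 2
X = rng.normal(size=(3, 4))                # 3 nodes, 4 input features each
weights = [rng.normal(size=(4, 8)), rng.normal(size=(8, 2))]
H = gcn_forward(A, X, weights)             # 3 nodes, 2 output features each
```

With two layers, node 0 receives information from node 2 even though they are not adjacent, which is the sense in which stacking layers captures higher-order structure.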
Slide 13
2 ICRE
Relation Classifier
We take the learned embeddings V as the representations of words to train our relation classifier.
Attention mechanism over instances: [formula missing]
Relation classifier: [formula missing]
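Since the attention and classifier formulas are not available in this transcript, the following sketch uses a common simplified form of selective attention over the instances of a bag: each instance vector is scored against a relation query vector, the scores are normalized with a softmax, and the weighted sum is fed to a linear softmax classifier. All names (`bag_relation_probs`, `query`, `M`, `b`) and the random inputs are assumptions for illustration, not the authors' exact model.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def bag_relation_probs(S, query, M, b):
    """Attend over the instances of one bag, then classify the bag.

    S:     (num_instances, dim) instance representations
    query: (dim,) relation query vector
    M, b:  (num_relations, dim) and (num_relations,) classifier parameters
    """
    alpha = softmax(S @ query)   # one attention weight per instance
    bag = alpha @ S              # attention-weighted bag representation
    return alpha, softmax(M @ bag + b)


rng = np.random.default_rng(0)
S = rng.normal(size=(3, 5))      # a bag with 3 instances
query = rng.normal(size=5)
M = rng.normal(size=(4, 5))      # 4 candidate relations
b = np.zeros(4)
alpha, probs = bag_relation_probs(S, query, M, b)
```

Instances that match the relation query poorly get small weights, which is how attention dampens wrongly labeled sentences in a bag.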
Slide 15
3 Experiments
Datasets & Baselines
Datasets
NYT
GIDS
Dataset   Data split   Sentences   Entities   Entity Pairs
NYT       Train        570,088     63,696     281,270
NYT       Test         172,448     16,706     96,678
GIDS      Train        11,297      9,874      6,498
GIDS      Test         5,663       5,226      3,247
Baselines
Mintz
MultiR
MIMLRE
CNN+ATT
Slide 16
3 Experiments
Precision-Recall Curves on Both Datasets
ICRE achieves the best performance compared to all the baselines;
ICRE utilizes a GCN to encode the correlations between instances, so that the implied background knowledge is propagated in the constructed graph.
Slide 17
3 Experiments
Evaluation on Two Datasets
ICRE also achieves the best performance compared to all the baselines.
P@N Evaluation on NYT Dataset
P@N(%)    100    200    300    Mean
Mintz     54.0   50.5   45.3   49.9
MultiR    75.0   65.0   62.0   67.3
MIMLRE    70.0   64.5   60.3   64.9
CNN+ATT   76.2   73.1   67.4   72.2
ICRE      78.4   75.2   68.5   74.0
Results on GIDS Dataset [content missing]
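P@N is the precision among the N predictions the model is most confident about. A minimal sketch follows; the function name `precision_at_n` and the toy predictions are made up for illustration, and only the ICRE row used in the last line comes from the table.

```python
def precision_at_n(scored, n):
    """scored: list of (confidence, is_correct) pairs.
    Returns the precision (%) among the n most confident predictions."""
    top = sorted(scored, key=lambda p: p[0], reverse=True)[:n]
    return 100.0 * sum(1 for _, ok in top if ok) / len(top)


# Toy predictions, not experimental data.
preds = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False)]
p_at_2 = precision_at_n(preds, 2)   # top 2 are both correct
p_at_4 = precision_at_n(preds, 4)   # 3 of the top 4 are correct

# The Mean column is the average of the three P@N values, e.g. for the ICRE row:
icre_mean = round((78.4 + 75.2 + 68.5) / 3, 1)
```

Here `p_at_2` is 100.0, `p_at_4` is 75.0, and `icre_mean` reproduces the 74.0 reported in the table.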
Slide 18
3 Experiments
Effect of Instance Number
To verify the performance of our model on entity pairs with few instances, we randomly select one, two, and all instances for each entity pair;
ICRE achieves better performance than all baseline methods in all cases.
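The one / two / all settings can be sketched as a per-bag subsampling step. This is an assumed reading of the protocol: the function name `sample_instances`, the fixed seed, and the choice to keep every sentence when a pair has no more than `k` of them are not stated on the slide.

```python
import random


def sample_instances(bags, k, seed=0):
    """bags: dict mapping an entity pair to its list of sentences.
    k: number of sentences to keep per pair, or None to keep all."""
    rng = random.Random(seed)
    sampled = {}
    for pair, sents in bags.items():
        if k is None or len(sents) <= k:
            sampled[pair] = list(sents)          # nothing to drop
        else:
            sampled[pair] = rng.sample(sents, k) # k distinct sentences
    return sampled


# Toy bags, not taken from NYT or GIDS.
bags = {
    ("Barack Obama", "United States"): ["s1", "s2", "s3"],
    ("Barack Obama", "Hawaii"): ["s4"],
}
one = sample_instances(bags, 1)
two = sample_instances(bags, 2)
everything = sample_instances(bags, None)
```

Evaluating the same model on `one`, `two`, and `everything` shows how much it depends on having many sentences per entity pair.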
Slide 20
4 Conclusions
We propose ICRE, a novel GCN-based model that incorporates instance correlations to improve relation extraction;
The node embeddings learned by the GCN serve as our new word embeddings, which may contain the background knowledge implied in other instances;
Extensive experiments demonstrate that our model outperforms the compared methods.
Slide 21
Thank you! Q&A