Slide1
Policy Gradient as a Proxy for Dynamic Oracles in Constituency Parsing
Daniel Fried and Dan Klein
Slide4
Parsing by Local Decisions
[Figure: parse tree over "The cat took a nap ." with constituents NP, NP, VP, S, linearized as a sequence of local decisions: (S (NP The cat ) (VP …]
Slide5
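Non-local Consequences
Exposure Bias [Ranzato et al. 2016; Wiseman and Rush 2016]
[Figure: a Prediction compared with the True Parse, one action at a time; extracted fragments: "(S (NP The", "(S (VP (NP", "cat", "??", "…"]
Loss-Evaluation Mismatch
[Figure: two parse trees over "The cat took a nap ." (extracted labels: NP, NP, VP, S and VP, NP, VP, S, NP), with the cost given as -F1]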
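The -F1 cost above can be made concrete with a small sketch. It assumes a hypothetical representation in which a parse is a list of labeled spans `(label, start, end)`; the spans below are illustrative and are not taken from the slides. Standard evaluation tools apply extra conventions (for example around punctuation) that are not modeled here.

```python
from collections import Counter

def labeled_f1(predicted, gold):
    """Labeled bracket F1 between two parses given as iterables of (label, start, end)."""
    pred_counts = Counter(predicted)
    gold_counts = Counter(gold)
    # Multiset intersection counts brackets that match in label and span.
    matched = sum((pred_counts & gold_counts).values())
    if matched == 0:
        return 0.0
    precision = matched / sum(pred_counts.values())
    recall = matched / sum(gold_counts.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical spans over "The cat took a nap ." (token offsets, end exclusive).
gold = [("S", 0, 6), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
# Same bracketing, but the first constituent is mislabeled as VP.
pred = [("S", 0, 6), ("VP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]

cost = -labeled_f1(pred, gold)  # sequence-level cost: negative F1
```

A single mislabeled bracket changes the whole-tree score, even though a per-action training loss would treat it as one wrong decision among many.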
Slide6
Dynamic Oracle Training
[Figure: a Prediction (sample, or greedy) compared with the True Parse, with the Oracle action shown for each predicted state; extracted fragments: "(S (NP The", "(S (VP (NP", "cat", "…"]
Explore at training time (addresses exposure bias).
Supervise each state with an expert policy, chosen to maximize achievable F1, typically (addresses loss mismatch).
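The recipe above (follow the model's own predictions, but supervise every visited state with the expert) can be sketched generically. All names here (`collect_supervision`, `policy`, `expert`, `step`, `is_final`) are hypothetical, and the toy task is only a stand-in for a real transition-based parser.

```python
def collect_supervision(initial_state, policy, expert, step, is_final):
    """Roll out the model's policy, recording the expert's action at each visited state."""
    examples = []
    state = initial_state
    while not is_final(state):
        examples.append((state, expert(state)))  # training target comes from the expert
        state = step(state, policy(state))       # next state comes from the model
    return examples

# Toy task (assumption): a state is the tuple of actions taken so far,
# and the expert wants the sequence "a", "b", "c".
target = "abc"
examples = collect_supervision(
    initial_state=(),
    policy=lambda state: "a",                  # a (bad) model that always predicts "a"
    expert=lambda state: target[len(state)],   # expert action for the current state
    step=lambda state, action: state + (action,),
    is_final=lambda state: len(state) == len(target),
)
```

The resulting pairs include states such as `("a", "a")` that a static oracle would never visit, which is how exploration exposes the model to its own mistakes.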
[Goldberg & Nivre 2012; Ballesteros et al. 2016; inter alia]
Slide7
Dynamic Oracles Help!
Expert Policies / Dynamic Oracles (mostly dependency parsing)
Daumé III et al., 2009; Ross et al., 2011; Choi and Palmer, 2011; Goldberg and Nivre, 2012; Chang et al., 2015; Ballesteros et al., 2016; Stern et al., 2017
PTB Constituency Parsing F1
System                                          Static Oracle   Dynamic Oracle
Coavoux and Crabbé, 2016                        88.6            89.0
Cross and Huang, 2016                           91.0            91.3
Fernández-González and Gómez-Rodríguez, 2018    91.5            91.7
Slide8
What if we don’t have a dynamic oracle?
Use reinforcement learning
Slide9
Reinforcement Learning Helps! (in other tasks)
Auli and Gao, 2014; Ranzato et al., 2016; Shen et al., 2016: machine translation
Xu et al., 2016: CCG parsing
Wiseman and Rush, 2016: several, including dependency parsing
Edunov et al., 2017: machine translation
Slide10
Policy Gradient Training [Williams, 1992]
Minimize expected sequence-level cost:
Addresses exposure bias (compute by sampling)
Addresses loss mismatch (compute F1)
Compute in the same way as for the true tree
[Figure: a Prediction and the True Parse for "The man had an idea ." (extracted labels: NP, NP, VP, S and NP, NP, VP, S, NP)]
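The objective itself is not legible in this transcript. A standard form of the REINFORCE estimator [Williams, 1992] that fits the three annotations is sketched below. The notation is assumed rather than taken from the slides: $x$ is the input sentence, $y$ the true tree, $\hat{y}$ a predicted tree, $\Delta$ the cost (for example negative F1), $\theta$ the parser parameters, and $k$ the number of sampled trees.

```latex
\begin{align}
R(\theta) &= \mathbb{E}_{\hat{y} \sim p(\hat{y} \mid x; \theta)} \left[ \Delta(\hat{y}, y) \right] \\
\nabla_\theta R(\theta) &= \mathbb{E}_{\hat{y} \sim p(\hat{y} \mid x; \theta)} \left[ \Delta(\hat{y}, y) \, \nabla_\theta \log p(\hat{y} \mid x; \theta) \right] \\
&\approx \frac{1}{k} \sum_{i=1}^{k} \Delta(\hat{y}_i, y) \, \nabla_\theta \log p(\hat{y}_i \mid x; \theta),
\qquad \hat{y}_i \sim p(\cdot \mid x; \theta)
\end{align}
```

Under this reading, the expectation is the part computed by sampling, $\Delta$ is the part computed with F1, and $\nabla_\theta \log p(\hat{y} \mid x; \theta)$ is the part computed in the same way as the log-likelihood gradient for the true tree.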
Slide11
Policy Gradient Training
[Figure: for the Input "The cat took a nap.", k candidates are shown as parse trees (extracted labels include NP, VP, S, S-INV, ADJP), each with its cost (negative F1) and the gradient for that candidate]
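A minimal sketch of this update, under strong simplifying assumptions: the "parser" is a toy softmax policy that scores three whole candidate trees directly, and the costs are made-up negative F1 values. A real parser would sample action sequences and backpropagate through a neural network; the function names and hyperparameters here are hypothetical.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_gradient_step(logits, costs, k, lr, rng):
    """One REINFORCE update: sample k candidates, weight each log-prob gradient by its cost."""
    probs = softmax(logits)
    sampled = rng.choices(range(len(logits)), weights=probs, k=k)
    grad = [0.0] * len(logits)
    for c in sampled:
        for j in range(len(logits)):
            # For a softmax policy: d log p(c) / d logit_j = 1[j == c] - p_j
            indicator = 1.0 if j == c else 0.0
            grad[j] += costs[c] * (indicator - probs[j]) / k
    # Gradient descent on the expected cost.
    return [v - lr * g for v, g in zip(logits, grad)]

rng = random.Random(0)
costs = [-1.0, -0.5, 0.0]   # hypothetical negative F1 of three candidate trees
logits = [0.0, 0.0, 0.0]    # start from a uniform policy
for _ in range(300):
    logits = policy_gradient_step(logits, costs, k=50, lr=0.5, rng=rng)
probs = softmax(logits)     # mass should concentrate on the lowest-cost candidate
```

No baseline is subtracted from the cost in this sketch, so the estimate is noisier than it needs to be; subtracting a baseline such as the mean sampled cost is a common variance-reduction choice.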
Slide12
Experiments
Slide13
Setup
Parsers:
Span-Based [Cross & Huang, 2016]
Top-Down [Stern et al., 2017]
RNNG [Dyer et al., 2016]
In-Order [Liu and Zhang, 2017]
Training:
Static oracle
Dynamic oracle
Policy gradient
(Parsers x Training: each parser is paired with each training method)
Slide14
English PTB F1
Slide15
Training Efficiency
PTB learning curves for the Top-Down parser
Slide16
French Treebank F1
Slide17
Chinese Penn Treebank v5.1 F1
Slide18
Conclusions
Local decisions can have non-local consequences
Loss mismatch
Exposure bias
How to deal with the issues caused by local decisions?
Dynamic oracles: efficient, model specific
Policy gradient: slower to train, but general purpose
Slide19
Thank you!
Slide20
For Comparison: A Novel Oracle for RNNG
[Figure: partial parse "(S (NP The man"]
1. Close current constituent if it’s a true constituent, or it could never be a true constituent.
2. Otherwise, open the outermost unopened true constituent at this position.
3. Otherwise, shift the next word.
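One possible reading of the three rules above, as a sketch rather than the authors' implementation. Several details are assumptions: the state is a stack of open `(label, start)` pairs plus the number of words shifted; "could never be a true constituent" is tested as "no true constituent with this label and start ends later"; an empty constituent is never closed; and unary chains that repeat a label are not handled. The gold tree for "The man had an idea ." is also assumed.

```python
def oracle_action(stack, pos, n_words, gold):
    """stack: open constituents as (label, start), outermost first.
    pos: number of words shifted so far. gold: set of (label, start, end)."""
    # Rule 1: close the current constituent if it is true, or can no longer become true.
    if stack:
        label, start = stack[-1]
        if pos > start:  # assumption: an empty constituent cannot be closed
            is_true = (label, start, pos) in gold
            still_possible = any(
                l == label and s == start and e > pos for l, s, e in gold
            )
            if is_true or not still_possible:
                return ("CLOSE",)
    # Rule 2: open the outermost unopened true constituent starting here.
    unopened = [
        (l, s, e) for l, s, e in sorted(gold) if s == pos and (l, s) not in stack
    ]
    if unopened and pos < n_words:
        label, _, _ = max(unopened, key=lambda c: c[2])  # widest span is outermost
        return ("OPEN", label)
    # Rule 3: shift the next word.
    if pos < n_words:
        return ("SHIFT",)
    return None  # nothing left to do

def run_oracle(words, gold):
    """Follow the oracle from the initial state and return the linearized parse."""
    stack, pos, actions = [], 0, []
    while True:
        action = oracle_action(stack, pos, len(words), gold)
        if action is None:
            return actions
        if action[0] == "OPEN":
            stack.append((action[1], pos))
            actions.append("(" + action[1])
        elif action[0] == "SHIFT":
            actions.append(words[pos])
            pos += 1
        else:
            stack.pop()
            actions.append(")")

words = "The man had an idea .".split()
# Assumed gold tree: (S (NP The man ) (VP had (NP an idea ) ) . )
gold = {("S", 0, 6), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)}
linearized = " ".join(run_oracle(words, gold))
```

The function is defined for any reachable state, including ones that contain earlier mistakes: with a wrongly opened `VP` over "The" (`stack=[("S", 0), ("VP", 0)]`, `pos=1`), rule 1 fires because no true VP starts at position 0.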
[Figure: example partial parses: "(S (NP The man ) (VP had )", "(S (NP The man ) (VP )", "(S (NP The man ) (VP", "(S (NP The man ) (VP had …"]
Slide21
What if we don’t have a dynamic oracle?
Define one