Slide1
Cross-lingual Models of Word Embeddings: An Empirical Comparison
Shyam Upadhyay, Manaal Faruqui, Chris Dyer, Dan Roth
Slide2
Cross-lingual Embeddings
[Figure: vectors in L1 (English) and vectors in L2 (French); translation pairs shown together]
children / enfants
money / argent
law / loi
life / vie
world / monde
country / pays
war / guerre
peace / paix
energy / énergie
market / marché
Slide3
Works on Cross-Lingual Embeddings
(2012) Klementiev et al. COLING
(2013) Zou et al. EMNLP, Mikolov et al. arXiv
(2014) Hermann et al. ACL, Faruqui et al. EACL, Kocisky et al. ACL, Chandar et al. NIPS
(2015) Camacho-Collados et al. ACL, Lu et al. NAACL, Luong et al. NAACL, Gouws et al. ICML, Ishiwatari et al. CoNLL, Shi et al., Guo et al., Gardner et al. EMNLP, Coulmance et al. EMNLP, Vulic et al. ACL
(2016) Already a few papers in NAACL, ACL.
Which approach is best suited for my task?
Slide4
Overview of The Talk
General Schema
Forms of Cross-lingual Supervision
General Algorithm
Comparison Setup
Results
Conclusion
Slide5
General Schema
Cross-lingual Supervision, L1 and L2
Cross-lingual Word Vector Model
Initial embedding (Optional) W
Initial embedding (Optional) V
Vectors in L1
Vectors in L2
Slide6
Forms of Cross-lingual Supervision
(listed in order of decreasing cost)
word + sentence: "Je t'aime" / "I love you" with word alignments (Je, I), (t', you), (aime, love). BiSkip, Luong et al. 15
sentence: "Je t'aime" / "I love you". BiCVM, Hermann et al. 14
word: (You, t'), (Love, aime), (I, je). BiCCA, Faruqui et al. 14
document: "Bonjour! Je t'aime" / "Hello! How are you? I love you". BiVCD, Vulic et al. 15
Slide7
General Algorithm
Mono. Obj. of L1
Cross-lingual Obj.
Mono. Obj. of L2
Vectors for L1
Vectors for L2
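One way to read the boxes above is as a single joint objective: a monolingual term for each language plus a cross-lingual term that ties the two vector sets together. The sketch below is only an illustration under strong assumptions. The quadratic stand-ins for each objective, the weights `alpha` and `beta`, the function name `joint_train`, and the toy vectors are all hypothetical and are not taken from any of the four models.

```python
def joint_train(W0, V0, pairs, alpha=1.0, beta=1.0, lr=0.1, steps=200):
    """Minimise alpha*A(W) + beta*B(V) + C(W, V) by plain gradient descent.

    Toy stand-ins (assumptions, not the real model losses):
      A, B: squared distance of each vector to its initial embedding
      C:    squared distance between the vectors of each translation pair
    """
    W = {w: list(vec) for w, vec in W0.items()}
    V = {v: list(vec) for v, vec in V0.items()}
    for _ in range(steps):
        # Gradients of the monolingual terms.
        gW = {w: [2 * alpha * (x - x0) for x, x0 in zip(W[w], W0[w])] for w in W}
        gV = {v: [2 * beta * (x - x0) for x, x0 in zip(V[v], V0[v])] for v in V}
        # Gradients of the cross-lingual term.
        for w, v in pairs:
            for k in range(len(W[w])):
                diff = W[w][k] - V[v][k]
                gW[w][k] += 2 * diff
                gV[v][k] -= 2 * diff
        # Gradient step on both vector sets.
        for w in W:
            W[w] = [x - lr * g for x, g in zip(W[w], gW[w])]
        for v in V:
            V[v] = [x - lr * g for x, g in zip(V[v], gV[v])]
    return W, V


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical 2-d initial embeddings and a hypothetical word-level dictionary.
W0 = {"law": [1.0, 0.0], "world": [0.0, 1.0]}
V0 = {"loi": [-1.0, 0.5], "monde": [0.5, -1.0]}
pairs = [("law", "loi"), ("world", "monde")]

W, V = joint_train(W0, V0, pairs)
print(sqdist(W0["law"], V0["loi"]), sqdist(W["law"], V["loi"]))
```

After training, each translation pair sits closer together than it started, while the monolingual terms keep the vectors from collapsing onto one another.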
Slide8
Comparison Setup
We compare on
Mono-lingual Word Similarity
Cross-lingual Dictionary Induction
Cross-lingual Document Classification
Cross-lingual Dependency Parsing
All models are trained using parallel data for 4 languages: DE, FR, SV, ZH.
We select parameters by picking the setting that did best on average across all tasks.
Slide9
Mono-lingual Word Similarity
Are English embeddings obtained from cross-lingual training better than those obtained from monolingual training?
Evaluated using
SimLex-999 - a standard word similarity dataset
QVEC - an intrinsic embedding evaluation, which is shown to correlate with linguistic information.
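Word similarity datasets such as SimLex-999 are usually scored by ranking word pairs by the cosine similarity of their vectors and comparing that ranking with the human ratings using Spearman correlation. A minimal sketch of that scoring follows. The vectors and ratings are invented for illustration (they are not SimLex-999 entries), and the rank function ignores ties, which a real evaluation should handle.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """Rank positions (1 = smallest). Ties are not averaged in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(xs, ys):
    """Spearman correlation via the no-ties formula 1 - 6*sum(d^2) / (n*(n^2-1))."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical English vectors and hypothetical human ratings.
emb = {
    "old": [1.0, 0.0],
    "ancient": [0.9, 0.1],
    "car": [0.0, 1.0],
    "vehicle": [0.2, 0.8],
}
rated_pairs = [("old", "ancient", 9.0), ("car", "vehicle", 8.0), ("old", "car", 1.0)]

model_scores = [cosine(emb[a], emb[b]) for a, b, _ in rated_pairs]
human_scores = [rating for _, _, rating in rated_pairs]
rho = spearman(model_scores, human_scores)
print(rho)
```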
Slide10
Mono-lingual Word Similarity
[Results charts; models ordered by decreasing cost of supervision]
Performance does not correlate with performance in later downstream applications.
Slide11
Cross-lingual Dictionary Induction
For a word in English, find its top-10 neighbors in the foreign language.
Evaluate neighbors against possible translations (per a gold dictionary).
Gold Dictionaries induced using aligned synsets from Multilingual WordNet (Bond and Foster 2013).
Gold Dictionary: (white, blanc) … (past, passe) … (watch, garde) … (school, école) …
Example neighbors of "past": baisse, passe, accepter, …
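The procedure on this slide can be sketched as a nearest-neighbor lookup followed by a dictionary check. The code below is an illustration, not the evaluation script from the talk. It assumes cosine similarity as the neighbor measure and counts an English word as correct if any of its top-k neighbors is a gold translation; both choices, the function names, and the toy vectors are assumptions.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_neighbors(word, src_vecs, tgt_vecs, k=10):
    """Return the k target-language words closest to `word` by cosine similarity."""
    query = src_vecs[word]
    ranked = sorted(tgt_vecs, key=lambda t: cosine(query, tgt_vecs[t]), reverse=True)
    return ranked[:k]


def induction_accuracy(src_vecs, tgt_vecs, gold, k=10):
    """Fraction of gold entries with at least one translation among the top-k neighbors."""
    hits = 0
    for word, translations in gold.items():
        if translations & set(top_k_neighbors(word, src_vecs, tgt_vecs, k)):
            hits += 1
    return hits / len(gold)


# Hypothetical 2-d vectors; real embeddings have far more dimensions.
en = {"past": [1.0, 0.1], "school": [0.1, 1.0]}
fr = {
    "passe": [0.9, 0.2],
    "baisse": [0.7, 0.7],
    "accepter": [-1.0, 0.3],
    "école": [0.0, 1.0],
}
gold = {"past": {"passe"}, "school": {"école"}}

print(top_k_neighbors("past", en, fr, k=2))
print(induction_accuracy(en, fr, gold, k=1))
```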
Slide12
Results
[Results chart; models ordered by decreasing cost of supervision]
Performance improves with cost of supervision, with gaps of more than 10 points between some models.
Slide13
Cross-lingual Document Classification
Training on L1: train the classifier on labeled documents in L1, using vectors in L1.
Testing on L2: test the classifier on documents in L2, using vectors in L2.
The same document classification model is used for both steps.
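The train-on-L1, test-on-L2 setup works because both languages' vectors live in one space, so a classifier fitted on L1 document representations can be applied to L2 documents unchanged. The sketch below illustrates this with averaged word vectors and a nearest-centroid classifier. The classifier is a simple stand-in chosen for brevity, not the model used in the talk, and the vectors, labels, and function names are hypothetical.

```python
def doc_vector(tokens, vecs):
    """Average the vectors of the tokens that have an embedding."""
    known = [vecs[t] for t in tokens if t in vecs]
    dim = len(next(iter(vecs.values())))
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def train_centroids(docs, labels, vecs):
    """Compute one centroid per label from labeled L1 documents."""
    sums, counts = {}, {}
    for tokens, label in zip(docs, labels):
        d = doc_vector(tokens, vecs)
        if label not in sums:
            sums[label] = [0.0] * len(d)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], d)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(tokens, vecs, centroids):
    """Assign the label whose centroid is closest to the document vector."""
    d = doc_vector(tokens, vecs)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(d, centroids[label])),
    )


# Hypothetical vectors in a shared English/French space.
en_vecs = {"market": [1.0, 0.0], "money": [0.9, 0.1], "war": [0.0, 1.0], "peace": [0.1, 0.9]}
fr_vecs = {"marché": [0.95, 0.05], "argent": [0.9, 0.0], "guerre": [0.05, 1.0], "paix": [0.0, 0.9]}

# Train on labeled L1 documents only.
centroids = train_centroids(
    [["market", "money"], ["war", "peace"]], ["economy", "politics"], en_vecs
)

# Test on L2 documents without any L2 labels.
print(predict(["argent", "marché"], fr_vecs, centroids))
print(predict(["guerre", "paix"], fr_vecs, centroids))
```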
Slide14
Results
[Results charts; models ordered by decreasing cost of supervision]
When transferring semantic knowledge across languages, sentence + word alignment information is superior to sentence or word alignment alone.
Slide15
Cross-Lingual Dependency Parsing
Training on L1: train the parser on a treebank in L1, using vectors in L1.
Testing on L2: test the parser on a treebank in L2, using vectors in L2.
The same parsing model is used for both steps.
We also consider the case when L1=L2.
Slide16
Results
[Results charts, annotated "Word-Level Alignment"]
When transferring syntactic knowledge across languages, using word alignments for training embeddings is crucial.
Slide17
Conclusion
Comparison of 4 representative models on several tasks.
Provided insight into relation between type of application and required form of supervision.
Supervision with word-level and sentence-level alignment (almost) always superior to word-level or sentence-level alignment alone for semantic tasks.
Supervision with word alignment crucial to performance for syntactic tasks.
Our experimental setup is modular and easy to use.
Vectors and scripts available at github.com/shyamupa/biling-survey