Presentation Transcript

Slide1

Cross-lingual Models of Word Embeddings: An Empirical Comparison

Shyam Upadhyay, Manaal Faruqui, Chris Dyer, Dan Roth

Slide2

Cross-lingual Embeddings

[Figure: word vectors in L1 (English) and L2 (French) plotted in a shared space; translation pairs lie close together: children/enfants, money/argent, law/loi, life/vie, world/monde, country/pays, war/guerre, peace/paix, energy/énergie, market/marché.]

Slide3

Works on Cross-Lingual Embeddings

(2012) Klementiev et al. CoLING
(2013) Zou et al. EMNLP; Mikolov et al. arXiv
(2014) Hermann et al. ACL; Faruqui et al. EACL; Kocisky et al. ACL; Chandar et al. NIPS
(2015) Camacho-Collados et al. ACL; Lu et al. NAACL; Luong et al. NAACL; Gouws et al. ICML; Ishiwatari et al. CoNLL; Shi et al.; Guo et al.; Gardner et al. EMNLP; Coulmance et al. EMNLP; Vulic et al. ACL
(2016) Already a few papers in NAACL, ACL.

Which approach is best suited for my task?

Slide4

Overview of the Talk

- General Schema
- Forms of Cross-lingual Supervision
- General Algorithm
- Comparison Setup
- Results
- Conclusion

Slide5

General Schema

[Diagram: cross-lingual supervision over L1 and L2, plus optional initial embeddings W and V, feed into a cross-lingual word vector model, which outputs vectors in L1 and vectors in L2.]
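One way to read the schema is as a single training interface that every model in the comparison implements; here is a minimal sketch (the function and parameter names are hypothetical, not from the authors' code):

```python
# A minimal sketch of the general schema; all names are illustrative, not the authors' API.
from typing import Optional, Tuple
import numpy as np

def train_crosslingual_vectors(
    supervision,                          # cross-lingual supervision linking L1 and L2
    init_W: Optional[np.ndarray] = None,  # optional initial embeddings W
    init_V: Optional[np.ndarray] = None,  # optional initial embeddings V
) -> Tuple[np.ndarray, np.ndarray]:
    """Return (vectors_L1, vectors_L2), trained so that the two spaces are comparable."""
    raise NotImplementedError  # each model in the comparison instantiates this differently
```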

Slide6

Forms of Cross-lingual Supervision

Four representative models, in order of decreasing cost of supervision:

- Word + sentence alignment: BiSkip (Luong et al. '15) — parallel sentences with word-level alignments, e.g. "Je t’aime" / "I love you" with (I, je), (love, aime), (you, t’).
- Sentence alignment: BiCVM (Hermann et al. '14) — parallel sentence pairs, e.g. "Je t’aime" / "I love you".
- Word alignment: BiCCA (Faruqui et al. '14) — translation word pairs, e.g. (I, je), (love, aime), (you, t’).
- Document alignment: BiVCD (Vulic et al. '15) — comparable document pairs, e.g. "Bonjour! Je t’aime" / "Hello! How are you? I love you".
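To make the four forms concrete, here is a sketch of the data each one consumes (the example strings are from the slide; the variable names and shapes are illustrative):

```python
# Illustrative data shapes for the four supervision forms; not the authors' formats.

# Word + sentence alignment (BiSkip): parallel sentences plus word-level links.
word_and_sentence = {
    "l2": "Je t'aime",
    "l1": "I love you",
    "alignments": [("I", "je"), ("love", "aime"), ("you", "t'")],
}

# Sentence alignment (BiCVM): parallel sentence pairs only.
sentence_pair = ("Je t'aime", "I love you")

# Word alignment (BiCCA): translation word pairs, as in a bilingual dictionary.
word_pairs = [("I", "je"), ("love", "aime"), ("you", "t'")]

# Document alignment (BiVCD): comparable (not sentence-aligned) document pairs.
document_pair = ("Bonjour! Je t'aime", "Hello! How are you? I love you")
```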

Slide7

General Algorithm

[Diagram: the monolingual objective of L1, a cross-lingual objective, and the monolingual objective of L2 are combined to jointly train vectors for L1 and vectors for L2.]
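Reading the diagram as an equation (a sketch of the shape of the objective; the exact terms and their weights vary by model):

$$\min_{W,\,V}\; A(W) + B(V) + C(W, V)$$

where $A$ and $B$ are the monolingual objectives of L1 and L2, $C$ is the cross-lingual objective coupling the two vector spaces, and $W$ and $V$ are the word vectors of L1 and L2.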

Slide8

Comparison Setup

We compare on:
- Mono-lingual Word Similarity
- Cross-lingual Dictionary Induction
- Cross-lingual Document Classification
- Cross-lingual Dependency Parsing

All models trained using parallel data for 4 languages: DE, FR, SV, ZH.
We select parameters by picking the setting that did best on average across all tasks.

Slide9

Mono-lingual Word Similarity

Are English embeddings obtained from cross-lingual training better than those obtained from monolingual training?

Evaluated using:
- SimLex-999, a standard word similarity dataset
- QVEC, an intrinsic embedding evaluation which has been shown to correlate with linguistic information
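For SimLex-999-style evaluation, the standard recipe is the Spearman correlation between model similarities and human ratings; a minimal sketch (assuming SciPy, not the paper's evaluation scripts):

```python
# A minimal sketch of word-similarity evaluation; not the paper's evaluation code.
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def similarity_eval(vectors, dataset):
    """vectors: dict word -> vector; dataset: list of (word1, word2, human_rating).
    Returns the Spearman correlation between model and human similarities."""
    model_scores, human_scores = [], []
    for w1, w2, rating in dataset:
        if w1 in vectors and w2 in vectors:  # skip out-of-vocabulary pairs
            model_scores.append(cosine(vectors[w1], vectors[w2]))
            human_scores.append(rating)
    return spearmanr(model_scores, human_scores).correlation
```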

Slide10

Mono-lingual Word Similarity

[Results chart: word-similarity scores per model, ordered by decreasing cost of supervision.]

Performance here does not correlate with performance in later downstream applications.

Slide11

Cross-lingual Dictionary Induction

For a word in English, find its top-10 neighbors in the foreign language. Evaluate the neighbors against its possible translations (per a gold dictionary). Gold dictionaries were induced using aligned synsets from the Multilingual WordNet (Bond and Foster, 2013).

[Example: a gold dictionary with entries (white, blanc), (past, passe), ..., (watch, garde), ..., (school, école). For the query word "past", retrieved French neighbors such as baisse, passe, accepter, ... are checked against the gold translations.]
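A minimal sketch of the induction-and-scoring step (cosine nearest neighbors; the names are illustrative, and the paper's exact scoring may differ):

```python
# A minimal sketch of cross-lingual dictionary induction; not the paper's code.
import numpy as np

def top_k_neighbors(query_vec, foreign_vectors, k=10):
    """foreign_vectors: dict foreign_word -> vector (assumed unit-normalized).
    Returns the k foreign words closest to query_vec by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    scored = sorted(foreign_vectors.items(), key=lambda kv: -np.dot(q, kv[1]))
    return [word for word, _ in scored[:k]]

def neighbor_precision(neighbors, gold_translations):
    """Fraction of retrieved neighbors listed as translations in the gold dictionary."""
    return sum(w in gold_translations for w in neighbors) / len(neighbors)
```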

Slide12

Results

[Results chart: dictionary-induction performance per model, ordered by decreasing cost of supervision.]

Performance improves with the cost of supervision, with gaps greater than 10 points between some models.

Slide13

Cross-lingual Document Classification

[Diagram: a document classification model is trained on labeled documents in L1 using vectors in L1, then tested directly on documents in L2 using vectors in L2.]
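The transfer step itself is simple once both languages live in one space; a minimal sketch (assuming averaged word vectors and scikit-learn, neither of which is necessarily what the paper uses):

```python
# A minimal sketch of direct-transfer classification; not the paper's exact pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression

def doc_vector(tokens, vectors, dim):
    """Represent a document as the average of its in-vocabulary word vectors."""
    hits = [vectors[t] for t in tokens if t in vectors]
    return np.mean(hits, axis=0) if hits else np.zeros(dim)

def train_and_transfer(l1_docs, l1_labels, l2_docs, vectors_l1, vectors_l2, dim):
    """Train on labeled L1 documents, predict labels for unlabeled L2 documents."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(np.stack([doc_vector(d, vectors_l1, dim) for d in l1_docs]), l1_labels)
    return clf.predict(np.stack([doc_vector(d, vectors_l2, dim) for d in l2_docs]))
```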

Slide14

Results

[Results chart: classification performance per model, ordered by decreasing cost of supervision.]

When transferring semantic knowledge across languages, sentence + word alignment information is superior to sentence or word alignment alone.

Slide15

Cross-Lingual Dependency Parsing

[Diagram: a parsing model is trained on a treebank in L1 using vectors in L1, then tested on a treebank in L2 using vectors in L2.]

We also consider the case where L1 = L2.
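The parsing transfer follows the same pattern as classification: only the embedding lookup changes between training and test. A minimal sketch of that swap (the parser itself is elided; names are illustrative):

```python
# A minimal sketch of the embedding swap in parser transfer; the parser is elided.
import numpy as np

def featurize(sentence_tokens, vectors, dim):
    """Shared-space features: the parser never sees which language it is reading."""
    return np.stack([vectors.get(t, np.zeros(dim)) for t in sentence_tokens])

# Training: features from vectors_l1 over the L1 treebank.
# Testing:  the same trained parser, with features from vectors_l2 over the L2 treebank.
# Setting L1 = L2 tests whether cross-lingual training also helps monolingually.
```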

Slide16

Results

[Results chart: parsing performance per model; models trained with word-level alignment are marked.]

When transferring syntactic knowledge across languages, using word alignments for training embeddings is crucial.

Slide17

Conclusion

- Comparison of 4 representative models on several tasks.
- Provided insight into the relation between the type of application and the required form of supervision.
- Supervision with word-level and sentence-level alignment is (almost) always superior to word-level or sentence-level alignment alone for semantic tasks.
- Supervision with word alignment is crucial to performance for syntactic tasks.
- Our experimental setup is modular and easy to use.

Vectors and scripts available at github.com/shyamupa/biling-survey