Translation by Collaboration among Monolingual Users


Description

Benjamin B. Bederson, www.cs.umd.edu/~bederson, @bederson. Computer Science Department, Human-Computer Interaction Lab, Institute for Advanced Computer Studies, iSchool, University of Maryland.


Slide1

Translation by Collaboration among Monolingual Users

Benjamin B. Bederson
www.cs.umd.edu/~bederson
@bederson
Computer Science Department
Human-Computer Interaction Lab
Institute for Advanced Computer Studies
iSchool
University of Maryland

Slide2

Programmer

User

Social Participant

Computational Participant

Slide3

Human Computation

Things HUMANS can do

Things COMPUTERS can do

Translation

Photo tagging

Face recognition

Human detection

Speech recognition

Text analysis

Planning

Slide4

Human Computation Taxonomy

Social Computing

Data Mining

Collective Intelligence

Crowdsourcing

Human Computation

Slide5

The problem of translation

Slide6

Source: Global Reach, Internet World Stats

Languages on Internet by Population

Slide7

A real-world problem

Slide8

International Children's Digital Library

www.childrenslibrary.org

Slide9

A real-world problem: ICDL

Now:
~5,000 books
55 languages
Some translations in a few languages
3,000 volunteer translators
100K unique visitors/month

Goal:
10,000 books
100 languages
Every book in every language!

www.childrenslibrary.org

Slide10

The space of solutions

Slide11

Machine Translation (MT)

Large volume, cheap, fast

Unreliable quality

Slide12

Professional Translators

High quality, but slow and expensive

(even for common language pairs)

Slide13

Amateur Translators

Slide14

Online Labor Markets

Slide15

The key idea

Slide16

Translation with the Crowd

Wikipedia: 900 translators vs. 1,200,000 contributors

Translate with the Monolingual Crowd

Slide17

[Chart: Quality vs. Speed / Affordability]

Machine Translation

Professional Bilingual Human Participation

Amateur Bilingual Human Participation

Monolingual Human Participation

Slide18

Monolingual collaboration

Slide19

[Diagram: iterative protocol between a source-language crowd and a target-language crowd]

Source Language: Original Sentence

Target Language: Translation Candidate

Target-side crowd tasks:
1. Vote
2. Identify translation errors
3. Create new translation candidates

Source-side crowd tasks:
1. Vote
2. Explain errors
3. Paraphrase source sentence

Between the two sides: MT and word alignment (for each new candidate and each explanation); repeat …

Slide20

Pierre
Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)

Mary
Sees: In general, it means well, both.

MT

Slide21

Pierre
Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)
Sees: En général, Il est à la fois de nous.

Mary
Sees: In general, it means well, both.
Edits into: In general, it is about both of us.

MT

MT

Slide22

Pierre
Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)
Sees: En général, Il est à la fois de nous.
Edits into: En général, nous nous entendons bien. (lit. In general, we get along well.)

Mary
Sees: In general, it means well, both.
Edits into: In general, it is about both of us.
Sees: In general, we get along fine.

MT

MT

MT

enrichment

Slide23

Pierre
Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)
Sees: En général, Il est à la fois de nous.
Edits into: En général, nous nous entendons bien. (lit. In general, we get along well.)
Sees: En général, nous sommes de bons amis. (lit. In general, we are good friends.)

Mary
Sees: In general, it means well, both.
Edits into: In general, it is about both of us.
Sees: In general, we get along fine.
Edits into: In general, we are good friends.

MT

MT

MT

MT

enrichment

Slide24

Pierre
Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)
Sees: En général, Il est à la fois de nous.
Edits into: En général, nous nous entendons bien. (lit. In general, we get along well.)
Sees: En général, nous sommes de bons amis. (lit. In general, we are good friends.)
Proposes to stop with current translation

Mary
Sees: In general, it means well, both.
Edits into: In general, it is about both of us.
Sees: In general, we get along fine.
Edits into: In general, we are good friends.
Agrees to stop with current translation

MT

MT

MT

MT

enrichment
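
The Pierre and Mary exchange can be replayed as a small loop. The Python below is only a sketch under stated assumptions: the function name `monolingual_loop`, the lookup tables, and the stopping convention (a participant returning `None`) are hypothetical and are not taken from MonoTrans2. The MT system and both participants are stood in for by dictionaries filled with the sentences from the slides, and the voting, error-marking, and word-alignment steps are left out.

```python
from typing import Callable, Optional

Translate = Callable[[str], str]
# A participant returns an edited sentence, or None for "no change / stop".
Participant = Callable[[str], Optional[str]]


def monolingual_loop(source_text: str,
                     to_target: Translate,
                     to_source: Translate,
                     target_user: Participant,
                     source_user: Participant,
                     max_rounds: int = 10) -> str:
    """Alternate between two monolingual users, with MT in between.

    Returns the last target-language candidate.
    """
    candidate = to_target(source_text)            # initial MT output
    for _ in range(max_rounds):
        edited = target_user(candidate)           # target speaker edits the candidate
        if edited is not None:
            candidate = edited
        back_translation = to_source(candidate)   # MT back for the source speaker
        paraphrase = source_user(back_translation)
        if paraphrase is None:                    # source speaker proposes to stop
            return candidate
        candidate = to_target(paraphrase)         # MT of the paraphrased source
    return candidate


# Canned stand-ins for MT and for the two people (sentences from the slides).
FR_TO_EN = {
    "En général, on s'entend bien, tous les deux.": "In general, it means well, both.",
    "En général, nous nous entendons bien.": "In general, we get along fine.",
}
EN_TO_FR = {
    "In general, it is about both of us.": "En général, Il est à la fois de nous.",
    "In general, we are good friends.": "En général, nous sommes de bons amis.",
}
MARY_EDITS = {
    "In general, it means well, both.": "In general, it is about both of us.",
    "In general, we get along fine.": "In general, we are good friends.",
}
PIERRE_EDITS = {
    "En général, Il est à la fois de nous.": "En général, nous nous entendons bien.",
    # No entry for "En général, nous sommes de bons amis.": Pierre stops there.
}

final = monolingual_loop(
    "En général, on s'entend bien, tous les deux.",
    FR_TO_EN.__getitem__,
    EN_TO_FR.__getitem__,
    MARY_EDITS.get,
    PIERRE_EDITS.get,
)
print(final)  # In general, we are good friends.
```

Neither stand-in participant ever sees the other language; all cross-language transfer happens in the two translation callables, which is the point of the monolingual design.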

Slide25

Target Side - Vote

Slide26

Target Side - Identify Errors

Slide27

Target Side - Edit Translations

Slide28

Source Side – Explain Errors

Slide29

Source Side – Vote & Confirm

Slide30

What we've accomplished so far

Slide31

Experiment 1

60 Spanish / 22 German speakers
ICDL volunteers
Worked on 4 Spanish books => German
1 German book => Spanish

TranslateTheWorld.org

Slide32

Evaluation

2 German-Spanish bilingual evaluators

Fluency and adequacy: 5-point score

Compared Google Translate and MonoTrans2

Slide33

Results - Fluency

Slide34

Results - Fluency

Slide35

Results - Accuracy

Slide36

Results - Accuracy

Slide37

Punchline

Sentences with fluency = 5: Google 21, MonoTrans2 112
Sentences with accuracy = 5: Google 17, MonoTrans2 118
Sentences where BOTH = 5: Google 17, MonoTrans2 110

Sentences for which both bilingual evaluators agree score = 5

(N=162 sentences worked on in the experiment)

Straight MT: 10% of sentences ready for prime time

MonoTrans2: 68% of sentences ready for prime time
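
The two headline percentages follow from the counts of sentences where both scores equal 5, divided by N = 162. The short Python check below is a sketch: the names `percent` and `is_ready` are hypothetical, and `is_ready` is only one reading of the "both evaluators agree score = 5" criterion. The per-sentence ratings are not given on the slides, so only the reported totals are used.

```python
from typing import List, Tuple

# Totals reported on the slide: sentences where BOTH fluency and accuracy = 5.
N_SENTENCES = 162
BOTH_PERFECT = {"Google": 17, "MonoTrans2": 110}


def percent(count: int, total: int) -> int:
    """Share of `total`, rounded to a whole-number percentage."""
    return round(100 * count / total)


def is_ready(ratings: List[Tuple[int, int]]) -> bool:
    """True when every evaluator gave 5 for both fluency and accuracy.

    `ratings` holds one (fluency, accuracy) pair per evaluator.
    """
    return all(fluency == 5 and accuracy == 5 for fluency, accuracy in ratings)


shares = {system: percent(count, N_SENTENCES) for system, count in BOTH_PERFECT.items()}
print(shares)  # {'Google': 10, 'MonoTrans2': 68}
```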

Slide38

Experiment 2

An alternative use case for crowdsourced translation…

Fanmi mwen nan Kafou, 24 Cote Plage, 41A bezwen manje ak dlo
Moun kwense nan Sakre Kè nan Pòtoprens
Ti ekipman Lopital General genyen yo paka minm fè 24 è
Fanm gen tranche pou fè yon pitit nan Delmas 31

Munro, Robert. 2010. Crowdsourced translation for emergency response and beyond. NSF Workshop on crowdsourcing and translation, University of Maryland.

Slide39

My family in Carrefour, 24 Cote Plage, 41A needs food and water
People trapped in Sacred Heart Church, PauP
General Hospital has less than 24 hrs. supplies
Undergoing children delivery Delmas 31

Experiment 2

An alternative use case for crowdsourced translation…

Munro, Robert. 2010. Crowdsourced translation for emergency response and beyond. NSF Workshop on crowdsourcing and translation, University of Maryland.

Slide40

TranslateTheWorld.org

Slide41

Fluency Distribution

Slide42

Adequacy Distribution

Slide43

Punchline

Sentences with fluency = 5: Google 1 (1%), MonoTrans2 22 (30%)
Sentences with adequacy = 5: Google 11 (14%), MonoTrans2 29 (38%)
Sentences where BOTH = 5: Google 0 (0%), MonoTrans2 14 (18%)

Sentences for which both bilingual evaluators agree score = 5

(N=76 sentences completed)

Straight MT: 0% of sentences preserve all the meaning

MonoTrans2: 38% of sentences preserve all the meaning

Slide44

Scaling Up

Slide45

Live for one week:

137,000 page views
1,900 task submissions
19 secs per task
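
As a rough sense of scale, the one-week figures can be combined with simple arithmetic. The Python below is a sketch that assumes "19 secs per task" is an average over all 1,900 submissions, which the slide does not state; the variable names are illustrative only.

```python
PAGE_VIEWS = 137_000
TASK_SUBMISSIONS = 1_900
SECONDS_PER_TASK = 19  # assumed to be an average per submission

total_seconds = TASK_SUBMISSIONS * SECONDS_PER_TASK
total_hours = total_seconds / 3600
views_per_submission = PAGE_VIEWS / TASK_SUBMISSIONS

print(f"{total_seconds} s of task time, about {total_hours:.1f} hours")
print(f"about {views_per_submission:.0f} page views per task submission")
```

Under that assumption, the week of crowd work adds up to roughly ten hours of total task time.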

Example

Slide46

Copying is the sincerest form of flattery…

Slide47

Toward a more general architecture

Joining forces with Chris Callison-Burch, Johns Hopkins University

Slide48

Take-aways

By combining machine translation technology, human-computer interfaces, and crowdsourcing, it is possible to achieve accurate translation without bilingual human expertise.

Slide49

Participating Students:

Chang Hu, CS Ph.D. student
Alex Quinn, CS Ph.D. student
Vlad Eidelman, CS Ph.D. student
Yakov Kronrod, Linguistics Ph.D. student
Olivia Buzek, CS/Linguistics undergrad

TranslateTheWorld.org

Philip Resnik
Professor, Linguistics
Institute for Advanced Computer Studies

Slide50

