Cross-Lingual Content Scoring

Presentation Transcript

Slide1

Cross-Lingual Content Scoring

Andrea Horbach, Sebastian Stennmanns, Torsten Zesch
University Duisburg-Essen, Germany

Slide2

Cross-Lingual Content Scoring - Motivation

Core Idea
- Content scoring of students' free-text answers with training and test data in different languages

Foster educational equality
- A non-native speaker might know the answer, but be unable to express it
- Teachers ignore spelling and grammar for content scoring
- Correctness of content is not language-specific

Overcome data sparsity
- Re-use existing training data in a different language

[Diagram labels: Training Data, Test Data]

Slide3

Cross-Lingual Scoring – Core Idea

The Standard Monolingual Content Scoring Case

Question: After reading the group's procedure, describe what additional information you would need in order to replicate the experiment.

Training Data
LA1: Some additional information you will need are the material. You also need to know the size of the container
LA2: The additional information you need is one, the amount of vinegar you poured in each container, two, label the containers.

Test Data
LA1000: You would need to know how many ml of vinegar they used, how much distilled water to rinse the samples with and how they obtained the mass of each sample.
LA1001: I would need to know the exact amount of vinegar in each container.

train & apply model

Slide4

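Cross-Lingual Scoring – Core Idea

Cross-Lingual Scoring

Training Data (English)
LA1: Some additional information you will need are the material. You also need to know the size of the container
LA2: The additional information you need is one, the amount of vinegar you poured in each container, two, label the containers.

Test Data (German)
LA1000: Es fehlt der Säuregehalt des Essigs. Die Menge Essig die verwendet wurde. Und welche Holzart da Holzsorten unterschiedliche Säureresistenz aufweist.
  (English: The acidity of the vinegar is missing. The amount of vinegar that was used. And which type of wood, since types of wood differ in acid resistance.)
LA1001: Wir müssen wissen, wie viel Wasser wir sammeln müssen, um die Probe zu machen
  (English: We need to know how much water we have to collect in order to make the sample.)

train & apply model: ?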

Slide5

Cross-Lingual Scoring – Core Idea

Training Data (English)
LA1: Some additional information you will need are the material. You also need to know the size of the container
LA2: The additional information you need is one, the amount of vinegar you poured in each container, two, label the containers.

MT

Training Data (machine-translated into German)
LA1: Einige zusätzliche Informationen, die Sie benötigen, sind das Material. Sie müssen auch die Größe des Containers kennen
LA2: Die zusätzliche Information, die Sie brauchen, ist eine, die Menge an Essig, die Sie in jeden Behälter gießen, zwei, beschriften Sie die Behälter.

Test Data (German)
LA1000: Es fehlt der Säuregehalt des Essigs. Die Menge Essig die verwendet wurde. Und welche Holzart da Holzsorten unterschiedliche Säureresistenz aufweist.
  (English: The acidity of the vinegar is missing. The amount of vinegar that was used. And which type of wood, since types of wood differ in acid resistance.)
LA1001: Wir müssen wissen, wie viel Wasser wir sammeln müssen, um die Probe zu machen
  (English: We need to know how much water we have to collect in order to make the sample.)

train & apply model

Slide6

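Basic Experimental Setup: Using Machine Translation

Translating Training Data
- [Diagram: Training is passed through "Translate"; Test is used as-is]

Translating Test Data
- [Diagram: Test is passed through "Translate"; Training is used as-is]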
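
The two setups above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Answer` type, the function names, and the toy `translate` lookup are all hypothetical, and `translate` merely stands in for a call to a real machine translation system.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Answer:
    text: str
    lang: str   # language code, e.g. "en" or "de"
    score: int  # gold content score


# Hypothetical stand-in for an MT system: a tiny word-by-word lookup.
TOY_LEXICON = {
    ("en", "de"): {"vinegar": "Essig", "amount": "Menge"},
    ("de", "en"): {"Essig": "vinegar", "Menge": "amount"},
}


def translate(text: str, src: str, tgt: str) -> str:
    """Toy translation: replace known words, keep everything else unchanged."""
    lexicon = TOY_LEXICON.get((src, tgt), {})
    return " ".join(lexicon.get(tok, tok) for tok in text.split())


def translate_all(answers: List[Answer], tgt: str) -> List[Answer]:
    """Translate every answer into `tgt`, keeping its gold score."""
    return [replace(a, text=translate(a.text, a.lang, tgt), lang=tgt)
            for a in answers]


def translate_train(train: List[Answer], test: List[Answer]
                    ) -> Tuple[List[Answer], List[Answer]]:
    """Setup 1: move the training data into the test language."""
    return translate_all(train, test[0].lang), test


def translate_test(train: List[Answer], test: List[Answer]
                   ) -> Tuple[List[Answer], List[Answer]]:
    """Setup 2: move the test data into the training language."""
    return train, translate_all(test, train[0].lang)
```

In either setup the scoring model then sees training and test answers in a single language; the two setups differ only in which side carries the machine translation artifacts.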

Slide7

Horbach, Stennmanns, Zesch - Cross-Lingual Content Scoring | BEA 2018

Outline
- Challenges of Cross-lingual Scoring
- Data Collection
- Content Scoring Experiments

Slide8

Challenges for Automatic Scoring

Quality of machine translation
- Spelling errors: translation errors or normalization?
    the vinegar → der Essig        the vineger → der Vineger
  but
    separate → getrennt            seperate → getrennt
- "Translationese"

Nature of bi-lingual datasets
- Different learner populations
- Language and culture dependence of prompts
    "If both the (US)-President and the Vice President can no longer serve, who becomes President?"
    [Figure: three items contrasted with "vs."; the images are not preserved in the transcript]

Slide9

Pre-Study: Influence of MT Quality on Monolingual Scoring

Translating Training and Test Data
- [Diagram labels: Training, Test]

Machine Translation
- Google Translate: English to German, English to Russian
- DeepL: English to German

Data: 3 prompts from ASAP-2

Evaluation: QWK (quadratic weighted kappa)

Content Scoring Setup
- Weka SVM classifier
- Token n-grams
- Character n-grams
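
The two feature types named in the scoring setup can be illustrated with a short sketch. It covers feature extraction only and does not reproduce the Weka SVM; the function names, the lowercasing, the whitespace tokenization, and the n-gram orders (token 1-2, character 3) are assumptions, since the slide does not state them.

```python
from collections import Counter
from typing import List, Sequence


def token_ngrams(text: str, n: int) -> List[str]:
    """Whitespace-tokenized, lowercased token n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text: str, n: int) -> List[str]:
    """Lowercased character n-grams over the raw string."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def featurize(text: str,
              token_orders: Sequence[int] = (1, 2),   # assumed orders
              char_orders: Sequence[int] = (3,)) -> Counter:
    """Bag of token and character n-gram counts for one answer."""
    feats: Counter = Counter()
    for n in token_orders:
        feats.update(f"tok:{g}" for g in token_ngrams(text, n))
    for n in char_orders:
        feats.update(f"chr:{g}" for g in char_ngrams(text, n))
    return feats
```

Character n-grams are partly robust to the spelling errors discussed earlier: "vinegar" and "vineger" share no token unigram, but they do share the character trigrams "vin", "ine" and "neg".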

Slide10

Collecting a Cross-Lingual Dataset

Option 1: Collecting data in two languages
- Full control over data collection
- Time- and cost-intensive

Option 2: Extend existing dataset in another language
- Use existing data for English
- Re-collect data for the same prompts in, e.g., German

Requirements:
- Prompt material available
- Language- and culture-independent
- Curriculum-independent
- Scoring guidelines available/applicable

Slide11

Suitability of existing datasets

                           ASAP-2    PG    SemEval    Mohler & Mihalcea
Prompt available?            ✔       ✔       ✘               ✔
Culture independent?        (✔)      ✘       ✔               ✔
Curriculum independent?      ✔       ✔       ✔               ✘
Scoring guidelines?          ✔       ✔       ✔               ✔

Slide12

Recollecting ASAP in German

ASAP (existing data)
- >2000 answers per prompt
- US high school students
- 5 x ELA
- 2 x biology
- 3 x science

ASAP-DE (new data)
- 300 answers per prompt
- crowd-sourced
- 3 x science

Slide13

Dataset Comparison – Label Distribution

- Larger number of high-ranking answers in English

[Figure: relative frequency of answers with each score]

Slide14

Dataset Comparison – Answer Length

Slide15

Dataset Comparison – Answer Length

Difference between learner populations!

Slide16

Dataset Comparison – Linguistic Diversity

Slide17

Dataset Comparison – Linguistic Diversity

Difference partially due to language!

Slide18

Content Scoring Results

setup       train     test    QWK
-           EN_all    EN      0.68
baseline    EN        EN      0.61
            DE        DE      0.67
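
QWK (quadratic weighted kappa) compares predicted and gold scores while penalizing disagreements by the squared distance between the two scores. The following is a minimal pure-Python sketch of the standard formula, kappa = 1 - sum(w * O) / sum(w * E) with w_ij = (i - j)^2 / (k - 1)^2; the function name and signature are assumptions, and this is not the evaluation code behind the numbers on the slide.

```python
from typing import Sequence


def quadratic_weighted_kappa(gold: Sequence[int], pred: Sequence[int],
                             min_score: int, max_score: int) -> float:
    """Quadratic weighted kappa between gold and predicted integer scores."""
    if len(gold) != len(pred) or len(gold) == 0:
        raise ValueError("gold and pred must be non-empty and of equal length")
    k = max_score - min_score + 1  # number of score levels
    if k < 2:
        raise ValueError("need at least two score levels")
    n = len(gold)

    # Observed confusion matrix O[i][j]: gold level i, predicted level j.
    observed = [[0] * k for _ in range(k)]
    for g, p in zip(gold, pred):
        observed[g - min_score][p - min_score] += 1

    gold_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two score distributions.
            weighted_expected += weight * gold_hist[i] * pred_hist[j] / n

    if weighted_expected == 0.0:
        # Both sides put every answer into the same single score level.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 means perfect agreement, 0 means agreement at chance level, and negative values mean agreement below chance.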

Slide19

Content Scoring Results

setup             train     test    QWK
-                 EN_all    EN      0.68
baseline          EN        EN      0.61
                  DE        DE      0.67
translate both    EN^T      EN^T    0.58
                  DE^T      DE^T    0.66

(^T marks machine-translated data)

Slide20

Content Scoring Results

setup              train     test    QWK
-                  EN_all    EN      0.68
baseline           EN        EN      0.61
                   DE        DE      0.67
translate both     EN^T      EN^T    0.58
                   DE^T      DE^T    0.66
translate train    EN^T      DE      0.34
                   DE^T      EN      0.40
translate test     EN        DE^T    0.29
                   DE        EN^T    0.32

(^T marks machine-translated data)

Slide21

Differences between Prompts

                                     prompt
setup              train    test     1       2       10
translate train    EN^T     DE       0.49    0.08    0.46
                   DE^T     EN       0.41    0.39    0.39
translate test     EN       DE^T     0.35    0.08    0.43
                   DE       EN^T     0.26    0.35    0.33

(^T marks machine-translated data)
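
To make the prompt dependence concrete, the per-prompt values from the table above can be summarized per condition. This is an illustrative sketch only: the dictionary layout and the `summarize` helper are assumptions, and the plain mean over the three prompts is used for illustration, without claiming that it is how the overall QWK values on the previous slide were aggregated.

```python
from statistics import mean
from typing import Dict, Tuple

Condition = Tuple[str, str, str]  # (setup, train, test)

# Values copied from the table above; "^T" marks machine-translated data.
PER_PROMPT_QWK: Dict[Condition, Dict[int, float]] = {
    ("translate train", "EN^T", "DE"): {1: 0.49, 2: 0.08, 10: 0.46},
    ("translate train", "DE^T", "EN"): {1: 0.41, 2: 0.39, 10: 0.39},
    ("translate test", "EN", "DE^T"): {1: 0.35, 2: 0.08, 10: 0.43},
    ("translate test", "DE", "EN^T"): {1: 0.26, 2: 0.35, 10: 0.33},
}


def summarize(results: Dict[Condition, Dict[int, float]]
              ) -> Dict[Condition, Dict[str, float]]:
    """Mean, spread (max - min) and lowest-scoring prompt per condition."""
    summary = {}
    for condition, by_prompt in results.items():
        scores = list(by_prompt.values())
        summary[condition] = {
            "mean": mean(scores),
            "spread": max(scores) - min(scores),
            "worst_prompt": min(by_prompt, key=by_prompt.get),
        }
    return summary
```

Calling `summarize(PER_PROMPT_QWK)` shows that both conditions trained on English data have their lowest value on prompt 2 (0.08), while the conditions trained on German data vary much less across prompts.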

Slide22

The Influence of Translationese

Original answer:
(A) Plastic type B was the superior in both trial 1 and trial 2. (B) Record the weight that was put on to show how much effected each plastic. Also conducting more trials (...)

After MT → MT (double translation):
Type B plastic was the supervisor in both Trial 1 and Trial 2. (B) Write down the weight that was put on to show how much each one has made plastic. Also do more experiments (...)

Maybe combining translated and original data is the problem
- Idea: translate test data, double translate train data
- but: makes little difference

[Diagram labels: Train, Test]
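
One way to read the idea above is sketched below: the test data is translated once into the training language, and the training data is sent through a round trip (into the test language and back) so that both sides carry translation artifacts. All names are hypothetical, and `toy_mt` is a deliberately lossy word lookup invented for this example; it is not real MT output.

```python
from typing import Callable, List, Tuple

# (text, source language, target language) -> translated text
Translator = Callable[[str, str, str], str]

# Toy, deliberately lossy lexicon (an assumption for illustration only).
TOY_LEXICON = {
    ("en", "de"): {"superior": "Vorgesetzte", "vinegar": "Essig"},
    ("de", "en"): {"Vorgesetzte": "supervisor", "Essig": "vinegar"},
}


def toy_mt(text: str, src: str, tgt: str) -> str:
    """Word-by-word lookup standing in for a machine translation system."""
    lexicon = TOY_LEXICON.get((src, tgt), {})
    return " ".join(lexicon.get(tok, tok) for tok in text.split())


def round_trip(text: str, lang: str, pivot: str, mt: Translator) -> str:
    """Translate into the pivot language and back (double translation)."""
    return mt(mt(text, lang, pivot), pivot, lang)


def translationese_setup(train_texts: List[str], test_texts: List[str],
                         train_lang: str, test_lang: str, mt: Translator
                         ) -> Tuple[List[str], List[str]]:
    """Double translate the training data, single translate the test data."""
    train = [round_trip(t, train_lang, test_lang, mt) for t in train_texts]
    test = [mt(t, test_lang, train_lang) for t in test_texts]
    return train, test
```

With the toy lexicon, `round_trip("superior", "en", "de", toy_mt)` comes back as "supervisor", mirroring the kind of drift shown in the example answer.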

Slide23

Conclusions and Future Work

- We collected a German version of the ASAP-2 dataset
    https://github.com/ltl-ude/crosslingual
- First experiments on cross-lingual scoring using MT
    - Does not work that well
    - Results depend a lot on individual prompt
- Understand influence factors better:
    - Language
    - Learner population
    - Machine translation artifacts
- Alternatives to Machine Translation: cross-lingual embeddings

Thank you! → Vielen Dank!
Questions? → Fragen?