Information Density and Word Order

cheryl-pisano . @cheryl-pisano
Uploaded On 2016-04-27

Presentation Transcript

Slide1

Information Density and Word Order

Slide2

Why are some word orders more common than others?

In the majority of languages (with dominant word order) subjects precede objects

(SOV, SVO) > VSO > (VOS, OVS) > OSV

Slide3

Genetically encoded bias?

Single common ancestor (SOV)?

General linguistic principles

Theme-first

Verb-object bonding

Animate-first

Great, but why do these principles work?

Why are some word orders more common than others?

Slide4

Speakers aim for a constant information transmission rate

Slower speech for unexpected, high-entropy content

Faster speech for predictable, low-entropy content

The basic word order of a language influences the average transmission rate

Thus languages that are closer to the UID ideal will be more common than those further away from it

Uniform information density hypothesis

Slide5

Word-order model

Simple world with

13 objects (O)

5 people

8 food/drink items

2 relations (R): eat/drink

Events in this world consist of one relation and two objects, $(o_1, r, o_2)$, and appear with a certain probability $P$

Slide6

Base entropy (the initial state of the observer before words are spoken)

After each word, observers adjust their expectations for the following ones, reaching an entropy of zero after the third word of the event

Word-order model

Slide7

Each event has an information profile

$I_1 = H_0 - H_1$, $I_2 = H_1 - H_2$, $I_3 = H_2$

where $H_n$ is the entropy remaining after the $n$-th word of the event ($H_3 = 0$, so $I_3 = H_2 - H_3 = H_2$)

UID suggests a straight line from base entropy to zero entropy such that each word conveys 1/3 of the total information
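The entropy trajectory and information profile described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the four-event `world`, its uniform probabilities, and the function names are all assumptions made for the example, and events are written as (agent, relation, patient) tuples.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a dict mapping events to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def condition(dist, slot, word):
    """Renormalised distribution over events whose `slot` equals `word`."""
    kept = {e: p for e, p in dist.items() if e[slot] == word}
    total = sum(kept.values())
    return {e: p / total for e, p in kept.items()}

def information_profile(dist, event, order):
    """Return (I1, I2, I3) for `event` spoken in word order `order`, e.g. "SVO"."""
    slots = {"S": 0, "V": 1, "O": 2}  # tuple positions: agent, relation, patient
    current = dist
    h = [entropy(current)]            # H0: base entropy before any word
    for role in order:
        slot = slots[role]
        current = condition(current, slot, event[slot])
        h.append(entropy(current))    # H1, H2, H3 (H3 is zero)
    return tuple(h[i] - h[i + 1] for i in range(3))

# Hypothetical toy world (an assumption, far smaller than the 13-object world).
world = {
    ("girl", "eat", "apple"): 0.25,
    ("girl", "eat", "bread"): 0.25,
    ("boy", "eat", "apple"): 0.25,
    ("boy", "drink", "milk"): 0.25,
}

svo_profile = information_profile(world, ("girl", "eat", "apple"), "SVO")
vso_profile = information_profile(world, ("girl", "eat", "apple"), "VSO")
```

In this toy world the SVO profile for the chosen event is uneven (the verb adds nothing once the subject is known), which is the kind of departure from the straight-line ideal that the model measures.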

Word-order model

Slide8

Slide9

Word-order model

UID deviation score

Deviation of toy-world events from the “ideal information profile” according to UID
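One way to turn "deviation from the ideal information profile" into a number is sketched below. The slides do not give the exact formula, so this form is an assumption: it sums the absolute differences between each word's share of the total information and the ideal share of 1/3, then scales the result so that 0 means perfectly uniform and 1 means all information in a single word. The function name `uid_deviation` is hypothetical.

```python
def uid_deviation(profile):
    """Deviation of an information profile from the uniform ideal, in [0, 1].

    Assumed form: sum of |share_i - 1/n|, divided by its maximum value,
    which is reached when a single word carries all the information.
    """
    n = len(profile)
    total = sum(profile)
    ideal = 1.0 / n
    raw = sum(abs(i / total - ideal) for i in profile)
    worst = 2.0 * (1.0 - ideal)  # e.g. 4/3 for three words
    return raw / worst

uniform_score = uid_deviation((1.0, 1.0, 1.0))   # every word carries 1/3
front_loaded = uid_deviation((3.0, 0.0, 0.0))    # first word carries everything
uneven_score = uid_deviation((1.0, 0.0, 1.0))    # middle word carries nothing
```

Averaging such a score over all events, weighted by event probability, would give one score per word order, which is how a ranking like the one on this slide could be produced under these assumptions.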

VSO > VOS > SVO > OVS > SOV > OSV

Slide10

Corpus study

Child-directed speech (English and Japanese corpora)

Utterances involving singly transitive verbs

Ignored adjectives, plurality, tense, etc.

English: VSO (0.38), SVO (0.41), VOS (0.48), SOV (0.64), OSV (0.78), OVS (0.79)

Japanese: SVO (0.66), VSO (0.71), SOV (0.72), VOS (0.72), OSV (0.82), OVS (0.83)

Slide11

Slide12

Experiment

Languages must be optimal with respect to the frequencies of events in the real world

Judgement tasks for pairs of sentences (which one is more probable?)

VSO (0.17), SVO (0.18), VOS (0.20), SOV (0.23), OVS (0.23), OSV (0.24)

Slide13

Discussion

Object-first word orders are rare

Object-first word orders have the least uniform information density (the first word carries too much information)

SOV is less compatible with UID than its frequency in real languages would suggest, perhaps due to other important factors besides UID

The theme-first and animate-first principles (TFP and AFP) favor SOV, SVO (highest ranked in the results) and VSO; perhaps UID provides some justification for at least some word order rankings

Slide14

Conclusion

Findings are consistent with a weaker hypothesis: word order is optimal with respect to how frequently speakers choose to discuss events (not with respect to how often these events really occur)

UID may not explain all of the word order rankings, but it does explain several aspects of the empirical distribution of word orders

Slide15

A Noisy Channel Account of Crosslinguistic Word Order Variation

In 96.3% of studied languages S precedes O

SVO (English) and SOV (Japanese) are more prevalent than VSO

People construct sentences from an agent perspective – why SVO/SOV then?

Innate universal grammar – independent of communicative or performance factors

Slide16

Why SOV/SVO

Communicative-based explanation

SOV is the default for human language

Preference for S to precede O

Preference for the V to appear at the end of the clause

SVO arises from SOV as a result of communication/memory pressures that sometimes outweigh the second preference

Slide17

Shannon’s communication theory

Comprehension and production operate via a noisy channel

Speakers are under constraints to choose utterances that will ensure maximal meaning recoverability by the listener

When does word order affect how easily meaning can be recovered?

The girl kicks the ball. (people should adhere to SOV)

The girl kicks the boy. (potential confusion, perhaps resolved by the position of the noun with respect to the verb)

Slide18

Method

The study investigates whether gestured word order across languages (English: SVO; Japanese and Korean: SOV) depends on the semantic reversibility of the event

Three factors:

Initial bias to SOV

Initial bias to native language

Communicative or memory pressures

English: shift to SVO (second and third factors)

Japanese & Korean: shift to SVO (only due to the third factor)

Slide19

Method

Brief silent animations of intransitive/transitive events

Participants first verbally described the animations

Then they hand-gestured the meanings of the events

Verbal and gesture responses were coded for the relative position of the agent, action, and patient

Slide20

Slide21

Experiment 1

Animate/inanimate patients (reversible or non-reversible sentences)

More SVO word orders should be produced if reversible

Results – uniformly SVO for verbal responses

Gestured S before O for animate patients

Gestured V before O for human patients (as expected)

Overwhelmingly gestured SOV for non-reversible events

Slide22

Slide23

Experiment 1&2 – Japanese/Korean

English participants’ results can be explained without resorting to noisy-channel hypothesis

Participants may shift from SOV to native (SVO) due to increased ambiguity in reversible events

Thus, tested participants with an SOV native language

Expected shift to SVO in reversible events

Experiment 2 – used more complex structures

The old woman says that the fireman kicks the girl

Slide24

Slide25

If participants use native word-order (SOV)

Then they should gesture both levels of embedded events with the same order:

S1 [S2O2V2] V1

In case of reversible events SOV creates maximal potential confusion

Then they should gesture using SVO:

S1 V1 [S2V2O2]
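The contrast between the two gesture orders can be made concrete with a small sketch. It linearises the example "The old woman says that the fireman kicks the girl" in both orders and counts the longest run of adjacent nouns. Using that run length as a stand-in for "potential confusion" is an assumption made for illustration, as are the function names; the slides do not define a confusion measure.

```python
def linearize(order, s1, v1, s2, v2, o2):
    """Gesture sequence for a two-level event in the given order ("SOV" or "SVO")."""
    if order == "SOV":
        return [s1, s2, o2, v2, v1]   # S1 [S2 O2 V2] V1
    if order == "SVO":
        return [s1, v1, s2, v2, o2]   # S1 V1 [S2 V2 O2]
    raise ValueError(f"unsupported order: {order}")

def longest_noun_run(sequence, nouns):
    """Length of the longest stretch of consecutive nouns in the sequence."""
    best = run = 0
    for item in sequence:
        run = run + 1 if item in nouns else 0
        best = max(best, run)
    return best

nouns = {"old woman", "fireman", "girl"}
args = ("old woman", "say", "fireman", "kick", "girl")

sov_run = longest_noun_run(linearize("SOV", *args), nouns)
svo_run = longest_noun_run(linearize("SVO", *args), nouns)
```

Under SOV all three human participants are gestured in a row before any verb appears, whereas under SVO every noun is separated from the next by a verb.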

Experiment 1&2 – Japanese/Korean

Slide26

Slide27

Exp 1 results – native language word-order

J&K speakers verbalized patient before action (100%)

Gestured patient before action for both animate and inanimate patients

Exp 2 results – shift to SVO

J speakers never verbalized SVO; K speakers rarely

Both J&K speakers almost always gestured the top-level verb in 2nd position, between the top-level subject and the embedded subject

In the embedded clause patients were gestured before the action almost always, but more often in non-reversible events (both for J&K speakers)

Results predicted by noisy-channel but not by the combination of SOV default and native-language order

Experiment 1&2 – Japanese/Korean

Slide28

Experiment 3

Alternative explanation of previous results

Minimizing syntactic dependency distances

Number of words between a syntactic head (verb) and its dependents (subject and object)

Shorter dependencies are easier

Shift from SOV to SVO given that SVO allows for shorter dependency distances

Slide29

Experiment 3 - method

Animations of a boy and a girl interacting with one of a set of objects:

Circle/star/heart, which could be spotted/striped (surface), in a box/pail (container), or wearing a top/witch’s hat (headwear)

Giving/putting/intransitive event

Participants were to gesture each event and the features of the object

If participants are sensitive to the distance between agent and verb, then more SVO gesture orders are expected for longer patient descriptions
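The dependency-distance prediction can be illustrated by counting how many gestures fall between agent and action as the patient description grows. This is a sketch under an assumption not stated in the slides: each feature of the patient is gestured as a separate modifier right after the patient. The function name is hypothetical.

```python
def agent_action_distance(order, n_modifiers):
    """Number of gestures between the agent and the action."""
    patient = ["patient"] + ["modifier"] * n_modifiers
    if order == "SOV":
        sequence = ["agent"] + patient + ["action"]
    elif order == "SVO":
        sequence = ["agent", "action"] + patient
    else:
        raise ValueError(f"unsupported order: {order}")
    return sequence.index("action") - sequence.index("agent") - 1

# Distances for patients described with 0 to 3 features.
sov_distances = [agent_action_distance("SOV", k) for k in range(4)]
svo_distances = [agent_action_distance("SVO", k) for k in range(4)]
```

In SOV the distance grows with every added feature, while in SVO it stays at zero, which is why a dependency-distance account expects more SVO gestures for longer patient descriptions.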

No such shift is predicted by the noisy-channel account: the patient is not a possible agent of the verb, so adding modifiers will not affect the recoverability of who is doing what to whom

Slide30

Slide31

Gestured patient before action for most of events

Verbalized action before patient for most of events

Even with long productions, participants still gestured patient before action, consistent with the noisy-channel hypothesis and not with the dependency-distance hypothesis

Experiment 3 - results

Slide32

Discussion

English speakers have a strong SOV preference for non-reversible events even when the inanimate patient has up to 3 features to be gestured

SOV seems to be the preferred word order in human communication

For reversible events the preference for SOV disappears in favor of SVO

Although SOV natives gesture SOV in simple events, they shift to SVO for more complex ones

This shift to SVO occurs in order to maximize meaning recoverability

Slide33

Discussion

Case marking is often used in SOV

Mitigates the confusability of subject and object, helping to retain the default SOV

If no case marking is used, then SVO shift

A large majority of SOV languages are case marked, whereas few SVO languages are

Location in space was used as possible case marking in the experiments

Of the case-marked gestures most had SOV order
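Why case marking helps SOV in particular can be sketched with a simple noise model. The model is an assumption, not something spelled out in the slides: for a reversible event, one of the two nouns is lost in transmission, and the listener can still assign a role to the surviving noun only if the two possible losses leave different surface strings. The function names are hypothetical.

```python
def surface(order, lost_role, case_marked):
    """What the listener receives when the noun with `lost_role` is lost."""
    received = []
    for role in order:                 # order is a string such as "SOV"
        if role == lost_role:
            continue                   # this noun was lost to noise
        if role == "V":
            received.append("V")
        else:
            # A case-marked noun reveals its role; an unmarked one is just "N".
            received.append(role if case_marked else "N")
    return tuple(received)

def roles_recoverable(order, case_marked=False):
    """True if losing S and losing O produce distinguishable surface strings."""
    return surface(order, "S", case_marked) != surface(order, "O", case_marked)

sov_plain = roles_recoverable("SOV")                     # noun-verb either way
svo_plain = roles_recoverable("SVO")                     # noun-verb vs verb-noun
sov_marked = roles_recoverable("SOV", case_marked=True)  # the marker disambiguates
```

Under these assumptions unmarked SOV is the only configuration where the surviving noun's role is lost, which matches the pattern that SOV tends to come with case marking while SVO can rely on position relative to the verb.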

Animacy-dependent case marking

Many languages mark only animate direct objects

Non-SVO languages have more word-order flexibility than SVO languages

Contain other mechanisms for disambiguation

So fixed word orders are mostly SVO

Slide34

Slide35

Conclusion

No need for sophisticated innate machinery to explain word-order variation

Many aspects of crosslinguistic word-order variation are easily explained by communicative or memory pressures