Distributional models - PowerPoint Presentation

Uploaded On 2017-10-28


Presentation Transcript

Slide1

Distributional models

Katrin Erk

Slide2

You can get an idea of what a word means from observing it in context

He filled the wampimuk, passed it around, and we all drank some.

We found a little hairy wampimuk sleeping behind a tree.

(examples by Marco Baroni)

Distributional modeling:

Represent the meaning of a word through the contexts in which it is observed

Similar words appear in similar contexts

Slide3

What words can appear in these contexts?

Word 1: drown, bathroom, shower, fill, fall, lie, electrocute, toilet, whirlpool, iron, gin

Word 2: eat, fall, pick, slice, peel, tree, throw, fruit, pie, bite, crab, grate

Word 3: advocate, overthrow, establish, citizen, ideal, representative, dictatorship, campaign, bastion, freedom

Word 4: spend, enjoy, remember, last, pass, end, die, happen, brighten, relive

Slide4

What words can appear in these contexts?

Word 1 (bathtub): drown, bathroom, shower, fill, fall, lie, electrocute, toilet, whirlpool, iron, gin

Word 2 (apple): eat, fall, pick, slice, peel, tree, throw, fruit, pie, bite, crab, grate

Word 3 (democracy): advocate, overthrow, establish, citizen, ideal, representative, dictatorship, campaign, bastion, freedom

Word 4 (day): spend, enjoy, remember, last, pass, end, die, happen, brighten, relive

Slide5

What can you say about word number 5?

Word 1 (bathtub): drown, bathroom, shower, fill, fall, lie, electrocute, toilet, whirlpool, iron, gin

Word 2 (apple): eat, fall, ripe, slice, peel, tree, throw, fruit, pie, bite, crab, grate

Word 3 (democracy): advocate, overthrow, establish, citizen, ideal, representative, dictatorship, campaign, bastion, freedom

Word 4 (day): spend, enjoy, remember, last, pass, end, die, happen, brighten, relive

Word 5: eat, paint, peel, apple, fruit, juice, lemon, blue, grow

Slide6

What can you say about word number 5?

Word 1 (bathtub): drown, bathroom, shower, fill, fall, lie, electrocute, toilet, whirlpool, iron, gin

Word 2 (apple): eat, fall, ripe, slice, peel, tree, throw, fruit, pie, bite, crab, grate

Word 3 (democracy): advocate, overthrow, establish, citizen, ideal, representative, dictatorship, campaign, bastion, freedom

Word 4 (day): spend, enjoy, remember, last, pass, end, die, happen, brighten, relive

Word 5 (orange): eat, paint, peel, apple, fruit, juice, lemon, blue, grow

Slide7

Describing meaning through context

Similar words appear in similar contexts

apple, orange

Measure similarity in meaning as similarity in contexts

Caveat: If a word has multiple meanings, like “orange”, it will appear in a mixture of contexts

How to describe the contexts of a word?

Count other words nearby

Slide8

Counting context words for “apple”

They picked up red apples that had fallen to the ground

Eating apples is healthy

She ate a red apple

Pick an apple.

Word count, 3-word context window, lemmatized

a  be  eat  fall  have  healthy  pick  red  that  up
2  1   2    1     1     1        2     2    1     1

Slide9
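The counting procedure for “apple” can be sketched in a few lines of Python. This is a minimal illustration, not the method used to build the slides: the `LEMMAS` table is a hypothetical hand-written stand-in for a real lemmatizer and covers only the four example sentences, and the function names are invented here.

```python
from collections import Counter

# Hypothetical hand-written lemma table; it covers only the four example
# sentences. A real pipeline would use a proper lemmatizer instead.
LEMMAS = {
    "picked": "pick", "apples": "apple", "had": "have", "fallen": "fall",
    "eating": "eat", "is": "be", "ate": "eat", "an": "a",
}

SENTENCES = [
    "They picked up red apples that had fallen to the ground",
    "Eating apples is healthy",
    "She ate a red apple",
    "Pick an apple.",
]


def lemmatize(sentence):
    """Lowercase, strip punctuation, and map each token to its lemma."""
    tokens = [tok.strip(".,!?").lower() for tok in sentence.split()]
    return [LEMMAS.get(tok, tok) for tok in tokens]


def context_counts(sentences, target, window=3):
    """Count lemmas within `window` tokens on either side of `target`."""
    counts = Counter()
    for sentence in sentences:
        lemmas = lemmatize(sentence)
        for i, lemma in enumerate(lemmas):
            if lemma != target:
                continue
            left = lemmas[max(0, i - window):i]
            right = lemmas[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts


apple_counts = context_counts(SENTENCES, "apple")
for word in sorted(apple_counts):
    print(word, apple_counts[word])
```

With a window of three tokens on each side, “They” and “She” fall outside the window, which is why they do not appear in the count table.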

How can we compare two context word counts?

Count how often “apple” occurs close to other words in a large text collection (corpus):

eat  fall  ripe  slice  peel  tree  throw  fruit  pie  bite  crab
794  244   47    221    208   160   145    156    109  104   88

Interpret counts as coordinates:

[Figure: “apple” as a point in a coordinate system with axes “eat” and “fall”]

Every context word becomes a dimension.

Slide10

How can we compare two context word counts?

Count how often “apple” occurs close to other words in a large text collection (corpus):

eat  fall  ripe  slice  peel  tree  throw  fruit  pie  bite  crab
794  244   47    221    208   160   145    156    109  104   88

Do the same for “orange”:

eat  fall  ripe  slice  peel  tree  throw  fruit  pie  bite  crab
265  22    25    62     220   64    74     111    4    4     8

Slide11
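To treat two count tables as points in one space, both have to be laid out over the same ordered list of dimensions. The sketch below shows one way to do that. The counts are transcribed from the “apple” and “orange” tables on the slides; the function name `to_vectors` and the choice to sort dimensions alphabetically are assumptions made for this illustration.

```python
# Counts transcribed from the "apple" and "orange" tables on the slides.
APPLE = {"eat": 794, "fall": 244, "ripe": 47, "slice": 221, "peel": 208,
         "tree": 160, "throw": 145, "fruit": 156, "pie": 109, "bite": 104,
         "crab": 88}
ORANGE = {"eat": 265, "fall": 22, "ripe": 25, "slice": 62, "peel": 220,
          "tree": 64, "throw": 74, "fruit": 111, "pie": 4, "bite": 4,
          "crab": 8}


def to_vectors(count_tables):
    """Lay out every count table over one shared, sorted list of dimensions.

    A context word missing from a table gets a count of 0.
    """
    dimensions = sorted(set().union(*count_tables.values()))
    vectors = {
        word: [table.get(dim, 0) for dim in dimensions]
        for word, table in count_tables.items()
    }
    return dimensions, vectors


dimensions, vectors = to_vectors({"apple": APPLE, "orange": ORANGE})

# The two-dimensional picture on the slides uses only "eat" and "fall".
eat, fall = dimensions.index("eat"), dimensions.index("fall")
print("apple  (eat, fall):", (vectors["apple"][eat], vectors["apple"][fall]))
print("orange (eat, fall):", (vectors["orange"][eat], vectors["orange"][fall]))
```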

How can we compare two context word counts?

        eat  fall  ripe  slice  peel  tree  throw  fruit  pie  bite  crab
apple   794  244   47    221    208   160   145    156    109  104   88
orange  265  22    25    62     220   64    74     111    4    4     8

Then visualize both count tables as vectors in the same space:

[Figure: “apple” and “orange” as vectors in a space with axes “eat” and “fall”]

Similarity between two words as proximity in space

Slide13

What do we mean by “similarity” of vectors?

Euclidean distance (a dissimilarity measure!):

[Figure: the points “orange” and “apple”]

Slide14
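Euclidean distance between two count vectors is the square root of the summed squared differences across all dimensions. A minimal sketch, using only the “eat” and “fall” counts for “apple” and “orange” from the slides (the function name is chosen for this illustration):

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# (eat, fall) counts taken from the slides.
apple = [794, 244]
orange = [265, 22]

print(euclidean_distance(apple, orange))
```

Because this is a dissimilarity measure, a larger value means the two words are less alike, and a word has distance 0 to itself.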

Problem with Euclidean distance: very sensitive to word frequency!

[Figure: the vectors “Braeburn” and “apple”]

Slide15

What do we mean by “similarity” of vectors?

Cosine similarity:

[Figure: the vectors “orange” and “apple” and the angle between them]

Use angle between vectors instead of point distance to get around word frequency issues

Slide16
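The difference between the two measures shows up when a rare word is compared with a frequent one. In the sketch below, the “apple” and “orange” values are the (eat, fall) counts from the slides, while the “Braeburn” counts are invented for illustration: they stand for a rare word whose contexts resemble those of “apple”. The sketch assumes non-zero vectors.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# (eat, fall) counts from the slides.
apple = [794, 244]
orange = [265, 22]
# Invented counts for a rare word with apple-like contexts.
braeburn = [8, 2]

for name, vec in [("orange", orange), ("Braeburn", braeburn)]:
    print(name,
          "distance:", round(euclidean_distance(apple, vec), 1),
          "cosine:", round(cosine_similarity(apple, vec), 3))
```

Under these numbers, Euclidean distance places “Braeburn” farther from “apple” than “orange” is, simply because its counts are small, whereas cosine similarity ranks “Braeburn” as the closer word because its vector points in nearly the same direction.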

Using distributional models

Predicting word similarity (WordSim353):

doctor nurse 7.00

professor doctor 6.62

student professor 6.81

smart student 4.62

smart stupid 5.81

Doing the TOEFL test:

provisions: stipulations, interrelations, jurisdictions, interpretations
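One way a distributional model could answer such a multiple-choice item is to pick the candidate whose vector is most similar to the target's. The sketch below assumes cosine similarity as the measure. The three-dimensional vectors are arbitrary toy numbers invented for this example, not corpus counts, so the output illustrates only the mechanics.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def answer_synonym_question(target, candidates, vectors):
    """Return the candidate whose vector has the highest cosine with the target."""
    return max(
        candidates,
        key=lambda word: cosine_similarity(vectors[target], vectors[word]),
    )


# Toy vectors, invented for illustration only.
TOY_VECTORS = {
    "provisions": [8, 1, 6],
    "stipulations": [7, 2, 5],
    "interrelations": [1, 9, 2],
    "jurisdictions": [2, 3, 9],
    "interpretations": [3, 7, 3],
}

choice = answer_synonym_question(
    "provisions",
    ["stipulations", "interrelations", "jurisdictions", "interpretations"],
    TOY_VECTORS,
)
print(choice)
```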

Similar: synonyms, antonyms, topically related words

Slide17

Using distributional models

Predicting: priming effects, free associations

Finding (near-)synonyms: automatically building a thesaurus

Related: use distributional similarity of documents (containing similar words) in Information Retrieval

Slide18

Corpora in which to count words

Corpus = text collection

What works best for computing a distributional model?

“Moby Dick”

2 years of the Wall Street Journal

A collection of dating ads

A collection of webpage texts

Slide19

Corpora in which to count words

Need to be available electronically

Best possible match for today’s English language in general

Mixture of genres

Mixture of authors

Spoken and written

Larger is better

Slide20

Corpora in which to count (English) words

Brown Corpus

1 million words

Balanced corpus, mixture of genres

British National Corpus

100 million words

Balanced corpus, mixture of genres, spoken and written

Slide21

Corpora in which to count (English) words

English Gigaword corpus

1 billion words

Short news articles

Wikipedia dump

2 billion words

UKWaC (UK web as corpus)

2 billion words

Collection of webpages ending in .uk

Slide22

What can we do with our word counts?

… blackboard

Slide23

Background in philosophy of language: Wittgenstein, “meaning” as “use”

Man kann für eine große Klasse von Fällen der Benützung des Wortes ‘Bedeutung’ – wenn auch nicht für alle Fälle seiner Benützung – dieses Wort so erklären: Die Bedeutung eines Wortes ist sein Gebrauch in der Sprache. -- Wittgenstein, Philosophical Investigations

For a large class of cases – though not for all – in which we employ the word ‘meaning’ it can be explained thus: the meaning of a word is its use in the language. (translation: Anscombe/Stokhof)

Slide24

Background in linguistics: Harris and Firth

Zellig Harris (1957): Classify linguistic units by observing the contexts they occur in

phonemes

morphemes

phrase types: noun phrase, verb phrase…

Not specifically about semantics

John Firth (1957): “collocations”

identify senses of a word by looking at groups of contexts in which it appears

“You shall know a word by the company it keeps”

Slide25

Background in psychology

Landauer/Dumais 1997: “A solution to Plato’s problem”

How come you know so many words?

“A typical American seventh grader knows the meaning of 10-15 words today that she did not know yesterday. She must have acquired most of them as a result of reading because (a) the majority of English words are used only in print, (b) she already knew well almost all the words she would have encountered in speech, and (c) she learned less than one word by direct instruction.”

Slide26

Background in psychology

Many phenomena to do with word meaning can be simulated with distributional models

Word similarity ratings (WordSim353):

doctor nurse 7.00

professor doctor 6.62

student professor 6.81

smart student 4.62

smart stupid 5.81

How would you simulate this with a distributional model?

Slide27
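One possible way to simulate the word similarity ratings is to compute a model score for each word pair (for example a cosine similarity) and check how well the model's ranking of the pairs agrees with the human ranking. The sketch below uses Spearman rank correlation for that comparison, a common choice but an assumption here. The human ratings are the five pairs from the slide; the model scores are invented for illustration, and the rank formula assumes there are no ties.

```python
HUMAN_RATINGS = {
    ("doctor", "nurse"): 7.00,
    ("professor", "doctor"): 6.62,
    ("student", "professor"): 6.81,
    ("smart", "student"): 4.62,
    ("smart", "stupid"): 5.81,
}

# Hypothetical model scores (e.g. cosine similarities), invented for illustration.
MODEL_SCORES = {
    ("doctor", "nurse"): 0.71,
    ("professor", "doctor"): 0.55,
    ("student", "professor"): 0.62,
    ("smart", "student"): 0.20,
    ("smart", "stupid"): 0.58,
}


def ranks(values):
    """Rank of each value (1 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


pairs = list(HUMAN_RATINGS)
rho = spearman(
    [HUMAN_RATINGS[p] for p in pairs],
    [MODEL_SCORES[p] for p in pairs],
)
print(f"Spearman correlation: {rho:.2f}")
```

A value near 1 would mean the model orders the pairs much as the human raters do; with only five pairs and made-up scores, the number printed here says nothing about any real model.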

Background in psychology

Many phenomena to do with word meaning can be simulated with distributional models

Priming

Hearing one word makes you react faster to a related word

Hodgson data:

election – vote

dove – peace

vase – flower

NOT: election – peace

How would you simulate that with a distributional model?

Slide28

Background in psychology

Are our mental representations of words, of concepts distributional?

Do they represent the contexts in which a word has been encountered?

Does the meaning of a word depend on the contexts in which it is used?

Slide29

Background in psychology

Concepts as distributional:

Yes (Landauer and others):

How else would we learn so many words?

Text as one of the main ways in which we encounter words

No (Barsalou and others):

Perception is central to how we represent concepts

Distributional and perceptual (Andrews/Vigliocco/Vinson and others)

Both types of information seem to be relevant

Simulations: Combining both makes for a better model