
Intersecting Cognitive Linguistics and - PowerPoint Presentation

Uploaded On 2019-06-26



Presentation Transcript

Slide1

Intersecting Cognitive Linguistics and Word Vectors to Take Figurative Language to New Heights

Do or Do Not. There Is No Try: Discourse-Level Style in Quotations

Kyle Booten, Andrea Gagliano, Emily Paul, Marti Hearst

UC Berkeley

Google Research, Oct 24, 2017

Slide2

Motivation:

Support Creativity

Slide3

eye

sun

Slide4

eye

sun

orb

What would Shakespeare do?

Slide5

Combine Two Ideas from Cognitive Linguistics

Family Resemblances

Everyday Metaphor

Slide6

Wittgensteinian family resemblances

Rosch and Mervis. 1975. Family resemblances: Studies in the internal structure of categories.

games

board games

ball games

video games

golf

card games

Common features, but no single unifying attribute

sun

solar

sphere

heat

light

orbit

orange

flame

Slide7

In the middle of life’s road,
I found myself in a dark wood.
-- Dante, Divine Comedy

Metaphor in poetry

And all our yesterdays have lighted fools
The way to dusty death.
-- Shakespeare, Macbeth

Metaphor used: Life is a Journey

Lakoff and Turner. 1989. More than cool reason: A field guide to poetic metaphor.

Slide8

Metaphor “mak[es] use of structure imported from a completely different conceptual domain” —Lakoff and Turner

Lakoff and Turner. 1989. More than cool reason: A field guide to poetic metaphor.

Slide9

Blending of Semantic Spaces

In the middle of life’s road,
I found myself in a dark wood.
-- Dante, Divine Comedy

life

road

Distinct semantic spaces

journey quest

way

. . .

Our goal: computationally suggest words to create a figurative relationship

Lakoff and Turner. 1989. More than cool reason: A field guide to poetic metaphor.

Fauconnier and Turner. 2008. The way we think: Conceptual blending and the mind’s hidden complexities.

Dante poetically describes middle age as being lost in a wood while traveling down life’s path.

Slide10

Some recent work on computational metaphor

“Surgeons are butchers”

Veale et al. (2000)
Model based on graph isomorphism with slippage across analogical relations.

Veale, O’Donoghue, and Keane. 2000. Computation and blending.

Slide11

Recent work on computational metaphor

Search query: “as adj as a | an N” -> “As hot as an oven”
“Solemn” yields related words: {monument, judge, owl, funeral, etc.}

Veale and Hao (2007 & 2008)
Develops a case base of similes via web search

Veale and Hao. 2007. Comprehending and generating apt metaphors: a web-driven, case-based approach to figurative language.; Veale and Hao. 2008. A fluid knowledge representation for understanding and generating creative metaphors.

Slide12

Recent work on computational metaphor

Harmon (2015)
Builds a sentence based on meaning relations between two nouns
Ranks according to conventionality
Prefers words that are not conceptually similar

Harmon. 2015. Figure8: A novel system for generating and evaluating figurative language.

Slide13

Our Idea:

Using word embeddings as family resemblances to blend semantic spaces.

Slide14

Mikolov, Yih, Corrado, and Dean. 2013c. Linguistic regularities in continuous space word representations.

Word2vec word embeddings

The king ruled over his land.

context

Conversion Step

Slide15

Using word embeddings for family resemblances

Schütze. 1993. Word space.; Rosch and Mervis. 1975. Family resemblances: Studies in the internal structure of categories.

Slide16

Semantic similarity from Word Embeddings

Mikolov, Chen, Corrado, and Dean. 2013a. Efficient estimation of word representations in vector space.; Mikolov, Sutskever, Chen, Corrado, and Dean. 2013b. Distributed representations of words and phrases and their compositionality.; Mikolov, Yih, Corrado, and Dean. 2013c. Linguistic regularities in continuous space word representations.

Slide17

Idea: use word embeddings for family resemblances

life

family

living

everyday

humanity

childhood

society

motherhood

No single unifying attribute

Slide18

Mikolov, Chen, Corrado, and Dean. 2013a. Efficient estimation of word representations in vector space.; Mikolov, Sutskever, Chen, Corrado, and Dean. 2013b. Distributed representations of words and phrases and their compositionality.; Mikolov, Yih, Corrado, and Dean. 2013c. Linguistic regularities in continuous space word representations.

Standard word2vec usage

Slide19

Standard word2vec usage

?

Slide20

Standard word2vec usage

Slide21

Standard word2vec usage

Slide22

Standard word2vec usage

Slide23

Standard word2vec usage

Slide24

Standard word2vec usage

Slide25

Standard word2vec usage

Slide26

Example: surrendering & storm

Anchor words

Connector word

?

surrendering

storm

What words can you think of?

Slide27

Using word2vec to approximate intersection of Lakoff & Turner metaphor frames

?

surrendering

losing

giving

yielding

conceding

relinquishing

storm

hurricane

tornado

squall

snowstorm

typhoon

Wittgenstein family resemblance from semantically close vectors.

Lakoff and Turner importing from different conceptual domains

Slide28

Baseline (Addition) Model vector representation

Slide29

Baseline (Addition) Model vector representation

Slide30

Baseline (Addition) Model vector representation

Slide31

Sample output of Baseline (Addition) Model

hurricane
tornado
deluge
flooding
downpour
rain
rainstorm
twister
squall
...

Words most similar to:
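A minimal sketch of how a Baseline (Addition) Model of this kind can be computed, assuming it ranks vocabulary words by cosine similarity to the sum of the two anchor vectors. The function names and the 2-D `toy_vectors` are hypothetical and hand-made for illustration; they are not real word2vec embeddings, and the slides do not show the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def addition_model(vectors, anchor_a, anchor_b, top_n):
    """Rank non-anchor words by similarity to the summed anchor vectors."""
    target = [a + b for a, b in zip(vectors[anchor_a], vectors[anchor_b])]
    candidates = [w for w in vectors if w not in (anchor_a, anchor_b)]
    ranked = sorted(candidates,
                    key=lambda w: cosine(vectors[w], target),
                    reverse=True)
    return ranked[:top_n]

# Hypothetical toy vectors (assumption): "storm" is given a larger
# magnitude so that it dominates the sum, as an illustration only.
toy_vectors = {
    "storm": [3.0, 0.0],
    "surrendering": [0.0, 1.0],
    "hurricane": [0.98, 0.05],
    "yielding": [0.05, 0.98],
    "barrage": [0.6, 0.6],
}

print(addition_model(toy_vectors, "storm", "surrendering", 2))
```

In this toy setup the summed vector sits close to "storm", so a storm-like word ranks first, which resembles the pattern in the sample output above.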

Slide32

Using word2vec to approximate intersection of Lakoff & Turner metaphor frames

?

surrendering

losing

giving

yielding

conceding

relinquishing

storm

hurricane

tornado

squall

snowstorm

typhoon

Wittgenstein family resemblance from semantically close vectors.

Lakoff and Turner importing from different conceptual domains

Slide33

eye

sun

Slide34

eye

sun

orb

Slide35

storm

hurricane

tornado

squall

snowstorm

typhoon

surrendering

losing

giving

yielding

conceding

relinquishing

NEW: Intersection Model in word2vec

n = 1000

n = 1000

Slide36

storm

hurricane

tornado

squall

snowstorm

typhoon

surrendering

losing

giving

yielding

conceding

relinquishing

NEW: Intersection Model in word2vec

n = 1000

n = 1000

barrage

dissipating

onslaught

...
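A minimal sketch of the intersection idea, assuming it takes the top-n nearest neighbours of each anchor word by cosine similarity and keeps the words that appear in both lists. The function names and `toy_vectors` are hypothetical; real use would load word2vec embeddings and a much larger n (the slide shows n = 1000).

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(vectors, anchor, n):
    """Top-n words most similar to the anchor, excluding the anchor itself."""
    others = [w for w in vectors if w != anchor]
    ranked = sorted(others,
                    key=lambda w: cosine(vectors[w], vectors[anchor]),
                    reverse=True)
    return ranked[:n]

def intersection_model(vectors, anchor_a, anchor_b, n):
    """Words found in both top-n lists, with their rank in each list."""
    near_a = nearest_neighbors(vectors, anchor_a, n)
    near_b = nearest_neighbors(vectors, anchor_b, n)
    rank_b = {w: i + 1 for i, w in enumerate(near_b)}
    return [(w, i + 1, rank_b[w]) for i, w in enumerate(near_a) if w in rank_b]

# Hypothetical hand-made 2-D vectors (assumption), not real embeddings.
toy_vectors = {
    "storm": [1.0, 0.0],
    "surrendering": [0.0, 1.0],
    "hurricane": [0.98, 0.05],
    "tornado": [0.95, 0.1],
    "yielding": [0.05, 0.98],
    "conceding": [0.1, 0.95],
    "barrage": [0.6, 0.6],
}

print(intersection_model(toy_vectors, "storm", "surrendering", 3))
print(intersection_model(toy_vectors, "storm", "surrendering", 2))
```

With these toy vectors only "barrage" falls inside both neighbourhoods at n = 3, and nothing does at n = 2, which illustrates why the choice of n matters.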

Slide37

Intersection Model

vector representation

Slide38

Intersection Model

vector representation

Slide39

Intersection Model

vector representation

Slide40

Sample output of Intersection Model

Words most similar to surrendering: relinquishing, yielding, giving, conceding, losing

Words most similar to storm: hurricane, snowstorm, tornado, squall, typhoon

Slide41

Sample output of Intersection Model

Words most similar to surrendering: relinquishing, yielding, giving, conceding, losing, ..., barrage (n = 932), ...

Words most similar to storm: hurricane, snowstorm, tornado, squall, typhoon, ..., barrage (n = 288), ...

Slide42

Observing differences between models

Setup of example word pairs
Quantitative observations
Qualitative observations

Slide43

Setup: Generate word pairs for observation

A poetic theme and a concrete noun were selected as anchor words because Kao and Jurafsky (2015) found that professional poetry uses more concrete language.

Anchor words

Connector word

Kao and Jurafsky. 2015. A computational analysis of poetic style: Imagism and its influence on modern professional and amateur poetry.

Concrete noun

Poetic theme

?

Slide44

Example: surrendering & storm

Anchor words

Connector word

?

surrendering

storm

Slide45

Example: Connector words for surrendering & storm

Unique to Baseline Model: squall, tornado, typhoon, snowstorm, flooding, rainstorm, deluge, hurricane, ...

Unique to Intersection Model: onslaught, stranding, blowing, dissipating, barrage, regrouped, batter, outburst, ...

For any sets Baseline and Intersection of similar size, there are likely to be words unique to each set, as well as words shared between these sets; a proof is in the paper.

Slide46

Observing differences between models

Setup of example word pairs
Quantitative observations
Qualitative observations

Slide47

Unique to Intersection Model | Similarity to storm | Similarity to surrendering
onslaught | .30 | .20
stranding | .27 | .28
blowing | .24 | .29
dissipating | .23 | .22
barrage | .25 | .20
regrouped | .19 | .31
batter | .22 | .25
outburst | .21 | .20
...

Average spread between similarity scores = 0.05

Unique to Baseline Model | Similarity to storm | Similarity to surrendering
squall | .63 | -.03
tornado | .64 | -.02
typhoon | .62 | -.01
snowstorm | .64 | .01
flooding | .57 | .01
rainstorm | .57 | .07
deluge | .50 | .08
hurricane | .73 | .04
...

Average spread between similarity scores = 0.56

Observing quantitative differences of connector words
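A short worked example of the spread figure, assuming "spread" means the absolute difference between a connector word's similarity to storm and its similarity to surrendering, averaged over the words. The dictionaries hold only the eight rows listed on the slide for each model; the variable and function names are hypothetical.

```python
# (similarity to storm, similarity to surrendering) for the listed rows only.
intersection_rows = {
    "onslaught": (0.30, 0.20), "stranding": (0.27, 0.28),
    "blowing": (0.24, 0.29), "dissipating": (0.23, 0.22),
    "barrage": (0.25, 0.20), "regrouped": (0.19, 0.31),
    "batter": (0.22, 0.25), "outburst": (0.21, 0.20),
}
baseline_rows = {
    "squall": (0.63, -0.03), "tornado": (0.64, -0.02),
    "typhoon": (0.62, -0.01), "snowstorm": (0.64, 0.01),
    "flooding": (0.57, 0.01), "rainstorm": (0.57, 0.07),
    "deluge": (0.50, 0.08), "hurricane": (0.73, 0.04),
}

def average_spread(rows):
    """Mean absolute difference between the two similarity scores."""
    spreads = [abs(to_storm - to_surrendering)
               for to_storm, to_surrendering in rows.values()]
    return sum(spreads) / len(spreads)

print(round(average_spread(intersection_rows), 2))
print(round(average_spread(baseline_rows), 2))
```

On the listed rows alone this gives roughly 0.05 for the Intersection Model and roughly 0.59 for the Baseline Model. The slide's 0.56 presumably averages over the full list, which is elided here.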

Slide48

Balanced band of similarity

Anchor word pairs | Range of average similarity from words in Baseline Model to anchor words | Range of average similarity from words in Intersection Model to anchor words
flame & caring | 0.13 – 0.58 | 0.22 – 0.30
color & earthly | 0.17 – 0.55 | 0.28 – 0.32
hair & anguish | 0.14 – 0.66 | 0.27 – 0.33
flame & killing | 0.09 – 0.55 | 0.23 – 0.26
mouth & compassion | 0.16 – 0.54 | 0.25 – 0.29
storm & surrendering | 0.03 – 0.69 | 0.21 – 0.26
ring & mankind | 0.11 – 0.57 | 0.21 – 0.34
...

Slide49


Balanced band of similarity

Band of similarity: ~0.25 to ~0.30

Slide50

Observing differences between models

Setup of example word pairs
Quantitative observations
Qualitative observations

Slide51

Setup: Construct dataset of figurative relationships

List B: onslaught, stranding, blowing, dissipating, barrage, regrouped, batter, outburst, ...

List A: squall, tornado, typhoon, snowstorm, flooding, rainstorm, deluge, hurricane, ...

OR

Mechanical Turkers were shown one of these lists and asked to select the word that “best connects the anchor words in a poetic sense (e.g. using a double meaning, creating a new image, creating an interesting relationship, etc.)”

?

surrendering

storm

Slide52

Setup: Construct dataset of figurative relationships (continued)

Mechanical Turkers in set A first picked the best word from each list.
Set B then completed this sentence to draw the connection:
“Barrage connects storm and surrendering because ___________”
These completions were manually assessed for how figurative their language was and whether they blended semantic spaces.

surrendering

storm

Selected: barrage

Slide53

Observing qualitative differences of connector words

“Hurricane connects storm and surrendering because it is a type of storm and those who surrender to it are spared, like grass and those who stand against it are devastated, like big trees.”

“Barrage connects storm and surrendering because a storm is a barrage of bad weather like winds and rain people surrender when they feel a barrage of overwhelming things coming at them.”

Baseline connector word: hurricane (similarity to storm .73, similarity to surrendering .04)

Intersection connector word: barrage (similarity to storm .25, similarity to surrendering .20)

Slide54

Observing qualitative differences of connector words

“Hues connects color and earthly because hues imply various colors, shades, or characteristics and hues can be earthly in tone, such as blues, greens and browns.”

“Radiant connects color and earthly because radiant means a bright color that looks like it’s shining and at night, the earthly sky is radiant because it shines brightly with the stars.”

[Figure: hues (Baseline connector word) and radiant (Intersection connector word), each linked to the anchor words color and earthly; similarity scores shown: .09, .22, .29, .61]

Slide55

Conclusion

Slide56

Conclusion

Balanced similarity scores lead to heightened effects.

The Intersection Model may achieve this more reliably:

Band of similarity: ~0.25 to ~0.30

Concrete noun

Poetic theme

?

Slide57

Contributions

Word embeddings combined with classic cognitive linguistics to generate figurative language.

New way to compute using word2vec representation

Observations about balance of similarity scores

life

family

living

everyday

humanity

childhood

motherhood

Anchor word pairs | Range of average similarity from words in Baseline Model to anchor words | Range of average similarity from words in Intersection Model to anchor words
flame & caring | 0.13 – 0.58 | 0.22 – 0.30
color & earthly | 0.17 – 0.55 | 0.28 – 0.32
hair & anguish | 0.14 – 0.66 | 0.27 – 0.33

Slide58

Questions/Future Work

Hypothesis testing / evaluation around the band of similarity

Threshold testing on size of semantic space

n = ?

Input from poets vs. Mechanical Turkers

Stronger grounding in Lakoff & Turner metaphorical representation

Reassess anchor words of poetic theme and concrete noun

?

?

Anchor words

Band of similarity: ~0.25 to ~0.30

Slide59

Intersecting Cognitive Linguistics and Word Vectors to Take Figurative Language to New Heights

Do or Do Not. There Is No Try: Discourse-Level Style in Quotations

Kyle Booten, Andrea Gagliano, Emily Paul, Marti Hearst
UC Berkeley
Google Research, Oct 24, 2017

Best short paper runner-up, NAACL 2016

Slide60

What is the role of literary discourse in the age of social media?

Slide61

Question: what motivates people to choose the quotes they do to post on social media?

This question motivated us to investigate what is special about quotations vs prose.

Overall finding: part of what unites quotes as a genre is their latent stylistic patterns.

Slide62

Related work

Danescu-Niculescu-Mizil et al. (2012), “You had me at hello”, used features to distinguish popular movie quotations from unmemorable lines from the same film.

They found that popular quotations tended to use less common words, but these were placed into more common syntactic contexts.

They hint at “common syntactic scaffolding”, which we try to uncover here.

They also find that quotations tend to contain linguistic features that make them more “generalizable”, such as a tendency toward indefinite over definite articles.

Slide63

Related work

Guerini et al. (2015) found that memorable quotes were more euphonic, with more instances of rhyme and alliteration than non-memorable counterparts.

Louis and Nenkova (2012), studying text coherence, found that certain types of sentences (in terms of syntactic structure) tend to follow certain other types of sentences.

They also demonstrated that sentences with similar communicative purposes are syntactically similar.

Slide64

Slide65

Slide66

Slide67

Tumblr Statistics

(In 2014)
16th most popular site in US
5th most popular social network
>160M users

Chang et al. 2014. What is tumblr: A statistical overview and comparison. KDD 16(1)

Slide68

The Ancient Arts of Rhetoric

Phonetic patterns: rhyme; alliteration

Rhetoric: ways of wielding language to make it persuasive or memorable.

A predecessor to modern linguistics was the description of rhetorical “tropes” or “figures”.

Syntactic patterns:

epistrophe: successive clauses end with the same words;

pysma: the speaker launches a series of sharp and vehement questions.

Slide69

Idea: Focus on Two-Sentence Quotations

Compare first sentence to second sentence.

See if there is a difference between quotations and prose pairs.

Slide70

Tumblr Quotes seem to have certain “styles”

“You cannot observe people through an ideology. Your ideology observes for you.”
-- Philip Roth

Syntactic patterns:

Antimetabole: words in the first clause appear reversed in the second.

Negative statement (cannot) followed by a positive one.

First sentence begins with generic “you”.

Slide71

Tumblr Quotes seem to have certain “styles”

“You cannot observe people through an ideology. Your ideology observes for you.”
-- Philip Roth

Syntactic patterns:
Antimetabole: words in the first clause appear reversed in the second.
Negative statement (cannot) preceded by a positive one. (Do. Or do not.)
First sentence begins with generic “you”.

“Great things are done when men and mountains meet.
This is not done by jostling in the street.”
-- William Blake

Slide72

Other Features Observed and Detected

“Forgotten is forgiven.”

Syntactic features:

High-level syntax (from Feng et al.’s 2012 authorship detection work, which they argue provides an interpretable representation of the sentence structure):

NP + VP + .

Slide73

Other Features Observed and Detected

“Peace comes from within. Do not seek it without.”

Lexical features:

General/abstract words like “peace”.

Computed as “is most common sense of head word within 5 hops of WordNet synset Abstraction.n.06?”

Similar for General.

Slide74

Datasets

Quotations 1: Tumblr posts marked with the quote type and #quotations hashtag and precisely 2 sentences long. (N=4237) Training

Quotations 2: Tumblr posts marked with the quote type and #quote hashtag and precisely 2 sentences long. (N=1846) Testing

Non-quotes 1: Random 2-sentence long paragraphs from the Brown corpus (N=1846)

Non-quotes 2: Randomly chosen 2-sentence sequences from longer paragraphs in Brown (N=1846)

Slide75

Which Features Preferentially Occur in Sentence 1 in Quotation Corpus?

Lexical features:
“n’t” and “not”, signifying that negative sentences were more likely to occur in the first sentence of a quotation
Similarly, “never”, as well as “do”

Do not worry about your difficulties in mathematics. I can assure you mine are still greater. (A. Einstein)

We are not human beings having a spiritual experience. We are spiritual beings having a human experience. (P. de Chardin)
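A rough sketch of how the negation cues named above ("n't", "not", "never") could be located in each sentence of a two-sentence quotation. The regular expression and function names are hypothetical simplifications, not the feature extractor used in the study, and the "do" feature is left out because it is not a negation cue on its own.

```python
import re

# Matches the standalone words "not" / "never", or a contracted "n't"
# written with either a straight or a curly apostrophe.
NEGATION_PATTERN = re.compile(r"\b(?:not|never)\b|n['’]t\b", re.IGNORECASE)

def has_negation_cue(sentence):
    """True if the sentence contains one of the negation cues."""
    return NEGATION_PATTERN.search(sentence) is not None

def negation_position(first, second):
    """Report which sentence of a two-sentence text carries a negation cue."""
    return {"sentence_1": has_negation_cue(first),
            "sentence_2": has_negation_cue(second)}

print(negation_position(
    "Do not worry about your difficulties in mathematics.",
    "I can assure you mine are still greater.",
))
```

Counting how often `sentence_1` versus `sentence_2` is true across a corpus would give a crude version of the positional preference described on this slide.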

Slide76

Which Features Preferentially Occur in Sentence 1 in Quotation Corpus?

Syntactic features:
Questions
When
“Sweeping Declarations”

Love is a trap. When it appears, we see only its light, not its shadows. (P. Coelho)

Slide77

Which Features Preferentially Occur in Sentence 2 in Quotation Corpus?

Syntactic features:
CC + NP + VP + .
But
And
This is not surprising given the role of coordinating conjunctions, but it was 6 times more likely for Quotes than for the Non-Quotes 2 comparison corpus.
IN + NP + VP + .

Where a goat can go, a man can go. And where a man can go, he can drag a gun. (William Phillips)

One of the most adventurous things left us is to go to bed. For no one can lay a hand on our dreams. (E.V. Lucas)

Slide78

Which Features Preferentially Occur in Sentence 2 in Quotation Corpus?

Lexical features:
Simply: a specific rhetorical pattern that emphasizes the second sentence’s proposition with respect to the first’s.

I used to dream about escaping my ordinary life, but my life was never ordinary. I had simply failed to notice how extraordinary it was. (R. Riggs)

Slide79

Quote Sentence Ordering Task

Training: see quotes with sentences either in order or reversed (S1 S2 or S2 S1).

Testing: try to predict the order.

Hypothesis: can better predict the order of quotations than prose.
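A minimal sketch of how training examples for such an ordering task could be built, assuming each two-sentence text is shown either in its original order or reversed, with a label recording which. The function name, label strings, and 50/50 split are hypothetical choices, and the classifier and its features are not shown.

```python
import random

def make_ordering_examples(pairs, seed=0):
    """Turn (sentence_1, sentence_2) pairs into labelled ordering examples."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    examples = []
    for first, second in pairs:
        if rng.random() < 0.5:
            examples.append(((first, second), "in_order"))
        else:
            examples.append(((second, first), "reversed"))
    return examples

# Two quotations taken from the slides, used here only as sample input.
quotes = [
    ("Love is a trap.",
     "When it appears, we see only its light, not its shadows."),
    ("Do not worry about your difficulties in mathematics.",
     "I can assure you mine are still greater."),
]

for (shown_first, shown_second), label in make_ordering_examples(quotes):
    print(label, "|", shown_first, "|", shown_second)
```

A model trained on such examples from quotations and, separately, from prose could then be compared on how accurately it recovers the label.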

Slide80

Datasets

Quotations 1: Tumblr posts marked with the quote type and #quotations hashtag and precisely 2 sentences long. (N=4237) Training

Quotations 2: Tumblr posts marked with the quote type and #quote hashtag and precisely 2 sentences long. (N=1846) Testing

Non-quotes 1: Random 2-sentence long paragraphs from the Brown corpus (N=1846)

Non-quotes 2: Randomly chosen 2-sentence sequences from longer paragraphs in Brown (N=1846) T

Slide81

Quote Sentence Ordering Task

Training: see quotes with sentences either in order or reversed (S1 S2 or S2 S1).

Testing: try to predict the order.

Hypothesis: can better predict the order of quotations than prose.

Significant, two-tailed t-test, p<.01

This is evidence that quotations as a genre are more “formulaic” than other textual sequences, their order more easily predicted.

Confirmed!

Slide82

Conclusions

This stylistic patterning may be especially strong in quotations.

Analyzed linguistic style not merely as the presence of features, but also their order across sentences.

In quotations, certain words as well as categories of words and syntactic patterns are more likely to appear in the first or second of a pair of two-sentence texts.

Slide83

Another Project: NewsLens

Philippe Laban, Marti Hearst,

ACL 2017 Workshop on Events and Stories in the News

Slide84

Slide85

Slide86

http://newslens.berkeley.edu/lanes/

Slide87

A Major ACL Initiative: arXiv / Preprint Policy

Slide88

Highlights of the New ACL Policies for Submission, Review, & Citation

https://www.aclweb.org/adminwiki/index.php?title=ACL_Policies_for_Submission,_Review_and_Citation

Submission

Review / Citation

*ACL and TACL submissions must be anonymous
A submission is not considered anonymous if posted to a preprint server within an anonymity period
Non-anonymized submissions before the anonymity period are allowed, but discouraged

When reviewing, read the paper first and form an opinion before any searching.

For citation, refereed citations take priority over preprints

Papers (refereed or not) that appear within 3 months of submission should be considered contemporaneous

Slide89

Appendices

Slide90

Slide91

Proof

Starting with the set A, from the Addition Model:

Alpha is the minimum similarity threshold resulting from selecting top n.

Slide92

Proof (continued)

The set I, from the Intersection Model:

Beta and Gamma are the minimum similarity thresholds resulting from selecting top n.

Slide93

Proof (continued)

If we were finding the single word vector that maximized (1) and (2), the two equations would be equivalent (Levy and Goldberg 2014), such that (3) would need to be satisfied:

Note, (3) assumes the vectors are length normalized.

Slide94

Proof (continued)

We then expand (3) as follows:

Slide95

Proof (continued)

We can then solve (4) as follows:

But, (5) is not necessarily always true. Thus, the initial assumption that the intersection and addition models contain the same word vectors is contradicted. This confirms that the set A does not equal the set I.

Slide96

Poetic themes and concrete nouns

Poetic themes

loss melancholy anger
animals calmness compassion
confusion death envy
faith fear forgiveness
freedom friendship god
grace gratitude grief
hate hope immortality
jealousy joy life
mothers nature peace
people religion remembrance
...

Concrete nouns

bed ear finger

horse sand hair

bell grass rock

book rose breast

ship blood window

wing girl snow

wood ring body

room wine ground

mouth garden stone

storm brain flame

...

Slide97

Anchor word pairs selected

flame & caring storm & surrendering

color & earthly ring & mankind

hair & anguish hair & envied

flame & killing book & liberties

mouth & compassion town & grieving

Slide98

eye

sun

Slide99

eye

sun