Exploring differences in two “languages”

Uploaded On 2024-03-15



Presentation Transcript

1. Exploring differences in two “languages”. Issues analyzed in Kleinberg (2004, Data Stream Management 2016), with a Markov model applied for temporal analysis. Presentation and figures from slide 4 onward follow Monroe, Colaresi and Quinn, Political Analysis (2008). CS/INFO 6742, lightly adapted from a section of the Danescu-Niculescu-Mizil and Lee NeurIPS 2016 tutorial, http://www.cs.cornell.edu/~cristian/index_files/NIPS_NLP_for_CSS_tutorial.pdf

2. Example application: frame competition. Example: public discussion of GMOs in food, “frankenfood” vs. “green revolution”. http://www.ourbreathingplanet.com/control-the-world-through-genetically-modified-food/

3. Additional applications: differentiating the language of “…” vs. “…”: successful vs. unsuccessful persuaders; language in one time period vs. another; … your experimental condition A vs. your experimental condition B!! Also good for sanity-checking your data…

4. Example: 106th U.S. Senate speeches on abortion. “Frames” → words we might expect from Democrats: “… women’s rights …”, “… privacy ...”. “Frames” → words we might expect from Republicans: “... unborn children ...”, “... murder ...”. Assume a joint vocabulary of terms $v_i$, and let $p_{\text{blue}}(v_i)$ and $p_{\text{red}}(v_i)$ be the observed relative frequencies of $v_i$ in the blue and red samples.

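5. Ranking idea: top and bottom 20 words according to the difference between a word's relative frequencies in the two samples. One extreme: life, born, fact, a, ar, but, perform, it, child, mother, you, that, be, kill, not, procedur, babi, of, abort, the. Other extreme: to, women, right, senat, their, amend, woman, her, my, and, decis, famili, doctor, make, health, for, will, friend, court, law. Some of these words are important, but would be lost with stopword filtering.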

6. Aside: “stopword removal” not recommended. Very frequent terms have been proving “increasingly” useful, e.g., for stylistic or psychological cues. “a” vs. “the” is surprising [for years LL assumed this was a bug, but see Language Log, Jan 3 2016: “The case of the missing determiners”].

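7. Difference in relative frequencies vs. count. Words labeled on the plot: to, women, right, senat, their, amend, woman at one extreme; kill, not, procedur, babi, of, abort, the at the other. This ranking favors big counts, i.e., words towards the right-hand side of the plot (you can’t have a large difference between two small relative frequencies).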
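A minimal sketch of the ranking on slides 4 to 7, assuming the ranking quantity is the difference in relative frequencies (which is what the "large difference" remark suggests). The function names and the toy token lists are illustrative assumptions, not material from the slides. The final check shows why this ranking favors big counts: the absolute difference can never exceed the larger of the two relative frequencies.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Observed relative frequency of each term in one sample."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def rank_by_difference(blue_tokens, red_tokens):
    """Sort the joint vocabulary by p_blue(w) - p_red(w), largest first."""
    p_blue = relative_frequencies(blue_tokens)
    p_red = relative_frequencies(red_tokens)
    vocab = set(p_blue) | set(p_red)
    diffs = {w: p_blue.get(w, 0.0) - p_red.get(w, 0.0) for w in vocab}
    return sorted(diffs.items(), key=lambda kv: kv[1], reverse=True), p_blue, p_red


# Toy, made-up samples (stemmed tokens in the style of the slides).
blue = "the women the right to the decis women".split()
red = "the babi the kill of the procedur babi the".split()

ranking, p_blue, p_red = rank_by_difference(blue, red)

# |difference| is bounded by the larger relative frequency, so rare
# words cannot reach either end of the ranking.
bound_holds = all(
    abs(d) <= max(p_blue.get(w, 0.0), p_red.get(w, 0.0)) + 1e-12
    for w, d in ranking
)
```

Here `ranking[0]` is the word most characteristic of the blue sample and `ranking[-1]` the word most characteristic of the red sample under this measure.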

8. Ranking by log odds-ratio. Words labeled on the plot: bankruptc, snow, ratifi, confidenti, church, schumer, chosen, voter, wage, 1974, attach, attornie, idaho, sadli, coverag, d, juri, mikulsi, tonight, necessarili, martin, peter, leg, harvest, frist, bright, anim, trade, taught, dayton, obvious, 40, industri, chines, admit, infant.
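A hedged sketch of the log odds-ratio for a single word. The function name, the add-0.5 smoothing (needed so that zero counts do not produce a log of zero), and the counts are all assumptions for illustration. The toy numbers show the familiar weakness of this measure: a rare word seen 3 times in one sample and never in the other outranks a frequent word with a much better-supported difference, which fits the mostly infrequent terms in the list above.

```python
import math


def log_odds_ratio(y_blue, y_red, n_blue, n_red, smooth=0.5):
    """log[odds of the word in blue] - log[odds of the word in red].

    y_* are the word's counts, n_* the total token counts per sample.
    `smooth` is added to every cell; 0.5 is an assumed, conventional choice.
    """
    odds_blue = (y_blue + smooth) / (n_blue - y_blue + smooth)
    odds_red = (y_red + smooth) / (n_red - y_red + smooth)
    return math.log(odds_blue) - math.log(odds_red)


# Made-up counts in two samples of 10,000 tokens each.
lor_rare = log_odds_ratio(3, 0, 10_000, 10_000)        # rare word
lor_common = log_odds_ratio(600, 400, 10_000, 10_000)  # frequent word

# The rare word gets the more extreme score despite far less evidence.
rare_outranks_common = lor_rare > lor_common
```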

9. (Move to handout: model choices)

10. Aside: warning on ignoring (language) history. Should we really write $P(v_i)$, with no conditioning on context? Previous lectures: language accommodation/coordination. Church 2000: “Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to $p/2$ than $p^2$”. COLING. “Finding a rare word like Noriega in a document is like lightning. We might not expect lightning to strike twice, but it happens all the time, especially for good keywords.”
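One way to see the effect Church describes is to split each document into a first half ("history") and a second half ("test") and compare how often a word appears in the test half overall with how often it appears there given that it already appeared in the history half. The sketch below is an illustrative approximation of that idea, not Church's exact estimator; the function name and the toy documents are assumptions.

```python
def adaptation(docs, word):
    """Return (prior, adapted) for `word` over tokenized documents.

    prior   = fraction of documents whose second half contains the word
    adapted = same fraction, restricted to documents whose first half
              already contains the word
    """
    in_hist = in_test = in_both = 0
    for tokens in docs:
        mid = len(tokens) // 2
        hist, test = set(tokens[:mid]), set(tokens[mid:])
        h, t = word in hist, word in test
        in_hist += h
        in_test += t
        in_both += h and t
    prior = in_test / len(docs)
    adapted = in_both / in_hist if in_hist else float("nan")
    return prior, adapted


# Toy, made-up documents.
docs = [
    "noriega a b noriega c d".split(),
    "a b c d".split(),
    "x y z w".split(),
    "noriega q r noriega".split(),
    "a noriega b c".split(),
]

prior, adapted = adaptation(docs, "noriega")
```

If occurrences were independent of context, `adapted` would be close to `prior`; in this toy data it is noticeably larger, which is the pattern the quote describes.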

11. Ranking by z-score of log odds-ratio, with model of variance (uninformative prior). Words labeled on the plot: women, right, woman, their, decis, famili, amend, her, senat, friend, my, choos, doctor, durbin, serv, pennsylvania, santorum, of, dr, not, partial, fact, birth, head, you, perform, born, the, mother, child, abort, kill, procedur, babi.
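A minimal sketch of this ranking, following the formulation in Monroe, Colaresi and Quinn (2008) as best understood here: add Dirichlet pseudo-counts to each word, compute the log odds-ratio, and divide by an approximate standard deviation. The function name, the uniform pseudo-count of 0.01, and the counts are assumptions for illustration, and the prior is assumed to cover the whole vocabulary.

```python
import math
from collections import Counter


def log_odds_z_scores(blue_counts, red_counts, prior):
    """z-score of the log odds-ratio for every word in `prior`.

    `prior` maps each vocabulary word to its pseudo-count alpha_w.
    """
    n_blue = sum(blue_counts.values())
    n_red = sum(red_counts.values())
    alpha_0 = sum(prior.values())
    z = {}
    for w, alpha_w in prior.items():
        yb = blue_counts.get(w, 0) + alpha_w
        yr = red_counts.get(w, 0) + alpha_w
        delta = (math.log(yb / (n_blue + alpha_0 - yb))
                 - math.log(yr / (n_red + alpha_0 - yr)))
        variance = 1.0 / yb + 1.0 / yr  # approximate variance of delta
        z[w] = delta / math.sqrt(variance)
    return z


# Made-up counts: two frequent frame words, two rare one-sided words.
blue = Counter({"the": 5000, "women": 60, "babi": 10, "idaho": 3})
red = Counter({"the": 5000, "women": 10, "babi": 60, "dayton": 3})
vocab = set(blue) | set(red)

# Assumed "uninformative" prior: the same small pseudo-count for every word.
uniform_prior = {w: 0.01 for w in vocab}

z = log_odds_z_scores(blue, red, uniform_prior)
ordered = sorted(z, key=z.get, reverse=True)
```

In this toy data the rare word `idaho` has the larger raw log odds-ratio, but its variance is large, so `women` ends up with the higher z-score. That is the behavior the variance model is meant to provide.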

12. Ranking by z-score of log odds-ratio, with model of variance (informative prior). Words labeled on the plot: women, woman, right, decis, her, doctor, durbin, choos, santorum, v, pennsylvania, pregnanc, viabil, friend, privaci, their, famili, aliv, deliv, dr, head, perform, birth, healthi, partial, child, born, mother, abort, procedur, kill, babi.
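For reference, the quantity being ranked can be written out as follows. This is a sketch based on the Monroe, Colaresi and Quinn (2008) formulation that the slides credit; the symbols are assumptions introduced here: $y_w^{(b)}$, $y_w^{(r)}$ are the counts of word $w$ in the blue and red samples, $n^{(b)}$, $n^{(r)}$ the sample sizes, $\alpha_w$ the prior pseudo-count for $w$, and $\alpha_0 = \sum_w \alpha_w$. An informative prior sets $\alpha_w$ in proportion to a background estimate $\pi_w$ of the word's frequency, rather than using the same value for every word.

```latex
% Notation is assumed for illustration; see the lead-in.
\[
\hat{\delta}_w
  = \log\frac{y_w^{(b)} + \alpha_w}{n^{(b)} + \alpha_0 - y_w^{(b)} - \alpha_w}
  - \log\frac{y_w^{(r)} + \alpha_w}{n^{(r)} + \alpha_0 - y_w^{(r)} - \alpha_w}
\]
\[
\sigma^2\!\left(\hat{\delta}_w\right)
  \approx \frac{1}{y_w^{(b)} + \alpha_w} + \frac{1}{y_w^{(r)} + \alpha_w},
\qquad
z_w = \frac{\hat{\delta}_w}{\sqrt{\sigma^2\!\left(\hat{\delta}_w\right)}}
\]
\[
\text{informative prior:}\quad
\alpha_w = \alpha_0\,\pi_w, \qquad \sum_w \pi_w = 1
\]
```

With this choice, frequent function words receive large pseudo-counts and are shrunk toward zero more strongly than under a uniform prior.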