Two Sisters Reunited After 18 Years in Checkout Counter - PowerPoint Presentation


Presentation Transcript

1. Two Sisters Reunited After 18 Years in Checkout Counter
Deer Kill 130,000
Prostitutes Appeal to Pope
Man Struck by Lightning Faces Battery Charge
Milk Drinkers Are Turning to Powder
Doctor Who Aided Bin Laden Raid In Jail
Enraged Cow Injures Farmer with Axe
Eye Drops Off Shelf
Drunk Gets Nine Months in Violin Case
Stolen Painting Found by Tree
Include Your Children When Baking Cookies
August Sales Fall for GM as Trucks Lift Chrysler
Big Rig Carrying Fruit Crashes on 210 Freeway, Creates Jam
Squad Helps Dog Bite Victims
Two Foot Ferries May Be Headed to Trinidad
Astronaut Takes Blame for Gas in Spacecraft
Reagan Wins on Budget, but More Lies Ahead
Dealers Will Hear Car Talk at Noon

2. Probability
David Kauchak
CS159 – Fall 2014

3. Admin
Assignment advice:
  - test individual components of your regex first, then put them all together
  - write test cases
Assignment deadlines
Class participation

4. Assignment 0
What do you think the average, median, min and max were for the “programming proficiency” question (1-10)?

5. Assignment 0
Mean/median: 6.5

6. Assignment 0: programming languages
Python    9
Java      8
Haskell   1
Ruby      1
Matlab    1
SML       1
C++       1

7. Regex revisited

8. Corpus statistics
www.nytimes.com, 1/25/2011

9. Why probability?
Prostitutes Appeal to Pope
Language is ambiguous.
Probability theory gives us a tool to model this ambiguity in reasonable ways.

10. Basic Probability Theory: terminology
An experiment has a set of potential outcomes, e.g., throw a die, “look at” another sentence.
The sample space of an experiment is the set of all possible outcomes, e.g., {1, 2, 3, 4, 5, 6}.
In NLP our sample spaces tend to be very large:
  - all words, bigrams, 5-grams
  - all sentences of length 20 (given a finite vocabulary)
  - all sentences
  - all parse trees over a given sentence
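To get a sense of how large these sample spaces get, here is a small Python sketch (not from the slides); the 50,000-word vocabulary is a made-up figure purely for illustration:

from itertools import product

# Small sample space: one roll of a six-sided die.
die_outcomes = list(range(1, 7))
print(len(die_outcomes))               # 6

# Three coin tosses are still easy to enumerate exhaustively.
tosses = list(product("HT", repeat=3))
print(len(tosses))                     # 8

# NLP-sized sample spaces can only be counted, not enumerated:
# e.g., all sentences of length 20 over a hypothetical 50,000-word vocabulary.
vocab_size = 50_000
sentence_length = 20
print(vocab_size ** sentence_length)   # ~9.5e93 possible word sequences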

11. Basic Probability Theory: terminology
An event is a subset of the sample space.
Dice rolls:
  - {2}
  - {3, 6}
  - even = {2, 4, 6}
  - odd = {1, 3, 5}
NLP:
  - a particular word/part of speech occurring in a sentence
  - a particular topic discussed (politics, sports)
  - a sentence with a parasitic gap
  - pick your favorite phenomena…

12. Events
We’re interested in probabilities of events:
  - p({2})
  - p(even)
  - p(odd)
  - p(parasitic gap)
  - p(first word in a sentence is “banana”)

13. Random variables
A random variable is a mapping from the sample space to a number (think events).
It represents all the possible values of something we want to measure in an experiment.
For example, the random variable X could be the number of heads for a coin tossed three times.
Really for notational convenience, since the event space can sometimes be irregular.

space   HHH  HHT  HTH  HTT  THH  THT  TTH  TTT
X        3    2    2    1    2    1    1    0

14. Random variables
We can then talk about the probability of the different values of a random variable.
The definition of probabilities over all of the possible values of a random variable defines a probability distribution.

space   HHH  HHT  HTH  HTT  THH  THT  TTH  TTT
X        3    2    2    1    2    1    1    0

X   P(X)
3   P(X=3) = ?
2   P(X=2) = ?
1   P(X=1) = ?
0   P(X=0) = ?

15. Random variables
We can then talk about the probability of the different values of a random variable.
The definition of probabilities over all of the possible values of a random variable defines a probability distribution.

space   HHH  HHT  HTH  HTT  THH  THT  TTH  TTT
X        3    2    2    1    2    1    1    0

X   P(X)
3   P(X=3) = 1/8
2   P(X=2) = 3/8
1   P(X=1) = 3/8
0   P(X=0) = 1/8
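One way to sanity-check these values (not part of the slides) is to enumerate all eight equally likely toss sequences in Python and count heads:

from collections import Counter
from itertools import product

# All 8 equally likely outcomes of three fair coin tosses.
outcomes = list(product("HT", repeat=3))

# X = number of heads in each outcome.
counts = Counter(seq.count("H") for seq in outcomes)

# P(X = k) = (# outcomes with k heads) / 8
for k in sorted(counts, reverse=True):
    print(f"P(X={k}) = {counts[k]}/{len(outcomes)}")
# P(X=3) = 1/8, P(X=2) = 3/8, P(X=1) = 3/8, P(X=0) = 1/8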

16. Probability distribution
To be explicit:
A probability distribution assigns probability values to all possible values of a random variable.
These values must be >= 0 and <= 1.
These values must sum to 1 over all possible values of the random variable.

Example (not a valid distribution: the values sum to 2):
X   P(X)
3   P(X=3) = 1/2
2   P(X=2) = 1/2
1   P(X=1) = 1/2
0   P(X=0) = 1/2

Example (not a valid distribution: contains a negative value):
X   P(X)
3   P(X=3) = -1
2   P(X=2) = 2
1   P(X=1) = 0
0   P(X=0) = 0
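A minimal sketch (not from the slides) of checking those two conditions in code; the helper name is just illustrative:

def is_valid_distribution(probs, tol=1e-9):
    """Check that all values are in [0, 1] and that they sum to 1."""
    values = list(probs.values())
    return all(0.0 <= p <= 1.0 for p in values) and abs(sum(values) - 1.0) < tol

# The fair three-coin-toss distribution from slide 15: valid.
print(is_valid_distribution({3: 1/8, 2: 3/8, 1: 3/8, 0: 1/8}))   # True

# The two example tables on this slide: both invalid.
print(is_valid_distribution({3: 1/2, 2: 1/2, 1: 1/2, 0: 1/2}))   # False (sums to 2)
print(is_valid_distribution({3: -1, 2: 2, 1: 0, 0: 0}))          # False (negative value)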

17. Unconditional/prior probability
The simplest form of probability distribution is P(X).
Prior probability: without any additional information:
  - What is the probability of heads on a coin toss?
  - What is the probability of a sentence containing a pronoun?
  - What is the probability of a sentence containing the word “banana”?
  - What is the probability of a document discussing politics?
  - …

18. Prior probability
What is the probability of getting HHH for three coin tosses, assuming a fair coin? 1/8
What is the probability of getting THT for three coin tosses, assuming a fair coin? 1/8

19. Joint distribution
We can also talk about probability distributions over multiple variables.
P(X,Y): probability of X and Y
  - a distribution over the cross product of possible values

NLPPass   P(NLPPass)
true      0.89
false     0.11

EngPass   P(EngPass)
true      0.92
false     0.08

NLPPass, EngPass   P(NLPPass, EngPass)
true, true         0.88
true, false        0.01
false, true        0.04
false, false       0.07

20. Joint distribution
Still a probability distribution:
  - all values between 0 and 1, inclusive
  - all values sum to 1
All questions/probabilities of the two variables can be calculated from the joint distribution.

NLPPass, EngPass   P(NLPPass, EngPass)
true, true         0.88
true, false        0.01
false, true        0.04
false, false       0.07

What is P(EngPass)?

21. Joint distribution
Still a probability distribution:
  - all values between 0 and 1, inclusive
  - all values sum to 1
All questions/probabilities of the two variables can be calculated from the joint distribution.

NLPPass, EngPass   P(NLPPass, EngPass)
true, true         0.88
true, false        0.01
false, true        0.04
false, false       0.07

How did you figure that out? P(EngPass = true) = 0.92

22. Joint distribution

NLPPass, EngPass   P(NLPPass, EngPass)
true, true         0.88
true, false        0.01
false, true        0.04
false, false       0.07

This is called “marginalization”, aka summing over a variable.
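In symbols, marginalization here is P(EngPass) = sum over NLPPass of P(NLPPass, EngPass). A minimal Python sketch of that sum, using the numbers from the slide:

# Joint distribution from slide 19, keyed by (NLPPass, EngPass).
joint = {
    (True, True):   0.88,
    (True, False):  0.01,
    (False, True):  0.04,
    (False, False): 0.07,
}

def marginal_engpass(value):
    """P(EngPass = value) = sum over NLPPass of P(NLPPass, EngPass = value)."""
    return sum(p for (nlp, eng), p in joint.items() if eng == value)

print(round(marginal_engpass(True), 2))   # 0.88 + 0.04 = 0.92
print(round(marginal_engpass(False), 2))  # 0.01 + 0.07 = 0.08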

23. Conditional probability
As we learn more information, we can update our probability distribution.
P(X|Y) models this (read “probability of X given Y”):
  - What is the probability of heads given that both sides of the coin are heads?
  - What is the probability the document is about politics, given that it contains the word “Clinton”?
  - What is the probability of the word “banana” given that the sentence also contains the word “split”?
Notice that it is still a distribution over the values of X.

24. Conditional probability
(diagram: overlapping events x and y)
In terms of prior and joint distributions, what is the conditional probability distribution?

25. Conditional probability
Given that y has happened, in what proportion of those events does x also happen?
(diagram: overlapping events x and y)
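The defining formula isn't included in the transcript; presumably it is the standard definition, which reads off directly from that description:

p(x | y) = p(x, y) / p(y)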

26. Conditional probability
Given that y has happened, in what proportion of those events does x also happen?
What is p(NLPPass=true | EngPass=false)?

NLPPass, EngPass   P(NLPPass, EngPass)
true, true         0.88
true, false        0.01
false, true        0.04
false, false       0.07

(diagram: overlapping events x and y)

27. Conditional probability
What is p(NLPPass=true | EngPass=false)?

NLPPass, EngPass   P(NLPPass, EngPass)
true, true         0.88
true, false        0.01
false, true        0.04
false, false       0.07

p(NLPPass=true | EngPass=false) = 0.01 / (0.01 + 0.07) = 0.125
Notice this is very different from p(NLPPass=true) = 0.89.
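A small sketch of that computation directly from the joint table (values from slide 19; the helper name is just illustrative):

# Joint distribution from slide 19, keyed by (NLPPass, EngPass).
joint = {
    (True, True):   0.88,
    (True, False):  0.01,
    (False, True):  0.04,
    (False, False): 0.07,
}

def p_nlp_given_eng(nlp, eng):
    """p(NLPPass = nlp | EngPass = eng) = p(nlp, eng) / p(eng)."""
    p_eng = sum(p for (n, e), p in joint.items() if e == eng)
    return joint[(nlp, eng)] / p_eng

print(round(p_nlp_given_eng(True, False), 3))  # 0.125
print(round(p_nlp_given_eng(True, True), 3))   # 0.957 -- compare p(NLPPass=true) = 0.89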

28. A note about notation
When talking about a particular assignment, you should technically write p(X=x), etc.
However, when it’s clear, we’ll often shorten it.
We may also say P(X) or p(x) to generically mean any particular value, i.e., P(X=x).

29. Properties of probabilities
P(A or B) = ?

30. Properties of probabilities
P(A or B) = P(A) + P(B) - P(A,B)

31. Properties of probabilities
P(¬E) = 1 – P(E)
More generally, given events E = e1, e2, …, en
P(E1, E2) ≤ P(E1)

32. Chain rule (aka product rule)
We can view calculating the probability of X AND Y occurring as two steps:
  - Y occurs with some probability P(Y)
  - then X occurs, given that Y has occurred
…or you can just trust the math.

33. Chain rule
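The formula on this slide isn't in the transcript; the standard statement of the chain rule, matching the two-step view above, is:

P(X, Y) = P(Y) P(X|Y) = P(X) P(Y|X)

and, for more than two variables:

P(X1, X2, …, Xn) = P(X1) P(X2|X1) P(X3|X1, X2) … P(Xn|X1, …, Xn-1)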

34. Applications of the chain rule
We saw that we could calculate the individual prior probabilities using the joint distribution.
What if we don’t have the joint distribution, but do have conditional probability information:
  - P(Y)
  - P(X|Y)
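A sketch of that direction of the chain rule in Python; the conditional values below are derived from the slide-19 joint table purely to show that the round trip recovers the original numbers:

# Prior P(EngPass) and conditional P(NLPPass | EngPass).
p_eng = {True: 0.92, False: 0.08}
p_nlp_given_eng = {
    (True, True):   0.88 / 0.92,
    (False, True):  0.04 / 0.92,
    (True, False):  0.125,
    (False, False): 0.875,
}

# Chain rule: P(NLPPass, EngPass) = P(EngPass) * P(NLPPass | EngPass)
joint = {(nlp, eng): p_eng[eng] * p
         for (nlp, eng), p in p_nlp_given_eng.items()}

for key, p in joint.items():
    print(key, round(p, 2))   # 0.88, 0.04, 0.01, 0.07 respectively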

35. Bayes’ rule (theorem)
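The formula isn't in the transcript; Bayes' rule, which follows from writing the chain rule in both orders and dividing, is:

P(Y|X) = P(X|Y) P(Y) / P(X)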

36. Bayes’ rule
Allows us to talk about P(Y|X) rather than P(X|Y).
Sometimes this can be more intuitive.
Why?

37. Bayes’ rule
p(disease | symptoms)
How would you estimate this?
Find a bunch of people with those symptoms and see how many have the disease.
Is this feasible?

38. Bayes’ rule
p(disease | symptoms)
p(symptoms | disease)
How would you estimate this?
Find a bunch of people with the disease and see how many have this set of symptoms. Much easier!

39. Bayes’ rule
p(linguistic phenomena | features)
  - For all examples that had those features, how many had that phenomenon?
p(features | linguistic phenomena)
  - For all the examples with that phenomenon, how many had this feature?
p(cause | effect) vs. p(effect | cause)

40. Gaps
I just won’t put these away.            (V: “put”; direct object: “these”)
These, I just won’t put ____ away.      (filler: “These”; gap: ____)

41. Gaps
What did you put ____ away?             (gap)
The socks that I put ____ away.         (gap)

42. Gaps
Whose socks did you fold ____ and put ____ away?    (gap, gap)
Whose socks did you fold ____?                       (gap)
Whose socks did you put ____ away?                   (gap)

43. Parasitic gaps
These I’ll put ____ away without folding ____.   (gap, gap)
These … without folding ____.                    (gap)
These I’ll put ____ away.                        (gap)

44. Parasitic gaps
These I’ll put ____ away without folding ____.          (gap, gap)
1. Cannot exist by themselves (parasitic):
   These I’ll put my pants away without folding ____.   (gap)
2. They’re optional:
   These I’ll put ____ away without folding them.       (gap)

45. Parasitic gaps
http://literalminded.wordpress.com/2009/02/10/dougs-parasitic-gap/

46. Frequency of parasitic gaps
Parasitic gaps occur on average in 1/100,000 sentences.
Problem:
Laura Linguist has developed a complicated set of regular expressions to try and identify parasitic gaps. If a sentence has a parasitic gap, it correctly identifies it 95% of the time. If it doesn’t, it will incorrectly say it does with probability 0.005. Suppose we run it on a sentence and the algorithm says it is a parasitic gap. What is the probability it actually is?

47. Prob of parasitic gaps
Laura Linguist has developed a complicated set of regular expressions to try and identify parasitic gaps. If a sentence has a parasitic gap, it correctly identifies it 95% of the time. If it doesn’t, it will incorrectly say it does with probability 0.005. Suppose we run it on a sentence and the algorithm says it is a parasitic gap. What is the probability it actually is?
G = gap
T = test positive
What question do we want to ask?

48–50. Prob of parasitic gaps
G = gap
T = test positive
We want P(G | T): given that the test says positive, what is the probability the sentence actually has a parasitic gap?
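The worked calculation on these slides isn't in the transcript, but applying Bayes' rule to the numbers in the problem statement gives it; a minimal Python sketch:

# Numbers from the problem statement.
p_gap = 1 / 100_000         # P(G): prior probability that a sentence has a parasitic gap
p_pos_given_gap = 0.95      # P(T | G): detector fires when there is a gap
p_pos_given_nogap = 0.005   # P(T | not G): false-positive rate

# P(T) by marginalization over G, then Bayes' rule for P(G | T).
p_pos = p_pos_given_gap * p_gap + p_pos_given_nogap * (1 - p_gap)
p_gap_given_pos = p_pos_given_gap * p_gap / p_pos

print(round(p_gap_given_pos, 4))   # ~0.0019

So even though the detector catches 95% of true parasitic gaps and raises false alarms only 0.5% of the time, a positive result still means less than a 0.2% chance of an actual parasitic gap, because the prior (1 in 100,000) is so small.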