
Learning to Laugh – By: Danielle Tabashi



Presentation Transcript

Learning to laugh. By: Danielle Tabashi. Based on the article: LEARNING TO LAUGH (AUTOMATICALLY): COMPUTATIONAL MODELS FOR HUMOR RECOGNITION by RADA MIHALCEA and CARLO STRAPPARAVA.

What makes us laugh?

What makes us laugh? Human-centric vocabulary. Freud and Minsky: "laughter is often provoked by feelings of frustration caused by our own, sometimes awkward, behavior". ~25% of the jokes in the collection include the word "you", and ~15% include the word "I".

What makes us laugh? Negation and negative orientation. ~20% of the jokes in the collection contain a negation: can't, don't, isn't, and so on. Jokes also use words with negative connotations, such as bad or failure. For example: "Money can't buy you friends, but you do get a better class of enemy."

What makes us laugh? Professional communities. Many jokes seem to target professional communities. For example: "It was so cold last winter that I saw a lawyer with his hands in his own pockets."

What makes us laugh? Human "weakness". Events or entities that are associated with "weak" human moments. For example: "If you can't drink and drive, then why do bars have parking lots?"

What is a one-liner? A short sentence with comic effect and an interesting linguistic structure. Interesting linguistic structure means: simple syntax, the use of rhetoric devices, and frequent use of creative language constructions meant to attract the reader's attention. Example: Take my advice; I don't use it anyway.

So... how can we teach the computer to recognize humor?

The main idea: train computational models on humorous and non-humorous examples.

Overview: 1) Humorous and non-humorous data-sets. 2) Automatic humor recognition. 3) Experimental results.

Humorous and non-humorous data-sets

Step 1: Build the humorous data-set

What is bootstrapping? A process which expands a data-set. We use it when one data-set is much smaller than the other; otherwise, the learning will be biased. Input: a few examples given by the user. Output: a bigger set of examples.

Web bootstrapping. Start from a small group of manually identified one-liners. Use a web search engine to find web pages that include at least one of the one-liners; parse their HTML to find more one-liners; add the new one-liners to the group; and repeat with the enlarged group.

Constraints. Problem: the process can add noisy examples. Solution: apply constraints: a thematic constraint (joke-related keywords must appear in the URL) and a structural constraint (candidates must occur in an HTML list or in adjacent paragraphs). A sketch of the constrained loop follows below.
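Below is a minimal sketch of the constrained bootstrapping loop, assuming hypothetical `web_search` and `extract_candidates` helpers (the paper does not publish code, so the search API, the parser, and the URL keyword list are placeholders):

```python
# Hypothetical sketch of the web bootstrapping loop with both constraints.
def web_search(query):
    """Placeholder for a search-engine API; should yield (url, html) pairs."""
    return []

def extract_candidates(html):
    """Placeholder for an HTML parser that pulls short lines out of lists
    and adjacent paragraphs (the structural constraint)."""
    return set()

def bootstrap(seed_one_liners, iterations=3):
    one_liners = set(seed_one_liners)
    for _ in range(iterations):
        new_items = set()
        for line in one_liners:
            for url, html in web_search(line):
                # Thematic constraint: keep only pages whose URL contains
                # a joke-related keyword (keyword list is illustrative).
                if not any(k in url.lower() for k in ("joke", "humor", "funny")):
                    continue
                new_items |= extract_candidates(html)
        one_liners |= new_items
    return one_liners
```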

Step 2: Build the non-humorous data-set

The problem. We want non-humorous examples which have structure and vocabulary similar to the humorous examples, so that the classifier learns genuinely humor-specific features rather than superficial differences.

4 sets of negative examples: Reuters titles, taken from news articles – short sentences which are phrased to catch the reader's attention.

4 sets of negative examples: Proverbs – short sentences that transmit important facts or true experience. For example: Beauty is in the eye of the beholder.

4 sets of negative examples: British National Corpus (BNC) sentences – sentences selected to be similar in content to the one-liners. For example: I wonder if there are some contradiction here.

4 sets of negative examples: Open Mind Common Sense (OMCS) sentences – explanations and assertions. The comic effect of jokes is often based on statements that break our commonsensical understanding of the world. For example: A file is used for keeping documents.

Build the data-set: summary. Positive examples: a few one-liners, expanded via web-based bootstrapping into a lot of one-liners. Negative examples: Reuters titles, BNC sentences, OMCS sentences, and proverbs.

Overview: 1) Humorous and non-humorous data-sets. 2) Automatic humor recognition. 3) Experimental results.

Automatic classification. Two approaches: humor-specific stylistic features (alliteration, antonymy, adult slang), and content-based learning (Naïve Bayes, support vector machines).

Humor-specific stylistic features. Linguistic theories of humor have suggested many stylistic features that characterize humorous text, such as alliteration, antonymy, and adult slang.

Alliteration. For example: Infants don't enjoy infancy like adults do adultery. The algorithm steps: 1) Convert the sentence to its phonetic transcription using the CMU Pronouncing Dictionary; in our example, infants – infancy becomes IH1 N F AH0 N T S – IH1 N F AH0 N S IY0. 2) Find the longest matching phonetic prefix chains. 3) Count the matches.
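A minimal sketch of this matching, using the `pronouncing` package as an interface to the CMU Pronouncing Dictionary; the exact chain-matching procedure in the paper may differ:

```python
# Approximate alliteration detector: count word pairs that share a phonetic
# prefix, looked up in the CMU Pronouncing Dictionary via `pronouncing`.
import pronouncing

def phones(word):
    entries = pronouncing.phones_for_word(word.lower())
    return entries[0].split() if entries else []

def shared_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def alliteration_score(sentence, min_match=3):
    words = [w.strip(".,;!?").lower() for w in sentence.split()]
    score = 0
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            if shared_prefix_len(phones(w1), phones(w2)) >= min_match:
                score += 1
    return score

# Finds the infants-infancy and adults-adultery chains from the example.
print(alliteration_score("Infants don't enjoy infancy like adults do adultery"))
```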

Antonymy. For example: Always try to be modest, and be proud of it! Use the WordNet resource to recognize antonyms in a sentence; specifically, use the antonymy relation among nouns, verbs, adjectives, and adverbs.
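A minimal sketch using NLTK's WordNet interface (requires `nltk.download('wordnet')`). It checks direct antonym links only; catching the modest/proud pair above needs the indirect links (e.g. via WordNet's similar-to relation) that the paper also exploits:

```python
# Detect directly antonymous word pairs in a sentence with NLTK's WordNet.
from itertools import combinations
from nltk.corpus import wordnet as wn

def antonym_pairs(sentence):
    words = {w.strip(".,;!?").lower() for w in sentence.split()}
    pairs = set()
    for w1, w2 in combinations(words, 2):
        for syn in wn.synsets(w1):
            for lemma in syn.lemmas():
                if any(ant.name() == w2 for ant in lemma.antonyms()):
                    pairs.add((w1, w2))
    return pairs

print(antonym_pairs("It is hot outside but the water stays cold"))  # hot/cold (pair order may vary)
```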

Adult slang. Search for sexually oriented lexicon in the sentence. Use WordNet Domains: extract all the synonym sets labeled with the domain SEXUALITY.
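A rough sketch of this lookup, assuming a local copy of the WordNet Domains mapping file (one synset per line: offset-pos, a tab, then domain labels). Note the released mapping is aligned to an older WordNet version, so offsets may need conversion before matching NLTK's WordNet:

```python
# Flag sentences containing words whose synsets carry the SEXUALITY domain,
# using a WordNet Domains mapping file (path and alignment are assumptions).
from nltk.corpus import wordnet as wn

def load_domain_synsets(path, domain="sexuality"):
    ids = set()
    with open(path) as f:
        for line in f:
            synset_id, domains = line.rstrip("\n").split("\t")
            if domain in domains.split():
                ids.add(synset_id)
    return ids

def has_adult_slang(sentence, domain_ids):
    for word in sentence.lower().split():
        for syn in wn.synsets(word.strip(".,;!?")):
            if f"{syn.offset():08d}-{syn.pos()}" in domain_ids:
                return True
    return False
```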

Automatic classification. Two approaches: humor-specific stylistic features (alliteration, antonymy, adult slang), and content-based learning (Naïve Bayes, support vector machines).

Content-based learning. Another way to recognize humor is to use traditional text classification: we give the algorithm labeled examples so it can learn to classify unlabeled examples. In this case, we use two algorithms: Naïve Bayes and support vector machines.

Naïve Bayes. The main idea is to estimate the probability of a category given a document, using joint probabilities of words and documents and assuming word independence. Build a probability table from the training set, then predict the category of a new document according to that table.

Example – Naïve Bayes. Training set: 1) "I love you" → +; 2) "I hate you" → −; 3) "I love Ben" → +.
Word counts per sentence (columns: I, love, you, hate, Ben):
sentence 1: I=1, love=1, you=1 (label +)
sentence 2: I=1, you=1, hate=1 (label −)
sentence 3: I=1, love=1, Ben=1 (label +)
From the counts: P(+) = 2/3, P(−) = 1/3, and the word likelihoods P(I|+), P(I|−), ..., P(Ben|+), P(Ben|−) are estimated analogously.
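A minimal sketch reproducing these toy estimates (maximum-likelihood counts, no smoothing; a real classifier would smooth to avoid zero probabilities):

```python
# Toy Naive Bayes over the three-sentence example above.
from collections import Counter, defaultdict

train = [("I love you", "+"), ("I hate you", "-"), ("I love Ben", "+")]

label_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)
for text, label in train:
    word_counts[label].update(text.lower().split())

def prior(label):
    return label_counts[label] / len(train)

def likelihood(word, label):
    return word_counts[label][word] / sum(word_counts[label].values())

def classify(text):
    # Pick the label maximizing P(label) * product of P(word | label).
    def score(label):
        p = prior(label)
        for w in text.lower().split():
            p *= likelihood(w, label)
        return p
    return max(label_counts, key=score)

print(prior("+"), prior("-"))   # 2/3 and 1/3, as on the slide
print(likelihood("love", "+"))  # 2/6: "love" occurs twice among six '+' tokens
print(classify("I love Ben"))   # '+'
```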

Support vector machine. Each data point is a vector in p-dimensional space. An SVM is a binary classifier: it finds the hyperplane that best separates the set of positive examples from the set of negative examples.
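As an illustration (not the authors' setup), a linear SVM text classifier can be sketched in scikit-learn; the training sentences below are drawn from examples on earlier slides:

```python
# Linear SVM over bag-of-words features, as a stand-in for the paper's SVM.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "Take my advice; I don't use it anyway.",                                # one-liner
    "Money can't buy you friends, but you do get a better class of enemy.",  # one-liner
    "Beauty is in the eye of the beholder.",                                 # proverb
    "A file is used for keeping documents.",                                 # OMCS
]
labels = [1, 1, 0, 0]  # 1 = humorous, 0 = non-humorous

clf = make_pipeline(CountVectorizer(), LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["If you can't drink and drive, why do bars have parking lots?"]))
```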

Overview: 1) Humorous and non-humorous data-sets. 2) Automatic humor recognition. 3) Experimental results.

Experimental results: 1) A heuristic using humor-specific features. 2) Text classification with content features. 3) Combining stylistic and content features.

Heuristic using humor-specific features. We use thresholds that are learned automatically, using a decision tree trained on a small subset of the data. These thresholds are the only parameters required to classify a statement as humorous or non-humorous. We then evaluate the model.

Decision tree. Build a decision tree based on a small training set. [Diagram: an illustrative weather tree that branches on conditions such as "Rainy?" and "Temp > 20" / "Temp > 25" to predict spring vs. winter.]
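A hedged sketch of how such thresholds could be learned over the three stylistic feature counts; the feature rows are invented for illustration, not taken from the paper's data:

```python
# Learn humorous/non-humorous thresholds over stylistic feature counts
# with a shallow decision tree (toy data, illustrative only).
from sklearn.tree import DecisionTreeClassifier, export_text

# Each row: [alliteration score, antonym pairs, adult-slang hits]
X = [[2, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 0, 0]]
y = [1, 1, 1, 0, 0, 0]  # 1 = humorous, 0 = non-humorous

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["alliteration", "antonymy", "adult_slang"]))
```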

Heuristic using humor-specific features (cont.). The style of the Reuters titles is the most different with respect to the one-liners, while the style of the proverbs is the most similar. The alliteration feature is the most useful indicator of humor.

Experimental results: 1) A heuristic using humor-specific features. 2) Text classification with content features. 3) Combining stylistic and content features.

Text classification with content features. In terms of content, the Reuters titles are the most different with respect to the one-liners, and the BNC sentences are the most similar.

Experimental results: 1) A heuristic using humor-specific features. 2) Text classification with content features. 3) Combining stylistic and content features.

Combining stylistic and content features. The results: no improvement for the proverbs and OMCS sets because, as we have just seen, these statements cannot be clearly differentiated from one-liners using stylistic features.

Difficulties: 1) Word similarities in different semantic spaces. 2) Where computers fail.

Word similarities in different semantic spaces. We can see that the BNC seems to be the most "neutral" in its suggestions.

Word similarities in different semantic spaces. The one-liners do not seem to recognize real beauty.

Word similarities in different semantic spaces. The proverbs suggest that beauty vanishes soon, which may reflect the educational purpose of proverbial sayings.

Word similarities in different semantic spaces The OMCS gives words that are related to feminine beauty.

Word similarities in different semantic spaces. Reuters suggests that beauty is related to achieving important economic targets.

Word similarities in different semantic spaces. The one-liners are very similar to the commonsense (OMCS) and BNC sentences, and very different from the Reuters titles. This is in agreement with the content-based classification results: humor tends to be similar to regular text.
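To illustrate the idea of corpus-specific semantic spaces (the paper's own similarity method differs), one could train a small word-embedding model per corpus and compare the nearest neighbors of "beauty"; a sketch with gensim, on a toy corpus:

```python
# Per-corpus word neighborhoods via word2vec (gensim); illustrative only,
# not the similarity measure used in the paper.
from gensim.models import Word2Vec

def neighbors(corpus_sentences, word="beauty", topn=5):
    tokenized = [s.lower().split() for s in corpus_sentences]
    model = Word2Vec(tokenized, vector_size=50, window=5, min_count=1, seed=0)
    return model.wv.most_similar(word, topn=topn)

toy_proverbs = [
    "beauty is in the eye of the beholder",
    "beauty fades but character endures",
]
print(neighbors(toy_proverbs, topn=3))  # meaningful only with a real corpus
```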

Discussion: 1) Word similarities in different semantic spaces. 2) Where computers fail.

Where computers fail: Irony. Irony accounts for the humorous effect in more than 50% of the sample. The irony is targeted at the speaker, the dialogue partner, or an entire professional community.

Where computers fail: Ambiguity. Ambiguity accounts for the humorous effect in more than 20% of the sample: word ambiguity and the corresponding potential for misinterpretation. For example: "Change is inevitable, except from a vending machine."

Where computers fail: Incongruity. For example: "A diplomat is someone who can tell you to go to hell in such a way that you will look forward to the trip." The comic effect cannot be recognized via WordNet and similar resources; we need corpus-based approaches to incongruity detection.

Where computers fail: Idiomatic expressions. For example: "I used to have an open mind, but my brains kept falling out." The idiom 'open mind' means receptive to new ideas, while the joke reads 'open' literally, as in exposed.

Where computers fail: Commonsense knowledge. For example: "I like kids, but I don't think I could eat a whole one" vs. "I like chickens, but I don't think I could eat a whole one." This one-liner is based on the commonsense knowledge that one does not eat kids.