Intonation and Computation: Emotion
Julia Hirschberg
LSA 2017
julia@cs.columbia.edu
Announcement in Canvas about experimental procedures
Has everyone selected their article for presentation?
Discussion questions?
Any recordings?
Emotion and Speaker State
A speaker's emotional state provides important and useful information:
- To recognize (e.g. anger/frustration in IVR systems)
- To generate (e.g. any emotion for games)
Many studies have shown that acoustic information helps convey/identify 'classic' emotions (anger, happiness, ...) with some accuracy.
Similar approaches have been taken to recognize other speaker states:
- Confidence vs. uncertainty in tutoring systems
- Medical diagnosis, e.g. depression, schizophrenia
- Deception
Interspeech Paralinguistic Challenges (2009--)
2009: Emotion (5)
2010: Age; Gender; Degree of Interest
2011: Intoxication; Sleepiness
2012: Personality; Likability; Pathology
2013: Social Signals (laughs, fillers); Degree of Conflict; Emotion (18); Autism
2014: Cognitive Load; Physical Load
2015: Degree of Nativeness; Parkinson's Condition; Eating Condition
Basic Emotions
6 basic emotions (Ekman): happiness, sadness, fear, anger, surprise & disgust
- Universal and easily recognized
Some disagreement across the field as to the number and types of emotions
A recent survey by Ekman shows greatest consensus across 5 universal emotions
Ekman's Atlas of Emotions (an interactive guide to human emotions)
- For each emotion, there is a subset of emotional states, triggers, actions and moods
Plutchik's Wheel
Play Some Recorded Examples
Theories of Emotion
Emotions are a mix of:
- physiological activation
- expressive behaviors
- conscious experience
Emotions and the Autonomic Nervous System
During an emotional experience, our autonomic nervous system mobilizes energy in the body that arouses us.
James-Lange Theory
William James and Carl Lange proposed a theory opposed to the common-sense view.
The James-Lange Theory: physiological activity precedes the emotional experience.
Cannon-Bard Theory
Walter Cannon and Phillip Bard questioned the James-Lange Theory.
The Cannon-Bard Theory: both the emotion and the body's arousal take place simultaneously.
Two-Factor Theory
Stanley Schachter and Jerome Singer proposed yet another theory.
The Schachter-Singer Theory: our physiology and cognitions together create emotions.
Emotions have two factors: physical arousal and a cognitive label.
Physiology of Emotion
Acted Speech: LDC Emotional Speech Corpus
happy, sad, angry, confident, frustrated, friendly, interested, anxious, bored, encouraging
Natural Speech (thanks to Liz Shriberg, SRI)
Neutral: "July 30", "Yes"
Disappointed/tired: "No"
Amused/surprised: "No"
Annoyed: "Yes", "Late morning"
Frustrated: "Yes", "No", "No, I am ...", "...no Manila..."
Synthetic Affective Speech
Acapella (2013): Neutral, Happy, Sad, Bad guy, Laughter
Greg Beller (2009)
Shiva Sundaram (2007)
Elicited Emotion
"A framework for eliciting emotional speech: capitalizing on the actor's process" (Enos & Hirschberg, 2006)
How do real actors produce emotions?
- Scenario approach
- Script approach
Decisions in Emotion Recognition
What kind of data should we use?
- Acted vs. natural vs. elicited corpora
What can we classify?
- "Classic" emotions
- Additional emotions
- Valence and activation
What features best predict emotions?
What techniques are best for classification?
What Features Are Useful in Emotion Classification?
Mel frequency cepstral coefficients (MFCCs)
- Represent the audio spectrum over the range of human hearing (20 Hz-20 kHz), broken into frequency bands
Different representations of prosodic features
- Direct modeling via acoustic correlates (pitch, intensity, rate, voice quality) useful for activation
  - 62% average baseline, 75% average accuracy
- Symbolic representations (e.g. of prosody) better for valence
  - Final plateau contour correlated with negative emotions, final fall with positive
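Direct modeling of acoustic correlates can be sketched as a small feature extractor that collapses frame-level pitch and intensity tracks into utterance-level statistics. This is a minimal pure-Python illustration; the particular feature names and the convention that unvoiced frames carry F0 = 0 are assumptions, not the feature set of any specific paper.

```python
def prosodic_features(f0, intensity):
    """Summarize frame-level pitch (Hz) and intensity (dB) tracks
    into utterance-level statistics (illustrative feature set)."""
    voiced = [f for f in f0 if f > 0]  # drop unvoiced frames (F0 == 0)
    return {
        "f0_mean":  sum(voiced) / len(voiced),
        "f0_max":   max(voiced),
        "f0_range": max(voiced) - min(voiced),
        "int_mean": sum(intensity) / len(intensity),
        "int_max":  max(intensity),
    }

# toy tracks: 5 frames of F0 (two unvoiced) and intensity
f = prosodic_features([0, 110, 120, 130, 0], [55, 60, 65, 62, 58])
```

In real systems these statistics are computed per turn (or per breath group) and fed to a classifier alongside rate and voice-quality measures.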
Distinguishing One Emotion from the Rest: Direct Modeling (Liscombe et al. 2003)

Emotion       Baseline   Accuracy
angry          69.32%     77.27%
confident      75.00%     75.00%
happy          57.39%     80.11%
interested     69.89%     74.43%
encouraging    52.27%     72.73%
sad            61.93%     80.11%
anxious        55.68%     71.59%
bored          66.48%     78.98%
friendly       59.09%     73.86%
frustrated     59.09%     73.86%
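In one-vs-rest setups like the table above, the reported baseline is typically the majority-class rate of each binary task: always guessing the more frequent of "target" vs. "rest". A minimal sketch, assuming that is how the Baseline column was computed:

```python
def one_vs_rest_baseline(labels, target):
    """Majority-class accuracy for a binary target-vs-rest task:
    the accuracy of always guessing the larger class."""
    p = sum(1 for lab in labels if lab == target) / len(labels)
    return max(p, 1 - p)

# toy label set: 3 "happy" turns, 7 others -> majority class is "rest"
b = one_vs_rest_baseline(["happy"] * 3 + ["sad"] * 7, "happy")
```

This explains why the baseline differs per emotion: each emotion occurs with a different frequency in the corpus.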
Classifying Subject Ratings of Emotional Speech Using Acoustic Features
F0 Differentiates Activation
But Not Valence
Can We Identify Emotions in Natural Speech?
AT&T's "How May I Help You?" system (Liscombe, Riccardi & Hakkani-Tür, 2005)
- Customers are sometimes angry, frustrated
The Data
- 5690 operator/caller dialogues with 20,013 caller turns
- Labeled for degrees of anger, frustration, negativity, collapsed to positive vs. frustrated vs. angry
Features:
- Acoustic/prosodic (direct modeling)
- Dialogue acts
- Lexical
- Context for each of the above
Direct Modeling of Prosodic Features in Context
Derived Emotions: Confidence vs. Uncertainty
The ITSpoke Corpus: physics tutoring (Liscombe, Hirschberg & Venditti 2005)
- Collected at U. Pittsburgh by Diane Litman and students
- 17 students, 1 tutor; 130 human/human dialogues
- ~7000 student turns (mean length ≈ 2.5 sec)
- Hand labeled for confidence, uncertainty, anger, frustration
A Certain Example
An Uncertain Example
What Features Signal Confidence vs. Uncertainty?
um <sigh> I don’t even think I have an idea here ...... now .. mass isn’t weight ...... mass is ................ the .......... space that an object takes up ........ is that mass?
[71-67-1:92-113]
Classifying Uncertainty
Human-Human Corpus
AdaBoost (C4.5) machine learning algorithm with 90/10 cross-validation
Classes: Uncertain vs. Certain vs. Neutral
Results:

Features            Accuracy
Baseline              66%
Acoustic-prosodic     75%
+ contextual          76%
+ breath-groups       77%
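The 90/10 evaluation above corresponds to 10-fold cross-validation: ten train/test splits, each holding out 10% of the turns. A minimal index-generation sketch (the classifier itself, AdaBoost over C4.5-style trees, is omitted):

```python
def kfold_indices(n, k=10):
    """Yield (train, test) index lists for k-fold cross-validation;
    with k=10 this gives the 90/10 splits used on the slide."""
    fold = n // k
    idx = list(range(n))
    for i in range(k):
        # last fold absorbs any remainder when n is not divisible by k
        test = idx[i * fold:(i + 1) * fold] if i < k - 1 else idx[i * fold:]
        held_out = set(test)
        train = [j for j in idx if j not in held_out]
        yield train, test

splits = list(kfold_indices(20, k=10))
```

Shuffling the indices before splitting (and stratifying by class) is standard in practice; it is left out here for brevity.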
Incremental Emotion Recognition from Speech (Mishra & Dimitriadis, 2013)
- Real-time emotion recognition using a sliding window
- 3 information streams: cepstral, prosodic, textual
- HMIHY AT&T system; positive/neutral vs. upset
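The sliding-window idea can be sketched as follows; the window and step sizes and the stand-in classifier are illustrative, not the configuration used by Mishra & Dimitriadis.

```python
def incremental_predict(frames, window, step, classify):
    """Run a classifier over a sliding window of frame-level
    features, emitting one label per window position."""
    out = []
    for start in range(0, len(frames) - window + 1, step):
        out.append(classify(frames[start:start + window]))
    return out

# toy stand-in: call a window "upset" when mean frame energy is high
labels = incremental_predict(
    [0.1, 0.2, 0.9, 1.0, 0.95, 0.1], window=2, step=2,
    classify=lambda c: "upset" if sum(c) / len(c) > 0.5 else "neutral")
```

The point of the incremental setup is that labels become available while the caller is still speaking, instead of only after the turn ends.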
Emotion Recognition from Text
Data:
- Social media
- Email
- Reviews
Lexicon-based
Use one or several lexical resources
- Keyword-based (e.g. WordNet-Affect)
- Lexicon-based (e.g. EmotiNet)
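A keyword-based lookup in the spirit of WordNet-Affect can be sketched in a few lines. The mini-lexicon here is invented for illustration and is not taken from any real resource.

```python
# toy keyword lexicon (entries invented, not from WordNet-Affect)
LEXICON = {
    "furious": "anger", "rage": "anger",
    "delighted": "joy", "smile": "joy",
    "weep": "sadness", "gloomy": "sadness",
}

def keyword_emotion(text):
    """Count lexicon hits per emotion and return the most frequent
    one (None when no keyword matches)."""
    counts = {}
    for tok in text.lower().split():
        emo = LEXICON.get(tok.strip(".,!?"))
        if emo:
            counts[emo] = counts.get(emo, 0) + 1
    return max(counts, key=counts.get) if counts else None

e = keyword_emotion("She was delighted, all smile and no rage")
```

Real keyword approaches add stemming, negation handling, and much larger lexicons; the core lookup-and-count logic is the same.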
Machine Learning Approaches
Supervised learning
- Requires a large labeled corpus (can we automatically assign emotion labels?)
- Better results, but domain specific
Unsupervised learning
- No labeled data; possibly ontologies
- Flexible: can classify beyond basic emotions
- Lower performance (generally)
Emotions in Text: LiveJournal and WordsEye Image Descriptions
Task: classify Ekman's 6: happiness, sadness, anger, surprise, fear, and disgust
Corpora:
- 300K LiveJournal posts, labeled by authors
- 660 WordsEye image descriptions, labeled via AMT
30K+ features: TF-IDF of bigrams and emoticons, frequency features, syntactic features, lexical features, sentiment scores (LIWC, SentiWordNet, DAL) and more
Results of 6-way classification: 40% accuracy on LiveJournal posts, 63% on image descriptions
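TF-IDF over bigram features, as listed in the feature set above, can be computed directly. This is a minimal smoothing-free variant; real pipelines add normalization and smoothing.

```python
import math

def bigrams(text):
    """Lowercased word bigrams, joined with a space."""
    toks = text.lower().split()
    return [" ".join(p) for p in zip(toks, toks[1:])]

def tfidf(docs):
    """TF-IDF vectors (dicts) over bigram features:
    tf = bigram frequency in the doc, idf = log(N / doc frequency)."""
    N = len(docs)
    grams = [bigrams(d) for d in docs]
    df = {}
    for g in grams:
        for b in set(g):
            df[b] = df.get(b, 0) + 1
    return [{b: (g.count(b) / len(g)) * math.log(N / df[b]) for b in set(g)}
            for g in grams]

vecs = tfidf(["i am so happy today", "i am so sad today"])
```

Note how bigrams shared by every document ("i am", "am so") get zero weight, while the discriminative ones ("so happy", "so sad") survive.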
Using Hashtags to Capture Fine Emotion Categories from Tweets
Extrinsic evaluation: personality detection?
Using Hashtags to Capture Fine Emotion Categories from Tweets (2013)
Saif M. Mohammad and Svetlana Kiritchenko
Presented by Kyungnam Kim
Tweets as a Source of Emotion-Annotated Text
Manual annotation of text with emotions: newspaper headlines, blog sentences
- Basic 6 emotions (joy, sadness, anger, fear, disgust, and surprise)
Emotion labeled by hashtag (#) in tweets
- Labeled by the users themselves
- Basic 6 emotions → 585 fine emotions
- Extraction through Tweet Archivist (http://www.tweetarchivist.com/)
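The self-labeling scheme can be sketched as follows: treat a tweet that ends with one of the six Ekman emotion hashtags as labeled by its author. The exact filtering rules of the Hashtag Emotion Corpus differ in detail; this is a simplified sketch.

```python
import re

# the six Ekman emotion hashtags used to harvest labels
EMOTION_TAGS = {"joy", "sadness", "anger", "fear", "disgust", "surprise"}

def hashtag_label(tweet):
    """If the tweet ends with an emotion hashtag, return
    (text_without_tag, emotion); otherwise None."""
    m = re.search(r"#(\w+)\s*$", tweet)
    if m and m.group(1).lower() in EMOTION_TAGS:
        return tweet[:m.start()].strip(), m.group(1).lower()
    return None

r = hashtag_label("Finally finished my thesis! #joy")
```

Stripping the hashtag from the text matters: otherwise a classifier would simply learn to read the label off the input.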
Hashtag Emotion Corpus: Consistency
Hashtag Emotion Corpus: Emotion Classification in Different Domains
Including target-domain data in the training set improves classification performance
Hashtag Emotion Lexicon
List of words and associated emotions, e.g. victory → 'joy' and 'relief'
Previous (manually built) lexicons:
- WordNet-Affect (1,536 words, 6 emotions)
- NRC Emotion Lexicon (14,000 words, 8 emotions)
Result of evaluation (classifying emotions): WordNet-Affect < Hashtag < NRC
Personality and Emotion
"Persistent situations involving emotions produce persistent traits or personality. … Emotions are considered to be more transient phenomenon whereas personality is more constant."
Detecting Personality
Specific questionnaires to determine personality (Big 5):
- Extroversion
- Neuroticism
- Agreeableness
- Conscientiousness
- Openness
Identifying personality from free-form text:
- Stream-of-consciousness essays
- Collections of Facebook posts
→ Fine emotion categories are useful for determining personality (Big 5)
How About Using Emoticons?
Goal: augment an annotated evaluation corpus with tweets
- Works well for anger in English
- Does every language have the same set and meanings?
- How often do they occur?
- What emotions do they represent?
Machine Learning Approaches: Unsupervised Learning Using Emotion Vectors (Agrawal & An, 2012)
- Extract content (NAVA: noun, adjective, verb, adverb) words from the text
- Identify syntactic dependencies (e.g. negation, adjectival complements and modifiers)
- Use semantic relatedness to compute an emotion vector for each NAVA word: if words commonly appear together, they likely share the same emotion
- What words to use as representatives of emotions?
Representatives: synonyms for the classic emotion labels, taken from a thesaurus; vectors are adjusted based on syntactic dependencies, then the NAVA emotion vectors are aggregated across the sentence by taking their average
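The relatedness-based scoring can be sketched with pointwise mutual information (PMI) over co-occurrence counts: score each NAVA word against a set of emotion seed words, then average the vectors across the sentence. The counts, seed words, and the choice of PMI as the relatedness measure are toy assumptions here, not the exact statistics of Agrawal & An.

```python
import math

def pmi(co, count_a, count_b, total):
    """Pointwise mutual information from raw co-occurrence counts."""
    if co == 0:
        return 0.0
    return math.log((co * total) / (count_a * count_b))

# toy corpus statistics: (NAVA word, emotion seed) co-occurrence counts
CO = {("cry", "sad"): 40, ("cry", "happy"): 2,
      ("party", "happy"): 50, ("party", "sad"): 1}
COUNT = {"cry": 100, "party": 100, "sad": 200, "happy": 200}
TOTAL = 100_000

def emotion_vector(word, seeds=("happy", "sad")):
    """One PMI score per emotion seed word."""
    return [pmi(CO.get((word, s), 0), COUNT[word], COUNT[s], TOTAL)
            for s in seeds]

def sentence_emotion(nava_words, seeds=("happy", "sad")):
    """Average the per-word emotion vectors, pick the top seed."""
    vecs = [emotion_vector(w, seeds) for w in nava_words]
    avg = [sum(col) / len(vecs) for col in zip(*vecs)]
    return seeds[avg.index(max(avg))]

label = sentence_emotion(["cry"])
```

The full method also flips or dampens vectors under negation and other dependency relations before averaging.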
How Well Does This Work?
Comparing unsupervised approaches to UnSED versions on 2 labeled corpora: Alm (fairy tales) and ISEAR (interviews about emotional experiences)
What Are the Challenges in Recognizing Emotion?
What emotion to examine
- Anger? Interest?
- Is emotion the same across languages?
Labeled training material
- Spontaneous or acted?
- How to get consistent labels?
What computational techniques to use
- How large is the corpus? (DNNs?)
- How cleanly is it recorded (for speech)?
- Word embeddings (for text)
Next Class: Charismatic Speech