Presentation Transcript

Your Reactions Suggest You Liked the Movie
Automatic Content Rating via Reaction Sensing
Xuan Bao, Songchun Fan, Romit Roy Choudhury, Alexander Varshavsky, Kevin A. Li

Rating Online Content (Movies)
Manual rating is not incentivized, not easy … and does not reflect the viewing experience

Our Vision
- Overall star rating
- Reaction tags
- Reaction-based highlights

… automatically

Key Intuition
Multi-modal sensing / learning → reactions / ratings

02:43 – Action
09:21 – Hilarious
12:01 – Suspense
……
Overall – 5 Stars

Specific Opportunities
- Visual: facial expressions, eye movements, lip movements …
- Audio: laughter, talking
- Motion: device stability
- Touch screen activities: fast forward, rewind, checking emails and IM chats …
- Cloud: aggregate knowledge from others’ reactions; labeled scores from some users
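These modalities could be gathered into one per-segment feature record before learning; a minimal sketch in Python (the field names and units are illustrative assumptions, not Pulse's actual schema):

```python
from dataclasses import dataclass

@dataclass
class SegmentFeatures:
    """Illustrative multi-modal features for one one-minute movie segment."""
    face_size: float = 0.0        # visual: relative face bounding-box area
    blink_rate: float = 0.0       # visual: blinks per minute
    laugh_frames: int = 0         # audio: frames classified as laughter
    motion_variance: float = 0.0  # motion: accelerometer variance
    seek_events: int = 0          # touch: fast-forward/rewind operations
    cloud_consensus: float = 0.5  # cloud: fraction of users rating segment highly

    def as_vector(self):
        """Flatten into a plain feature vector for a learner."""
        return [self.face_size, self.blink_rate, float(self.laugh_frames),
                self.motion_variance, float(self.seek_events),
                self.cloud_consensus]
```

Each movie viewing would then yield one such vector per segment, feeding the S2R/R2RA pipeline described later.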

Pulse: System Sketch

Applications (Beyond Movie Ratings)
- Annotated movie timeline: slide forward to the action scenes
- Platform for ad analytics: assess which ads grab attention; customize ads based on scenes the user reacts to
- Personalized replays and automatic highlights: a user reacts to a specific tennis shot, the TV shows a personalized replay; highlights of all exciting moments in the Super Bowl game
- Online video courses (MOOCs): may indicate which parts of a lecture need clarification
- Early disease symptom identification: ADHD among young children, and other syndromes

First Step: A Sensor-Assisted Video Player

Pulse Media Player
- Developed on a Samsung Galaxy tablet (Android)
- Sensor meta-data layered on the video as output
- Sensing-thread control
- Observes the user through the front camera
- Media player control functions monitored

Basic Design – Data Distillation Process
- Features from raw sensor readings: microphone, camera, accelerometer, gyroscope, touch, clicks
- Signals to Reactions (S2R): laugh, giggle, doze, still, music …
- Reaction to Rating & Adjective (R2RA): English adjectives, numeric rating
- The cloud assists the distillation
- Output: tag cloud and final rating

Visual Reactions
Facial expressions (face size, eye size, blink, etc.)
- Track the viewer’s face through the front camera (face tracking; eye tracking shown in green, blinks in red; partial faces)
- Track eye position and size (challenging with spectacles)
- Track partial faces (via SURF point matching)
- Detect blinks and lip size by looking for differences between frames
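The frame-difference idea can be sketched in a few lines; this toy version (the threshold and frame format are assumptions) treats each grayscale eye crop as a flat list of pixel values:

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equal-sized grayscale frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)

def detect_blinks(eye_frames, threshold=30.0):
    """Return indices where the eye region changes sharply between
    consecutive frames -- a crude proxy for a blink event."""
    blinks = []
    for i in range(1, len(eye_frames)):
        if mean_abs_diff(eye_frames[i - 1], eye_frames[i]) > threshold:
            blinks.append(i)
    return blinks
```

A production version would run this on eye regions located by a face tracker, not on raw frames.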

Acoustic Reactions
Laughter, conversation, shout-outs …
- Cancel out the (known) movie sound from the recorded sound, then run laughter detection and conversation detection
- Even with knowledge of the original movie audio (blue), it is hard to identify user conversation (distinguishing red and green)
- Spectral energy density comparison alone is not adequate; different techniques are needed for different volume regimes (high volume vs. low volume)
- Early results demonstrate the promise of detecting acoustic reactions
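One simple way to exploit the known movie audio is to compare per-window energy of the recording against the reference track and flag the excess; a hedged sketch (the window size and margin are illustrative, and a real system would work on aligned spectra rather than raw energy):

```python
def window_energy(samples, win):
    """Mean squared amplitude per non-overlapping window."""
    return [sum(x * x for x in samples[i:i + win]) / win
            for i in range(0, len(samples) - win + 1, win)]

def user_sound_windows(recorded, movie_ref, win=4, margin=2.0):
    """Flag windows whose recorded energy exceeds the known movie-audio
    energy by more than `margin` -- residual likely from the user."""
    rec = window_energy(recorded, win)
    ref = window_energy(movie_ref, win)
    return [i for i, (r, m) in enumerate(zip(rec, ref)) if r - m > margin]
```

Flagged windows would then be passed to the laughter/conversation classifiers.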

Motion Reactions
Reactions also leave a footprint in the motion dimension
- Motionless during an intense scene; fidgeting during boredom (e.g., time to stretch during a calm scene)
- Motion readings correlate with changes in ratings
- The timing of motions also correlates with the timing of scene changes
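A stillness-vs-fidgeting signal can be approximated by accelerometer variance over a window; a minimal sketch (the threshold is an arbitrary illustration, not a measured value):

```python
def variance(xs):
    """Population variance of a list of readings."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def classify_motion(accel_window, still_threshold=0.01):
    """Label a window of accelerometer magnitudes as 'still'
    (e.g., an intense scene) or 'fidgeting' (possible boredom)."""
    return "still" if variance(accel_window) < still_threshold else "fidgeting"
```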

Extract Reaction Features – Player Control
Collect users’ player-control operations
- Pause, fast forward, jump, roll back, …
- All slider (seek bar) movement
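Player-control events can be bucketed into per-segment counts as features; a small sketch (the event names and the 60-second segment length are assumptions):

```python
from collections import Counter

def control_features(events, segment_len=60.0):
    """Count player-control operations per movie segment.
    `events` is a list of (timestamp_seconds, op) pairs, where op is
    e.g. 'pause', 'ffwd', 'rewind', or 'seek'."""
    per_segment = {}
    for t, op in events:
        seg = int(t // segment_len)
        per_segment.setdefault(seg, Counter())[op] += 1
    return per_segment
```

A rewind in a segment might indicate interest; a fast-forward, boredom.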

Challenges in Learning

Problem – A Generalized Model Does Not Work
A directly trained model does not capture the rating trend. Why?

The Reason it Does Not Work is …
Human behaviors are heterogeneous
- Users are different
- Environments are different even for the same user (home vs. commute): sensed motion patterns are very different when the same movie is watched during a bus commute vs. in bed at home
- Gyroscope readings differ for the same user at home and at the office
- Naïve solution → build specific models one by one: impossible to acquire data for all <User, Context, Movie> tuples (office, home, commute, …)

Challenges in Learning
Approach: Bootstrap from Reaction Agreements

Approach: Bootstrap from Agreement
Thoughts
- What behavior means positive/negative for a particular setting?
- How do we acquire data without explicitly asking the user every time?

Approach: utilize reactions that most people agree on
- Cloud knowledge (other users’ opinions) marks “climax” and “boring” moments over time
- Pair sensor readings with these consensus ratings

Solution: spawn from consensus
- Learn user reactions during the “climax” and the “boring” moments
- Generalize this knowledge of positive/negative reactions
- Gaussian process regression (GPR) for ratings and SVM for labels
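The bootstrap step fits a Gaussian process regression to consensus-labeled segments; as a self-contained illustration, the GP posterior mean with an RBF kernel can be computed directly (this sketch omits the predictive variance and hyperparameter fitting that a full GPR would include):

```python
import math

def rbf(x, z, length=1.0):
    """Squared-exponential (RBF) kernel on scalars."""
    return math.exp(-((x - z) ** 2) / (2 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gpr_mean(train_x, train_y, query_x, noise=1e-3):
    """GP posterior mean: k_*^T (K + noise*I)^{-1} y."""
    n = len(train_x)
    K = [[rbf(train_x[i], train_x[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, list(train_y))
    return [sum(rbf(q, train_x[i]) * alpha[i] for i in range(n))
            for q in query_x]
```

Here `train_x` would be segment features at consensus “climax”/“boring” moments and `train_y` their agreed ratings; `query_x` covers the remaining segments.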

Evaluation

User Experiment Setting
- 11 participants watch preloaded movies (~50 movie viewings)
- 2 comedies, 2 dramas, 1 horror movie, 1 action movie
- Users provide manual ratings and labels for ground truth
- We compare Pulse’s ratings with the manual ratings

Preliminary Results – Final (5-Star) Rating

[Confusion matrix: Pulse-predicted star ratings (1–5) vs. users’ true manual ratings.]

Preliminary Results – Final (5-Star) Rating
Difference from the true 5-star manual rating

Preliminary Results – Myth behind the Error
- Final ratings can deviate significantly from the average segment ratings
- User-given scores may not be linearly related to quality

Preliminary Results – Lower Segment Rating Error
- Final ratings come from averaging segment ratings
- Our system outperforms the other methods

[Bar chart: mean error (5-point scale) of per-segment ratings for random ratings, collaborative filtering, and our system.]

Preliminary Results – Better Tag Quality
Tags capture users’ feelings better than SVM alone

[Tag clouds: “Happy”, “Intense”, “Warm” for SVM alone vs. our system.]

Preliminary Results – Reasonable Energy Overhead
- Reasonable energy overhead compared to playback without sensing
- More tolerable on tablets; may need duty-cycling on smartphones
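Duty-cycling trades sensing coverage for energy; a back-of-envelope sketch (the power numbers are made-up placeholders, not measured Pulse values):

```python
def sensing_energy(duration_s, sensors, duty_cycle=1.0):
    """Rough energy estimate (joules) for duty-cycled sensing.
    `sensors` maps sensor name -> active power draw in watts
    (illustrative figures, not measurements)."""
    active_power = sum(sensors.values())
    return active_power * duration_s * duty_cycle

# Hypothetical budget for a 90-minute movie:
budget = {"camera": 0.6, "microphone": 0.1, "accelerometer": 0.02}
full = sensing_energy(5400, budget)           # always-on sensing
cycled = sensing_energy(5400, budget, 0.25)   # sample 25% of the time
```

Sampling a quarter of the time cuts sensing energy to a quarter, at the cost of possibly missing short reactions.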

Closing Thoughts
- Human reactions are in the mind; however, they manifest as bodily gestures and activities
- Rich, multi-modal sensors on mobile devices: a wider net for “catching” these reactions
- Pulse is an attempt to realize this opportunity: distilling semantic meaning from sensor streams
- Rating movies … tagging any content with reaction meta-data
- Enabler for recommendation engines, content/video search, information retrieval and summarization

Thoughts?

Backup – Potential Questions
Privacy concern
- Like every technology, Pulse may attract early adopters
- If only final ratings are uploaded, the privacy level is similar to current ratings

Why not just emotion sensing / just laughter detection?
- Emotion sensing is a broad and challenging problem, but its goal is different from ours (rating)
- Explicit signs like laughter usually account for only a small fraction of movie viewing; we need to exploit other opportunities (motion)
- Our approach takes advantage of the specific task: 1. we know the user is watching a movie; 2. we can observe the user for a longer duration (than most emotion-sensing work); 3. we know other users’ opinions

How is this possible… the human mind is too complex
- Human thoughts are complicated, but they may produce footprints in behavior
- Collaborative filtering explicitly uses knowledge of other users’ thoughts to bootstrap our algorithm

The sample size is small… only 11 users
- The sample size is limited, but each user watched multiple movies (50+ viewings), and segment ratings are for 1-minute segments (thousands of data points)
- Collaborative filtering shows that even within this data set the ratings can diverge, and the naïve solution does not work as well as ours

Preliminary Results – Better Retrieval Accuracy
- Viewers care more about the highlights of a movie
- Find the contribution made by sensing

[Chart: gain vs. additional error, total goal, and overall achieved performance.]

Challenges in Learning

Problem – A Generalized Model Does Not Work
A directly trained model does not capture the rating trend. Why?

The Reason it Does Not Work is …
Human behaviors are heterogeneous
- Users are different
- Environments are different (e.g., home vs. commute): sensed motion patterns are very different when the same movie is watched during a bus commute vs. in bed at home
- Impact on sensor readings → histograms
- Naïve solution → build specific models one by one: impossible to acquire data for all <User, Context, Movie> tuples (office, home, commute, …)

Challenges in Learning
Approach: Bootstrap from Reaction Agreements

Approach: Bootstrap from Agreement
- What behavior means positive/negative for a particular setting?
- How do we acquire data without explicitly asking the user every time?
- Utilize reactions that most people agree on: cloud knowledge (other users’ opinions) marks “climax” and “boring” moments; pair sensor readings with these consensus ratings
- Solution: spawn from consensus – learn user reactions during the “climax” and the “boring” moments, then generalize this knowledge of positive/negative reactions
- Gaussian process regression (GPR) for ratings and SVM for labels

Approach: Bootstrap from Agreement
A Simple Example of GPR

Approach: Bootstrap from Agreement
On GPR and SVM – SVM
- SVM is a supervised learning method for classification
- It identifies hyperplanes in high-dimensional space that best separate the observed samples
- For our purpose, we used a non-linear SVM with an RBF kernel for its wide applicability
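A true SVM maximizes the margin via quadratic programming; as a lightweight stand-in that still shows how an RBF kernel yields a non-linear decision boundary, here is a kernel perceptron (this is not the SVM used in the paper, just an illustration of kernelized classification):

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel on equal-length tuples/vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def train_kernel_perceptron(X, y, epochs=20, gamma=1.0):
    """y in {-1, +1}. Returns dual coefficients alpha (one per sample),
    incremented whenever a sample is misclassified."""
    alpha = [0.0] * len(X)
    for _ in range(epochs):
        for i, xi in enumerate(X):
            pred = sum(alpha[j] * y[j] * rbf_kernel(X[j], xi, gamma)
                       for j in range(len(X)))
            if y[i] * pred <= 0:      # misclassified -> update
                alpha[i] += 1.0
    return alpha

def predict(alpha, X, y, x, gamma=1.0):
    """Sign of the kernel expansion at point x."""
    s = sum(alpha[j] * y[j] * rbf_kernel(X[j], x, gamma)
            for j in range(len(X)))
    return 1 if s > 0 else -1
```

On XOR-style data, which no linear separator can handle, the RBF kernel makes the classes separable.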

User Experiment Setting
- 11 participants watch preloaded movies (46 movie viewings)
- Two comedies, two dramas, one horror movie, one action movie
- Users give manual ratings and labels
- Evaluate by comparing generated ratings with the manual ratings

Evaluation

Preliminary Results – Good Final Rating

[Confusion matrix: Pulse-predicted star ratings (1–5) vs. users’ true manual ratings.]

Preliminary Results – Myth behind the Error
- Final ratings can deviate significantly from the segment ratings
- User-given scores may not be linearly related to quality

Preliminary Results – Lower Segment Rating Error
- Final ratings come from averaging the ratings for each segment
- Our system outperforms the other methods

[Bar chart: mean error (5-point scale) over movie segments for random ratings, collaborative filtering, and our system.]

Preliminary Results – Better Retrieval Accuracy
- Viewers care more about the highlights of a movie
- Find the contribution made by sensing

[Chart: gain vs. additional error, total goal, and overall achieved performance.]

Preliminary Results – Better Tag Quality
Generated tags capture users’ feelings much better than using SVM alone

[Tag clouds: “Happy”, “Intense”, “Warm” for SVM alone vs. our system.]

Preliminary Results – Reasonable Energy Overhead
- Reasonable energy overhead compared to playback without sensing
- More tolerable on tablets; may need duty-cycling on smartphones

Closing Thoughts
- Human reactions are in the mind; however, they manifest as bodily gestures and activities
- Rich, multi-modal sensors on mobile devices: an opportunity for “catching” these activities
- Multi-modal capability – the whole is greater than the sum of its parts
- Pulse is an attempt to realize this opportunity: distilling semantic meaning from sensor streams
- Rating movies … tagging any content with reaction meta-data
- Enabler for recommendation engines, content/video search, information retrieval and summarization

Questions?

Extract Reaction Features – Player Control
- Player control and taps: pause, fast forward, jump, roll back, …
- All slider (seek bar) movement
