
Your Reactions Suggest You Liked the Movie - PowerPoint Presentation

katrgolden
Uploaded On 2020-06-30



Presentation Transcript

Slide1

Your Reactions Suggest You Liked the Movie

Automatic Content Rating via Reaction Sensing

Xuan Bao, Songchun Fan, Romit Roy Choudhury, Alexander Varshavsky, Kevin A. Li

Slide2

Rating Online Content (Movies)

Manual rating not incentivized, not easy … does not reflect experience

Slide3

Our Vision

Overall star rating

Reaction tags

Reaction-based highlights

Slide4

Our Vision

Overall star rating

Reaction tags

Reaction-based highlights

Automatically

Slide5

Key Intuition

Multi-modal sensing / learning

Reactions / Ratings

02:43 - Action

09:21 - Hilarious

Overall – 5 Stars

……

12:01 - Suspense

Slide6

Specific Opportunities

Visual: facial expressions, eye movements, lip movements …

Audio: laughter, talking

Motion: device stability

Touch screen activities: fast forward, rewind, checking emails and IM chats …

Cloud: aggregate knowledge from others’ reactions; labeled scores from some users

Slide7

Pulse: System Sketch

Slide8

Applications (Beyond Movie Ratings)

Annotated movie timeline: slide forward to the action scenes

Platform for ad analytics: assess which ads are grabbing attention … customize ads based on scenes that the user reacts to

Personalized replays and automatic highlights: user reacts to a specific tennis shot, TV shows a personalized replay; highlights of all exciting moments in the Super Bowl game

Online video courses (MOOCs): may indicate which parts of a lecture need clarification

Early disease symptom identification: ADHD among young children, and other syndromes

Slide9

First Step: A Sensor Assisted Video Player

Slide10

Developed on Samsung Galaxy tablet (Android)

Sensor meta-data layered on video as output

Sensing threads control

Observe the user from front cam

Media player control functions monitored

Pulse Media Player

Slide11

Basic Design

Features from Raw Sensor Readings

Microphone, Camera, Acc, Gyro, Touch, Clicks

Reactions: Laugh, Giggle, Doze, Still, Music …

Signals to Reactions (S2R)

Reaction to Rating & Adjective (R2RA)

English Adjectives

Numeric Rating

Tag Cloud

Final Rating

Data Distillation Process

Slide12

Basic Design

Features from Raw Sensor Readings

Microphone, Camera, Acc, Gyro, Touch, Clicks

Reactions: Laugh, Giggle, Doze, Still, Music …

Signals to Reactions (S2R)

Reaction to Rating & Adjective (R2RA)

English Adjectives

Numeric Rating

Tag Cloud

Final Rating

Data Distillation Process

Cloud

Slide13

Visual Reactions

Facial expressions (face size, eye size, blink, etc.)

Track viewers’ face through the front camera

Track eye position and size (challenging with spectacles)

Track partial faces (via SURF points matching)

Face Tracking

Eye Tracking (Green)

Blink (Red)

Partial Face

Slide14

Visual Reactions

Facial expressions (face size, eye size, blink, etc.)

Track viewers’ face through the front camera

Track eye position and size (challenging with spectacles)

Track partial faces (via SURF points matching)

Detect blinks, lip size

Look for difference between frames
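The "difference between frames" idea can be illustrated with a minimal sketch. It assumes grayscale frames given as nested lists of pixel intensities, already cropped to the eye or lip region; the function names and the `threshold` value are hypothetical and not taken from the slides.

```python
def frame_difference(prev, curr):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    return total / count


def detect_changes(frames, threshold=20.0):
    """Return indices i where frame i differs strongly from frame i - 1.

    `threshold` is an illustrative value, not one reported by the authors.
    """
    return [i for i in range(1, len(frames))
            if frame_difference(frames[i - 1], frames[i]) > threshold]


# Toy 2x2 "eye region": bright when open, darker when closed (made-up values).
open_eye = [[200, 200], [200, 200]]
closed_eye = [[90, 90], [90, 90]]
frames = [open_eye, open_eye, closed_eye, open_eye]
changes = detect_changes(frames)
print(changes)  # [2, 3]: the eye closes at frame 2 and reopens at frame 3
```

A close followed shortly by a reopen, as in the toy sequence, is the pattern one would treat as a blink candidate.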

Slide15

Acoustic Reactions

Laughter, conversation, shout-outs …

Cancel out (known) movie sound from recorded sound

Laughter detection, conversation detection

Even with knowledge of the original movie audio (blue), it is hard to identify user conversation (distinguish red and green)

Slide16

Acoustic Reactions

Separating movie from user’s audio

Spectral energy density comparison not adequate

Different techniques for different volume regimes

High Volume

Low Volume

Slide17

Acoustic Reactions

Laughter, conversation, shout-outs …

Cancel out (known) movie sound from recorded sound

Laughter detection, conversation detection

Early results demonstrate promise of detecting acoustic reactions
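As a rough illustration of "cancel out (known) movie sound", here is a naive time-domain baseline. It assumes the recording and the movie track are already time-aligned and sampled identically, and it fits a single gain by least squares. The slides do not describe the authors' actual technique (they note that simple comparisons are inadequate and that different volume regimes need different handling), so this is only a sketch with hypothetical function names.

```python
def cancel_reference(recorded, reference):
    """Subtract a least-squares-scaled copy of the known movie audio."""
    num = sum(r * m for r, m in zip(recorded, reference))
    den = sum(m * m for m in reference)
    gain = num / den if den else 0.0
    return [r - gain * m for r, m in zip(recorded, reference)]


def frame_energy(signal, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [sum(s * s for s in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


# Toy signals (made-up): the movie plays at half volume through the speaker,
# and the user makes a sound only during the second half of the recording.
movie = [1.0, -1.0] * 4
user = [0.0] * 4 + [0.5] * 4
recorded = [0.5 * m + u for m, u in zip(movie, user)]

residual = cancel_reference(recorded, movie)
energies = frame_energy(residual, frame_len=4)
print(energies)  # energy is near zero in the first frame, larger in the second
```

Frames whose residual energy stays high after cancellation are candidates for user-generated sound such as laughter or conversation.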

Slide18

Motion Reactions

Reactions also leave footprint on motion dimensions

Motionless during intense scene

Fidget during boredom

Intense scene

Calm scene

Time to stretch

Slide19

Motion Reactions

Reactions also leave footprint on motion dimensions

Motionless during intense scene

Fidget during boredom

Slide20

Motion Reactions

Reactions also leave footprint on motion dimensions

Motionless during intense scene

Fidget during boredom

Motion readings correlate with changes in ratings …

Slide21

Motion Reactions

Motion readings correlate with changes in ratings …

Timing of motions also correlates with timing of scene changes

Reactions also leave footprint on motion dimensions

Motionless during intense scene

Fidget during boredom
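One simple way to turn "motionless vs. fidget" into a feature is the spread of accelerometer magnitude per time window. The sketch below assumes (x, y, z) accelerometer samples; the window size, the `still_threshold` value and the function names are hypothetical choices for illustration, not the authors' parameters.

```python
import math
import statistics


def motion_level(samples, window):
    """Population std-dev of acceleration magnitude per non-overlapping window."""
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    return [statistics.pstdev(mags[i:i + window])
            for i in range(0, len(mags) - window + 1, window)]


def label_windows(levels, still_threshold=0.05):
    """Label each window 'still' or 'fidget' (threshold is illustrative)."""
    return ["still" if lv < still_threshold else "fidget" for lv in levels]


# Made-up samples: a device resting flat, then being shifted around.
still = [(0.0, 0.0, 9.8)] * 4
fidget = [(0.0, 0.0, 9.8), (3.0, 0.0, 9.8), (0.0, 4.0, 9.8), (1.0, 1.0, 9.8)]
labels = label_windows(motion_level(still + fidget, window=4))
print(labels)  # ['still', 'fidget']
```

As later slides point out, a fixed threshold like this will not transfer across users or contexts (home vs. commute), which is why the deck moves on to per-user learning.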

Slide22

Extract Reaction Features – Player control

Collect users’ player control operations

Pause, fast forward, jump, roll back, …

All slider movement

Seek bar
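A minimal sketch of turning player-control operations into per-segment features. It assumes events are logged as (movie time in seconds, event name) pairs and grouped into 1-minute segments; the event names and the `control_features` helper are hypothetical.

```python
from collections import Counter


def control_features(events, segment_len_s=60):
    """Count each player-control event type per fixed-length movie segment.

    events: iterable of (movie_time_seconds, event_name) pairs.
    Returns {segment_index: Counter of event names}.
    """
    features = {}
    for t, name in events:
        segment = int(t // segment_len_s)
        features.setdefault(segment, Counter())[name] += 1
    return features


# Made-up event log for illustration.
events = [(12.0, "pause"), (15.5, "rewind"),
          (70.0, "fast_forward"), (95.0, "fast_forward")]
feats = control_features(events)
print(feats[1]["fast_forward"])  # 2 fast-forwards during the second minute
```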

Slide23

Challenges in Learning

Slide24

Problem – A Generalized Model Does Not Work

Directly trained model does not capture the rating trend

Why?

Slide25

The Reason it Does Not Work is …

Human behaviors are heterogeneous

Users are different

Environments are different even for the same user (home vs. commute)

Home

Commute

Sensed motion patterns are very different when the same movie is watched during a bus commute vs. in bed at home.

Slide26

The Reason it Does Not Work is …

Human behaviors are heterogeneous

Users are different

Environments are different even for the same user (home vs. commute)

Gyroscope readings from same user (at home and office)

Slide27

The Reason it Does Not Work is …

Human behaviors are heterogeneous

Users are different

Environments are different even for the same user (home vs. commute)

Gyroscope readings from same user (at home and office)

Naïve solution → build specific models one by one

Impossible to acquire data for all <User, Context, Movie> tuples

Office

Home

Commute

…

Slide28

Challenges in Learning

Approach: Bootstrap from Reaction Agreements

Slide29

Approach: Bootstrap from Agreement

Thoughts

What behavior means positive/negative for a particular setting

How do we acquire data without explicitly asking the user every time

Approach: Utilize reactions that most people agree on

Time

Climax

Boring

Cloud Knowledge

(Other users’ opinions)

Sensor Reading

Ratings

Slide30

Approach: Bootstrap from Agreement

Solution: spawn from consensus

Learn user reactions during the “climax” and the “boring” moments

Generalize this knowledge of positive/negative reactions

Gaussian process regression (ratings) and SVM (labels)

GPR

SVM
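The bootstrapping step can be sketched as follows: pick the segments on which other users' ratings agree, label them "climax" or "boring", and pair those labels with this user's own sensor features to form a personal training set. The thresholds (`high`, `low`, `max_spread`), the data layout and the function names are assumptions made for illustration; the slides do not specify how consensus is measured.

```python
import statistics


def consensus_segments(cloud_ratings, high=4.0, low=2.0, max_spread=0.5):
    """Label segments on which other users agree.

    cloud_ratings: {segment_index: [ratings from other users]}.
    Returns {segment_index: "climax" or "boring"}; segments with disagreement
    or a middling average are left out.
    """
    labels = {}
    for seg, ratings in cloud_ratings.items():
        if len(ratings) < 2 or statistics.pstdev(ratings) > max_spread:
            continue  # not enough agreement to trust this segment
        mean = statistics.mean(ratings)
        if mean >= high:
            labels[seg] = "climax"
        elif mean <= low:
            labels[seg] = "boring"
    return labels


def build_training_set(user_features, labels):
    """Pair this user's sensor features with consensus labels."""
    return [(user_features[seg], label)
            for seg, label in sorted(labels.items()) if seg in user_features]


# Made-up ratings from other users, and made-up motion features for this user.
cloud = {0: [5, 5, 4], 1: [1, 2, 1], 2: [1, 5, 3], 3: [3, 3, 3]}
user_features = {0: [0.01], 1: [0.40], 2: [0.20]}

labels = consensus_segments(cloud)
training = build_training_set(user_features, labels)
print(training)  # [([0.01], 'climax'), ([0.4], 'boring')]
```

A per-user model trained on such pairs can then be applied to the remaining segments, which is where the GPR (ratings) and SVM (labels) mentioned on the slide would come in.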

Slide31

Evaluation

Slide32

User Experiment Setting

11 participants watch preloaded movies (~50 movies)

2 comedies, 2 dramas, 1 horror movie, 1 action movie

Users provide manual ratings and labels for ground truth

We compare Pulse’s ratings with manual ratings

Slide33

Preliminary Results – Final (5 Star) Rating

[Table: Pulse ratings vs. true (manual) ratings on the 1–5 star scale; cell layout lost in extraction]

Slide34

Preliminary Results – Final (5 Star) Rating

Difference with true 5 star manual rating

Slide35

Preliminary Results – Myth behind the Error

Final ratings can deviate significantly from the average segment ratings

User-given scores may not be linearly related to quality

Slide36

Preliminary Results – Lower Segment Rating Error

Final ratings come from averaging segment ratings

Our system outperforms other methods

[Figure: mean error (5-point scale) on per-segment ratings for random ratings, collaborative filtering, and our system]
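For reference, the two quantities on this slide can be computed as below. Mean absolute error is one plausible reading of "mean error"; the slides do not define it precisely, and the ratings used here are invented for illustration.

```python
def final_rating(segment_ratings):
    """Final rating as the plain average of per-segment ratings."""
    return sum(segment_ratings) / len(segment_ratings)


def mean_abs_error(predicted, truth):
    """Mean absolute difference between predicted and manual ratings."""
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / len(truth)


# Made-up per-segment ratings on the 5-point scale.
predicted = [3, 4, 4, 2]
truth = [3, 5, 4, 1]

segment_error = mean_abs_error(predicted, truth)
overall = final_rating(predicted)
print(segment_error, overall)  # 0.5 3.25
```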

Slide37

Preliminary Results – Better Tag Quality

Tags capture users’ feelings better than SVM alone

[Figure: two tag clouds, each showing the tags Happy, Intense, Warm]

Slide38

Preliminary Results – Reasonable Energy Overhead

Reasonable energy overhead compared to without sensing

More tolerable on tablets. May need duty-cycling on smart phones
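Duty-cycling means sensing only during short windows instead of continuously. A minimal sketch of such a schedule follows; the period and on-time values are hypothetical, and the slides do not say what schedule, if any, the authors used.

```python
def duty_cycle_schedule(duration_s, period_s, on_s):
    """Return (start, end) sensing windows: sense for on_s seconds every period_s."""
    if not 0 < on_s <= period_s:
        raise ValueError("on_s must be positive and no longer than period_s")
    windows = []
    start = 0
    while start < duration_s:
        windows.append((start, min(start + on_s, duration_s)))
        start += period_s
    return windows


def duty_ratio(windows, duration_s):
    """Fraction of the total time during which sensors are active."""
    return sum(end - start for start, end in windows) / duration_s


# Illustrative numbers: sense 5 s out of every 20 s over one minute.
windows = duty_cycle_schedule(duration_s=60, period_s=20, on_s=5)
ratio = duty_ratio(windows, 60)
print(windows, ratio)  # [(0, 5), (20, 25), (40, 45)] 0.25
```

The trade-off is that short reactions falling between windows are missed, so the on-time would need to be tuned against detection quality.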

Slide39

Closing Thoughts

Human reactions are in the mind; however, they manifest in bodily gestures and activities

Rich, multi-modal sensors on mobile devices: a wider net for “catching” these reactions

Pulse is an attempt to realize this opportunity: distilling semantic meanings from sensor streams; rating movies … tagging any content with reaction metadata

Enabler for recommendation engines; content/video search; information retrieval, summarization

Slide40

Thoughts?

Slide41

Backup – potential questions

Privacy concern

Like every technology, Pulse may attract early adopters

If only final ratings are uploaded, the privacy level is similar to current ratings

Why not just emotion sensing / just laughter detection?

Emotion sensing is a broad and challenging problem … but the goal is different from ours (rating) …

Explicit signs like laughter usually account for only a small duration of movie viewing; we need to explore other opportunities (motion)

Our approach takes advantage of the specific task: 1. we know the user is watching a movie; 2. we can observe the user for a longer duration (than most emotion sensing work); 3. we know other users’ opinions

How is this possible … the human mind is too complex

Human thoughts are complicated … but they may produce footprints in behaviors

Collaborative filtering explicitly uses knowledge of other users’ thoughts to bootstrap our algorithm

The sample size is small … only 11 users

The sample size is limited, but each user watched multiple movies (50+ movies viewed) … segment ratings are for 1-minute segments (thousands of points)

Collaborative filtering shows that even within this data set, the ratings can diverge and the naïve solution does not work as well as ours

Slide42

Preliminary Results – Better Retrieval Accuracy

Viewers care more about the highlights of a movie

Find the contribution by using sensing

Gain

Additional error

Total goal

Overall achieved performance

Slide43

Challenges in Learning

Slide44

Problem – A Generalized Model Does Not Work

Directly trained model does not capture the rating trend

Why?

Slide45

The Reason it Does Not Work is …

Human behaviors are heterogeneous

Users are different

Environments are different (e.g., home vs. commute)

Home

Commute

Sensed motion patterns are very different when the same movie is watched during a bus commute vs. in bed at home.

Slide46

The Reason it Does Not Work is …

Human behaviors are heterogeneous

Users are different

Environments are different (e.g., home vs. commute)

Impact on sensor readings → histograms

Slide47

The Reason it Does Not Work is …

Human behaviors are heterogeneous

Users are different

Environments are different (e.g., home vs. commute)

Impact on sensor readings → histograms

Naïve solution → build specific models one by one

Impossible to acquire data for all <User, Context, Movie> tuples

Office

Home

Commute

…

Slide48

Challenges in Learning

Approach: Bootstrap from Reaction Agreements

Slide49

Approach: Bootstrap from Agreement

Thoughts

What behavior means positive/negative for a particular setting

How do we acquire data without explicitly asking the user every time

Approach: Utilize reactions that most people agree on

Time

Climax

Boring

Cloud Knowledge

(Other users’ opinions)

Sensor Reading

Ratings

Slide50

Approach: Bootstrap from Agreement

Solution: spawn from consensus

Learn user reactions during the “climax” and the “boring” moments

Generalize this knowledge of positive/negative reactions

Gaussian process regression (ratings) and SVM (labels)

GPR

SVM

Slide51

Approach: Bootstrap from Agreement

GPR

A Simple Example of GPR
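The figure for this slide did not survive extraction. As a stand-in, here is a small pure-Python sketch of Gaussian process regression with an RBF kernel on a one-dimensional feature. The kernel choice, the constant prior mean (set to the training mean), the `noise` and `length_scale` values and the toy data are all assumptions for illustration; the slides do not state the authors' GPR configuration.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential (RBF) kernel for scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gpr_predict(train_x, train_y, test_x, noise=1e-6, length_scale=1.0):
    """Posterior mean of a GP whose prior mean is the training-target average."""
    mean_y = sum(train_y) / len(train_y)
    k = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    alpha = solve(k, [y - mean_y for y in train_y])
    return [mean_y + sum(rbf(x, xi, length_scale) * a
                         for xi, a in zip(train_x, alpha))
            for x in test_x]


# Made-up data: a motion-level feature mapped to a segment rating.
train_x = [0.0, 1.0, 2.0]
train_y = [5.0, 3.0, 1.0]
fitted = gpr_predict(train_x, train_y, train_x)
between = gpr_predict(train_x, train_y, [0.5])[0]
print(fitted, between)
```

With a small noise term the posterior mean passes almost exactly through the training points and interpolates smoothly between them, which is the behavior a "simple example of GPR" plot typically shows.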

Slide52

Approach: Bootstrap from Agreement

On GPR and SVM - SVM

SVM is a supervised learning method for classification

Identify hyperplanes in high-dimensional space that can best separate observed samples

For our purpose, we used non-linear SVM with RBF kernel for its wide applicability
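For readers unfamiliar with the RBF kernel, its standard textbook form and the resulting SVM decision function are given below. The symbols (kernel width \gamma, support-vector weights \alpha_i, labels y_i \in \{-1, +1\}, bias b) are the usual notation and do not appear on the slides.

```latex
K(\mathbf{x}, \mathbf{x}') = \exp\!\left(-\gamma \,\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}\right), \qquad \gamma > 0

f(\mathbf{x}) = \operatorname{sign}\!\left(\sum_{i} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b\right)
```

The kernel lets the separating hyperplane be linear in an implicit high-dimensional space while being non-linear in the original sensor-feature space.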

Slide53

User Experiment Setting

11 participants watch preloaded movies (46 movies)

Two comedies, two dramas, one horror movie, one action movie

Users give manual ratings and labels

Evaluate by comparing generated ratings with manual ratings

Slide54

Evaluation

Slide55

Preliminary Results – Good Final Rating

[Table: Pulse ratings vs. true (manual) ratings on the 1–5 star scale; cell layout lost in extraction]

Slide56

Preliminary Results – Myth behind the Error

Final ratings can deviate significantly from segment ratings

User-given scores may not be linearly related to quality

Slide57

Preliminary Results – Lower Segment Rating Error

Final ratings come from averaging ratings for each segment

Our system outperforms other methods

[Figure: mean error (5-point scale) over movie segments for random ratings, collaborative filtering, and our system]

Slide58

Preliminary Results – Better Retrieval Accuracy

Viewers care more about the highlights of a movie

Find the contribution by using sensing

Gain

Additional error

Total goal

Overall achieved performance

Slide59

Preliminary Results – Better Tag Quality

Generated tags capture users’ feelings much better than using SVM alone

[Figure: two tag clouds, each showing the tags Happy, Intense, Warm]

Slide60

Preliminary Results – Reasonable Energy Overhead

Reasonable energy overhead compared to without sensing

More tolerable on tablets. May need duty-cycling on smart phones

Slide61

Closing Thoughts

Human reactions are in the mind; however, they manifest in bodily gestures and activities

Rich, multi-modal sensors on mobile devices: an opportunity for “catching” these activities

Multi-modal capability: the whole is greater than the sum of its parts

Pulse is an attempt to realize this opportunity: distilling semantic meanings from sensor streams; rating movies … tagging any content with reaction metadata

Enabler for recommendation engines; content/video search; information retrieval, summarization

Slide62

Questions?

Slide63

Extract Reaction Features – Player control

Player control and taps

Pause, fast forward, jump, roll back, …

All slider movement

Seek bar
