Slide1
Personalized Celebrity Video Search Based on Cross-space Mining
Zhengyu Deng, Jitao Sang, Changsheng Xu
1 Institute of Automation, Chinese Academy of Sciences
2 Chinese-Singapore Institute of Digital Media
Slide2
Outline
Motivation
Framework
Approach
Experiment
Conclusions
Slide3
Motivation
Celebrities are often popular in multiple fields and user interests are diverse.
[Figure: User 1, User 2, and User 3 like different Beckham videos: a sports video, an entertainment video, and an interview video.]
Slide4
Motivation
Celebrities are often popular in multiple fields and user interests are diverse.
[Figure: a single user likes a sports video, an entertainment video, and a music video; the celebrities shown are Beckham, Bieber, and Lady Gaga.]
Slide5
Non-personalized search
[Figure: a non-personalized search for "David Beckham" returns a mix of daily life, interview, and sports videos.]
Motivation
Slide6
Problem and solution
Motivation
Slide7
[Figure: users are placed in an interest space and celebrities in a popularity space, each by topic modeling. The two spaces are connected by a "Map" step, followed by a "Re-rank" step.]
2014/8/30
Framework
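Slide8
Approach
[Figure: users U1, U2, …, Um are linked to the interest-space topics Z1, Z2, …, Zp, and celebrities C1, C2, …, Cn are linked to the popularity-space topics X1, X2, …, Xq. Both topic spaces are learned with LDA over a shared vocabulary W1, W2, …, Wx. Edge labels: P(Z_i|U_i), P(W_i|Z_i), P(T_i|C_i), P(W_i|T_i), and P(Z_i|X_i). The diagram also marks a random walk step and a KL-Divergence step.]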
Slide9
Random walk
v_j is the initial probabilistic score; p_ij is the (i, j) entry of the transition matrix; r_k(j) denotes the relevance score of node j at iteration k. The scores are updated as in (1), which can be rewritten as (2); the unique solution is (3).
Approach
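A minimal sketch of the random walk iteration, assuming the common restart form r_k(j) = alpha * sum_i r_{k-1}(i) * p_ij + (1 - alpha) * v_j. The weight `alpha`, the tolerance, the function name `random_walk`, and the toy numbers are assumptions for illustration; the slides may use a different formulation.

```python
def random_walk(P, v, alpha=0.85, tol=1e-12, max_iter=10000):
    """Iterate r_k(j) = alpha * sum_i r_{k-1}(i) * P[i][j] + (1 - alpha) * v[j].

    P is a row-stochastic transition matrix (list of rows) and v holds the
    initial scores. alpha, tol and max_iter are assumed, illustrative settings.
    """
    n = len(v)
    r = list(v)  # r_0 starts from the initial scores
    for _ in range(max_iter):
        nxt = [
            alpha * sum(r[i] * P[i][j] for i in range(n)) + (1 - alpha) * v[j]
            for j in range(n)
        ]
        # Stop once no score moves by more than tol between iterations.
        if max(abs(a - b) for a, b in zip(nxt, r)) < tol:
            return nxt
        r = nxt
    return r


# Toy example: three nodes, row-stochastic transitions, initial scores sum to 1.
P = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
v = [0.6, 0.3, 0.1]
scores = random_walk(P, v)
```

Under this assumed form, with 0 < alpha < 1 and a row-stochastic P, each update is a contraction, so the iteration converges to a single fixed point regardless of the starting scores.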
Slide10
KL-Divergence
Topic Z_i (X_j) is from the interest (popularity) space. The KL-Divergence between them is given by (4), where P(w|Z_i) and P(w|X_j) denote the distribution scores of the two topics on word w. The similarity of topics Z_i and X_j is defined as the inverse of the KL-Divergence (5).
Approach
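A minimal sketch of the topic comparison, using the standard definition D_KL(p || q) = sum_w p(w) * log(p(w) / q(w)) and taking the similarity as its inverse. The direction of the divergence (interest topic first), the `eps` smoothing, the function names, and the toy distributions are assumptions, not details taken from the slides.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) = sum_w p[w] * log(p[w] / q[w]) over a shared vocabulary.

    eps floors q[w] so a zero in q does not divide by zero (an assumed
    smoothing choice).
    """
    total = sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)
    return max(total, 0.0)  # guard against tiny negative rounding error


def topic_similarity(p, q, eps=1e-12):
    """Similarity as the inverse of the KL-Divergence; eps avoids 1 / 0."""
    return 1.0 / (kl_divergence(p, q) + eps)


# Toy word distributions over a four-word vocabulary (made-up numbers).
interest_topic = [0.5, 0.3, 0.1, 0.1]
popularity_close = [0.4, 0.3, 0.2, 0.1]
popularity_far = [0.1, 0.1, 0.3, 0.5]

sim_close = topic_similarity(interest_topic, popularity_close)
sim_far = topic_similarity(interest_topic, popularity_far)
```

Because the KL-Divergence is not symmetric, swapping the two arguments generally changes the similarity; a symmetrized variant would be an equally plausible reading.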
Slide11
Video Projection
Given a celebrity video, project it onto the interest space as in (6), where K is the number of topics in the interest space and M is the dimension of the vocabulary.
Approach
Slide12
Video re-ranking
Given a user and a celebrity, the ranking score is given by (7), where K (L) is the number of topics in the interest (popularity) space, Z_i (X_j) is the i-th (j-th) topic of the interest (popularity) space, and P(Z_i|X_j) is approximated by the inverse of the KL-Divergence.
Approach
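One plausible reading of the re-ranking step can be sketched as follows, assuming the score sums over both topic spaces: score = sum_i sum_j P(Z_i|u) * P(Z_i|X_j) * P(X_j|v). The exact form of the score, the per-column normalization of the inverse KL-Divergence, the function names, and the toy data (including the video names) are all assumptions for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) over a shared vocabulary; eps floors q (assumed smoothing)."""
    total = sum(a * math.log(a / max(b, eps)) for a, b in zip(p, q) if a > 0)
    return max(total, 0.0)


def topic_mapping(interest_topics, popularity_topics, eps=1e-9):
    """Approximate P(Z_i | X_j) by inverse KL, normalized over i for each j."""
    inv = [[1.0 / (kl_divergence(z, x) + eps) for x in popularity_topics]
           for z in interest_topics]
    col_sums = [sum(row[j] for row in inv) for j in range(len(popularity_topics))]
    return [[row[j] / col_sums[j] for j in range(len(row))] for row in inv]


def score(user_interest, video_popularity, mapping):
    """sum_i sum_j P(Z_i | u) * P(Z_i | X_j) * P(X_j | v)."""
    return sum(
        user_interest[i] * mapping[i][j] * video_popularity[j]
        for i in range(len(user_interest))
        for j in range(len(video_popularity))
    )


# Toy setup: K = 2 interest topics, L = 2 popularity topics, 3-word vocabulary.
interest_topics = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
popularity_topics = [[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]]
mapping = topic_mapping(interest_topics, popularity_topics)

user = [0.9, 0.1]  # hypothetical P(Z_i | u)
videos = {  # hypothetical P(X_j | v) per video
    "sports_clip": [0.8, 0.2],
    "talk_show_clip": [0.2, 0.8],
}
ranked = sorted(videos, key=lambda name: score(user, videos[name], mapping), reverse=True)
```

In this toy setup the user's dominant interest topic is closest to the first popularity topic, so the video concentrated on that topic is ranked first.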
Slide13
Data Preparation
Celebrity list
The World's Most Powerful 100 Celebrities List
http://www.forbes.com/wealth/celebrities/list
The 30 Most Generous Celebrities
http://www.forbes.com/sites/andersonantunes/2012/01/11/the-30-most-generous-celebrities/3/
Top 200 Sexiest Actor
http://www.imdb.com/list/Uun6vT7hWeM/
For each celebrity, 200 videos are downloaded from YouTube.
Experiments
Slide14
User and Celebrity Profiling
User: registration info., favorite and uploaded videos → raw tags → stop-word removal → WordNet → noun tags.
Celebrity: Wikipedia entry → WordNet → noun tags.
             Celebrity   User    Total
Size         286         200     486
Tags Number  11424       5833    12073
Experiments
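The user profiling steps (raw tags, stop-word removal, noun filtering) can be sketched as below. This is only an illustration: `STOP_WORDS` is a tiny made-up list, and `NOUN_LEXICON` is a hypothetical stand-in for a WordNet noun lookup, which a real pipeline would query instead.

```python
STOP_WORDS = {"the", "a", "of", "and", "in", "my", "is"}  # tiny illustrative list
# Hypothetical stand-in for a WordNet noun lookup.
NOUN_LEXICON = {"soccer", "football", "interview", "music", "video", "goal"}


def noun_tags(raw_tags):
    """Lowercase raw tags, drop stop words, keep known nouns, deduplicate in order."""
    kept = []
    for tag in raw_tags:
        token = tag.strip().lower()
        if not token or token in STOP_WORDS:
            continue
        if token in NOUN_LEXICON and token not in kept:
            kept.append(token)
    return kept


profile = noun_tags(["The", "Soccer", "goal", "amazing", "soccer", "of", "Interview"])
```

Here "amazing" is dropped because it is not in the noun lexicon, and the repeated "soccer" is kept only once.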
Slide15
Experimental Setting
Experiment data
143 users
106 celebrities
Experiment setup
Each user has some videos related to a specific celebrity. Leave these videos out and learn the topics.
Rank this celebrity's videos for the user.
Evaluation
F-measure
Experiments
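A minimal sketch of the evaluation, assuming the balanced F-measure (F1) computed on the top-k ranked videos against the user's held-out videos. The cutoff `k`, the function name `f_measure`, and the video ids are assumptions for illustration.

```python
def f_measure(ranked, relevant, k):
    """F1 of the top-k ranked items against the held-out relevant set."""
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(top_k)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ranking for one user and the videos that were left out.
ranked = ["v3", "v1", "v7", "v2", "v9"]
held_out = {"v1", "v2", "v4"}
f_at_4 = f_measure(ranked, held_out, k=4)
```

With two of the three held-out videos in the top four, precision is 2/4 and recall is 2/3, giving an F-measure of 4/7.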
Slide16
Topic samples
Experiments
Slide17
Doc-Topics distribution
E.g. Celebrity "Beckham"
Topic  Probability of appearance
7      0.6229086229086229
4      0.1956241956241956
0      0.04967824967824968
8      0.03963963963963964
3      0.022393822393822392
1      0.018532818532818532
6      0.017245817245817245
9      0.014414414414414415
5      0.014414414414414415
2      0.005148005148005148
Experiments
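As a small illustration of how such a doc-topic distribution can be read, the sketch below picks the fewest topics that cover a chosen share of the probability mass. The numbers are the "Beckham" values from the slide; the helper `dominant_topics` and the 0.8 threshold are assumptions.

```python
# Topic id -> probability, copied from the "Beckham" example.
beckham_topics = {
    7: 0.6229086229086229,
    4: 0.1956241956241956,
    0: 0.04967824967824968,
    8: 0.03963963963963964,
    3: 0.022393822393822392,
    1: 0.018532818532818532,
    6: 0.017245817245817245,
    9: 0.014414414414414415,
    5: 0.014414414414414415,
    2: 0.005148005148005148,
}


def dominant_topics(dist, mass=0.8):
    """Return the fewest topics, most probable first, covering at least `mass`."""
    chosen, covered = [], 0.0
    for topic, prob in sorted(dist.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(topic)
        covered += prob
        if covered >= mass:
            break
    return chosen


top = dominant_topics(beckham_topics)
```

Topics 7 and 4 together account for about 82% of the mass, so they are the two topics returned at this threshold.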
Slide18
Topic-terms distribution
E.g.
<topic id="7">
  <word weight="0.018062955825114312" count="478">jay</word>
  <word weight="0.01726939500434569" count="457">messi</word>
  <word weight="0.016891508899217776" count="447">real</word>
  <word weight="0.016551411404602652" count="438">ronaldo</word>
  <word weight="0.01640025696255149" count="434">kanye</word>
  <word weight="0.015644484752295656" count="414">west</word>
  <word weight="0.015606696141782866" count="413">wayne</word>
  <word weight="0.014964289763065412" count="396">lil</word>
  <word weight="0.013414956732040963" count="355">hop</word>
  <word weight="0.013226013679477006" count="350">lionel</word>
  <word weight="0.01311264784793863" count="347">beckham</word>
  <word weight="0.01231908702717001" count="326">beyonce</word>
  <word weight="0.012054566753580472" count="319">cristiano</word>
  <word weight="0.011941200922042096" count="316">soccer</word>
  <word weight="0.011941200922042096" count="316">football</word>
  … …
Experiments
Slide19
Topic-terms distribution
E.g.
<topic id="4">
  <word weight="0.026509629402286503" count="1382">show</word>
  <word weight="0.014904473260185682" count="777">david</word>
  <word weight="0.014444103429755236" count="753">ellen</word>
  <word weight="0.01430982889587969" count="746">tv</word>
  <word weight="0.012027161819995396" count="627">comedy</word>
  <word weight="0.01112560423540244" count="580">jennifer</word>
  <word weight="0.010857055167651347" count="566">interview</word>
  <word weight="0.010550141947364382" count="550">degeneres</word>
  <word weight="0.010166500422005677" count="530">funny</word>
  <word weight="0.009245760761144787" count="482">letterman</word>
  <word weight="0.008689480549374665" count="453">hollywood</word>
  <word weight="0.008497659786695312" count="443">late</word>
  <word weight="0.007979743727461061" count="416">talk</word>
  <word weight="0.007615284278370291" count="397">celebrity</word>
  <word weight="0.006943911608992557" count="362">television</word>
  … …
Experiments
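Topic-terms listings in this shape can be read with Python's standard `xml.etree.ElementTree` module. The sketch below parses a shortened copy of topic 7; the closing `</topic>` tag and the truncation to three words are assumptions, since the slides elide the rest of each topic, and `top_words` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Shortened copy of topic 7; the closing tag is assumed.
SNIPPET = """
<topic id="7">
  <word weight="0.018062955825114312" count="478">jay</word>
  <word weight="0.01726939500434569" count="457">messi</word>
  <word weight="0.016891508899217776" count="447">real</word>
</topic>
"""


def top_words(topic_xml, n=2):
    """Return the topic id and its n highest-weight (word, weight, count) entries."""
    topic = ET.fromstring(topic_xml.strip())
    words = [
        (w.text, float(w.get("weight")), int(w.get("count")))
        for w in topic.iter("word")
    ]
    words.sort(key=lambda item: item[1], reverse=True)
    return topic.get("id"), words[:n]


topic_id, words = top_words(SNIPPET)
```

Sorting by weight rather than relying on document order keeps the helper correct even if a listing is not already sorted.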
Slide20
Different approaches
Experiments
Slide21
Impact of random walk
Experiments
Slide22
Conclusions
We presented a cross-space mining method to exploit the correlation between user preferences and celebrity popularities.
Future work
Instead of returning a ranked list, we will try to visualize the search results as semantically consistent groups.
Investigate the issue of personalized query understanding in more general personalized search applications.
Slide23
Thank you!
Q&A