Slide1
A Probabilistic Approach to Personalized Tag Recommendation
Meiqun Hu, Ee-Peng Lim and Jing Jiang
School of Information Systems
Singapore Management University
Slide2
Social Tagging
Social tagging allows users to annotate resources with tags.
  organize: tags are keywords, serving as (personalized) index terms that group relevant resources
  store: online storage gives mobility and convenience of access
  share: published bookmarks can be viewed by other users
  explore: leverage collective wisdom to find interesting resources
Image credit @ logorunner.com
Slide3
Personalized Tag Recommendation
Personalized tag recommendation aims to recommend tags to the query user for annotating the query resource.
Recommendation eases the tagging process.
  avoids misspelling, provides consistency
[Figure labels: Alice, ?]
Slide4
Why Personalize Recommendations?
Tag recommendation should be personalized.
  users exhibit individualized choices of tag terms, e.g., language preference
  tags form a personalized index for personal consumption and consistency
Slide5
Problem Formulation and A Basic Method
Problem formulation: p(t | r_q, u_q)
A basic method: freq-r, which recommends the most frequent tags of the query resource
  assumes that the more people have used a tag, the more likely it is to be used again. Ref. [Golder & Huberman 2006]
  current state of the art in many social tagging sites, e.g., [examples not preserved]
  fails to personalize the recommendations for the query user
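The freq-r baseline can be illustrated with a short Python sketch. It assumes the tagging data is available as (user, resource, tag) triples; that layout, the function name freq_r, and the toy data are illustrative assumptions, not taken from the slides.

```python
from collections import Counter

def freq_r(assignments, query_resource, k=5):
    """Return the k tags most often assigned to query_resource.

    assignments: iterable of (user, resource, tag) triples (assumed layout).
    """
    counts = Counter(tag for _, res, tag in assignments if res == query_resource)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]

# Toy data (hypothetical): the query user plays no role in the ranking.
assignments = [
    ("bob", "r1", "photo"), ("carol", "r1", "photo"), ("dave", "r1", "image"),
    ("alice", "r2", "foto"),
]
print(freq_r(assignments, "r1"))
```

Because the query user never enters the computation, every user receives the same list for a given resource, which is the lack of personalization noted on the slide.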
Slide6
Three Scenarios
Scenario 1: ‘foto’ is an infrequent tag for the resource.
Scenario 2: ‘foto’ has not been used for the resource, but has been used by the user for annotating other resources in the past.
Scenario 3: ‘foto’ has not been used for the resource, neither has it been used by the query user, but has been used by other users for annotating other resources.
Slide7
Collaborative Filtering Method
A method based on collaborative filtering (knn):
  select the k nearest neighbors of the query user, and recommend the tags these neighbors used to annotate the resource
  classic collaborative filtering, without ratings
  Ref. [Marinho & Schmidt-Thieme 2008]
  addresses scenario 1, but fails scenarios 2 and 3
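A minimal Python sketch of the knn idea follows. The slide does not say how neighbors are chosen, so cosine similarity over per-user tag counts is assumed here; the function names and toy data are likewise hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[x] * b[x] for x in a if x in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_recommend(assignments, query_user, query_resource, k=2, top=5):
    # Assumed user profile: how often each user has used each tag.
    profiles = defaultdict(Counter)
    for user, _, tag in assignments:
        profiles[user][tag] += 1
    q = profiles.get(query_user, Counter())
    sims = {u: cosine(q, p) for u, p in profiles.items() if u != query_user}
    # Keep the k most similar users, ignoring users with zero similarity.
    ordered = sorted(sims, key=lambda u: (-sims[u], u))
    neighbors = [u for u in ordered if sims[u] > 0][:k]
    # Score only tags that neighbors assigned to the query resource.
    scores = Counter()
    for user, res, tag in assignments:
        if res == query_resource and user in neighbors:
            scores[tag] += sims[user]
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top]]

assignments = [
    ("alice", "r1", "foto"), ("alice", "r2", "web"),
    ("bob", "r1", "foto"), ("bob", "r3", "foto"), ("bob", "r3", "image"),
    ("carol", "r3", "java"),
]
print(knn_recommend(assignments, "alice", "r3"))
```

The candidate set is limited to tags already attached to the query resource, which is why this approach cannot reach tags that appear only in scenarios 2 and 3.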
Slide8
Personomy Translation Method
Translate the resource tags into the user’s personal tags (trans-u):
  learn p(t=‘foto’ | u=Alice, t_r=‘photo’)
  Ref. [Wetzker et al. 2009]
  addresses scenario 2, but fails scenario 3, since Alice has never used ‘foto’
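One way to picture learning p(t | u, t_r) is simple co-occurrence counting, sketched below in Python. The estimator is an assumption made for illustration: for each of the user's posts, every tag the user assigned is paired with every tag other users assigned to the same resource. It is not claimed to be the estimator of [Wetzker et al. 2009], and the function name and toy data are hypothetical.

```python
from collections import Counter, defaultdict

def learn_translations(assignments):
    """Return {(user, t_r): {t: p(t | user, t_r)}} from (user, resource, tag) triples."""
    tags_by_post = defaultdict(set)       # (user, resource) -> tags the user assigned
    users_by_resource = defaultdict(set)  # resource -> users who tagged it
    for user, res, tag in assignments:
        tags_by_post[(user, res)].add(tag)
        users_by_resource[res].add(user)

    counts = defaultdict(Counter)
    for (user, res), own_tags in list(tags_by_post.items()):
        # Resource tags: what the other users assigned to the same resource.
        resource_tags = set()
        for other in users_by_resource[res]:
            if other != user:
                resource_tags |= tags_by_post[(other, res)]
        for t_r in resource_tags:
            for t in own_tags:
                counts[(user, t_r)][t] += 1

    # Normalize the counts into conditional probabilities.
    return {
        key: {t: c / sum(ctr.values()) for t, c in ctr.items()}
        for key, ctr in counts.items()
    }

assignments = [
    ("alice", "r1", "foto"), ("bob", "r1", "photo"),
    ("carol", "r1", "photo"), ("carol", "r1", "image"),
    ("alice", "r2", "foto"), ("alice", "r2", "bild"), ("bob", "r2", "photo"),
]
translations = learn_translations(assignments)
print(translations[("alice", "photo")])
```

In this toy data Alice's translation of ‘photo’ puts most of its mass on ‘foto’, so a resource tagged ‘photo’ by others would yield ‘foto’ for Alice even if nobody used ‘foto’ on that resource.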
Slide9
To Address Scenario 3
[Figure: Alice and Bob; label "borrow translation"]
Slide10
A Probabilistic Framework
  Personomy Translation
  A Framework
  Measuring User Similarity
Slide11
Proposed Framework
Learn p(t=‘foto’ | u=Bob, t_r=‘photo’) and sim(u=Bob, u_q=Alice)
[Figure: Alice and Bob; label "borrow translation"]
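The slide names the two ingredients, per-user translations and user similarity, but the combining formula was not preserved. The Python sketch below assumes one plausible form, score(t) = Σ_u sim(u, u_q) Σ_{t_r} p(t | u, t_r) p(t_r | r_q); this formula, the function name, and the toy numbers are assumptions, not the authors' stated model.

```python
from collections import defaultdict

def recommend(translations, sims, resource_tag_probs, top=5):
    """Rank tags by similarity-weighted borrowed translations.

    translations:       {(user, t_r): {t: p(t | user, t_r)}}
    sims:               {user: sim(user, query_user)}, including the query user
    resource_tag_probs: {t_r: p(t_r | query_resource)}
    """
    scores = defaultdict(float)
    for (user, t_r), dist in translations.items():
        weight = sims.get(user, 0.0) * resource_tag_probs.get(t_r, 0.0)
        if weight == 0.0:
            continue  # user is not a neighbor, or t_r is not on the resource
        for t, p in dist.items():
            scores[t] += weight * p
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top]]

# Toy setting (hypothetical): Alice has never used 'foto', but Bob has.
translations = {
    ("alice", "photo"): {"photo": 1.0},
    ("bob", "photo"): {"foto": 0.8, "photo": 0.2},
}
sims = {"alice": 1.0, "bob": 0.5}
resource_tag_probs = {"photo": 1.0}
print(recommend(translations, sims, resource_tag_probs))
```

Here ‘foto’ enters Alice's recommendations only through Bob's translation, which is the borrowing step the figure depicts.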
Slide12
Personomy Translation
Learn p(t=‘foto’ | u=Alice, t_r=‘photo’)
[Wetzker et al. 2009]
Slide13
Measuring Similarity between Users
sim(u, u_q): assumes that users are similar if they perform similar translations
User profile
[Figure labels: photo, web, foto, image, netz, internet]
Slide14
Distributional Divergence between Users
Per-tag similarities sim^(‘photo’)(u, u_q), sim^(‘web’)(u, u_q), … are combined by a sum over resource tags, Σ_{t_r}, into sim(u, u_q).
Ref. [Lee 1997]
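As a rough Python illustration, the per-tag similarity can be derived from a divergence between two users' translation distributions for the same resource tag. The slide does not say which divergence is used, so Jensen-Shannon divergence (base 2, bounded in [0, 1]) is assumed here, and the per-tag values are averaged rather than summed to keep the result in [0, 1]. Both choices, and all names below, are assumptions.

```python
from math import log2

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse distributions."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        # KL(a || m); terms with zero probability contribute nothing.
        return sum(v * log2(v / m[k]) for k, v in a.items() if v > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)

def user_similarity(trans_u, trans_q):
    """trans_*: {t_r: {t: prob}}. Average (1 - JS) over shared resource tags."""
    shared = set(trans_u) & set(trans_q)
    if not shared:
        return 0.0
    total = sum(1.0 - js_divergence(trans_u[t], trans_q[t]) for t in shared)
    return total / len(shared)

# Toy profiles (hypothetical): Bob and Alice agree on 'photo' but not on 'web'.
bob = {"photo": {"foto": 1.0}, "web": {"netz": 1.0}}
alice = {"photo": {"foto": 1.0}, "web": {"internet": 1.0}}
print(user_similarity(bob, alice))
```

Identical translations for ‘photo’ contribute a per-tag similarity of 1, while the disjoint translations for ‘web’ contribute 0.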
Slide15
Remark on the 3 Scenarios
This framework is able to address all three scenarios:
  addresses scenario 1 by allowing self-translation, e.g., p(‘photo’ | u, ‘photo’)
  addresses scenario 2 by allowing the query user to be most similar to themselves, e.g., sim(u_q, u_q)
  addresses scenario 3 by enabling borrowed translations
Slide16
Experiments
  Data Collection
  Experimental Setup
  Recommendation Performance
Slide17
Dataset from BibSonomy
                                 train            validation        test
time frame                       start ~ DEC 08   JAN 09 ~ JUL 09   JUL 09 ~ DEC 09
number of resources              22,389           667               258
number of users                  1,185            136               57
number of tags                   13,276           862               525
number of assignments            253,615          2,604             1,262
average posts per user           53.695           5.699             4.895
average tag tokens per user      3.955            3.360             4.523
average distinct tags per user   61.833           13.191            14.667
Note: users in the test set must have appeared in the validation set.
Slide18
Experimental Setup
Methods to compare
  trans-n1, trans-n2
  trans-u1, trans-u2 [Wetzker et al. 2009], [Wetzker et al. 2010]
  knn-ur, knn-ut
  interpolating with freq-r
Evaluation metric
  pr-curve at top 5
  macro-average for users
Parameter tuning
  macro-average f1@5
  global vs. individual settings
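The macro-averaged f1@5 used for tuning can be sketched in Python as below. Two details are assumed because the slide does not spell them out: precision is divided by the number of tags actually recommended (at most k), and F1 is computed per post, averaged per user, then averaged over users. Function names and data are hypothetical.

```python
def prf_at_k(recommended, relevant, k=5):
    """Precision, recall and F1 of the top-k recommendations for one post."""
    top = recommended[:k]
    hits = len(set(top) & set(relevant))
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def macro_f1_at_k(posts, k=5):
    """posts: list of (user, recommended_tags, true_tags).

    Averages F1 within each user first, then across users, so that
    heavy taggers do not dominate the score.
    """
    per_user = {}
    for user, recommended, true_tags in posts:
        per_user.setdefault(user, []).append(prf_at_k(recommended, true_tags, k)[2])
    user_means = [sum(v) / len(v) for v in per_user.values()]
    return sum(user_means) / len(user_means) if user_means else 0.0

posts = [
    ("alice", ["foto", "web"], ["foto"]),  # one hit out of two recommendations
    ("bob", ["a", "b"], ["c"]),            # no hits
]
print(macro_f1_at_k(posts))
```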
Slide19
Recommendation Performance
Global Setting
Slide20
Recommendation Performance
Individual Setting
Slide21
Recommendation Case Study
Columns: u, r, tags assigned, top 5 recommendations (trans-u1, trans-n1, freq-r)
Legend: scenario 3 tags

u = 920, r = a45…57f
  tags assigned: 2008, bookmarking, folksonomy, social, spam, folksonomies, tagorapub, web20, 20, integpub, systems, tagger, web
  recommended tags, in slide order: diplomathesis, captcha, folksonomy, background, closelyrelated, folksonomy, folksonomy, tagging, social, web20, web, spam, social, myown, mining, folksonomy

u = 1119, r = d16…b50
  tags assigned: it, news, technology, blog, feed, technologie
  recommended tags, in slide order: kultur, online, radio, kunst, cd, news, web20, blog, software, technology, newsticker, news, pc, langde, heise

u = 3217, r = 467…655
  tags assigned: annotation, ontology, knowledge, semantic
  recommended tags, in slide order: sql, erd, eclipse, tagging, folksonomy, ontology, web20, semantic, tools, survey, smilegroup, semantics, ontology
Slide22
Conclusion
We propose a probabilistic framework for the personalized tag recommendation task, which incorporates personomy translation and borrows translations from neighbors.
We use distributional divergence to measure similarity between users. Users are similar if they exhibit similar translation behavior.
We find that the proposed methods outperform both translation by the query user only and classic collaborative filtering.
Slide23
Future Work
Performance gain in successfully recommending scenario 3 tags.
  e.g., compared with freq-r
  e.g., resources that are inadequately tagged
Recommendation strategies from the resources’ perspective.
Thank you
Slide24
References
[Golder & Huberman 2006] Scott A. Golder and Bernardo A. Huberman. Usage Patterns of Collaborative Tagging Systems. Journal of Information Science, 32(2):198-208, 2006.
[Marinho & Schmidt-Thieme 2008] Leandro B. Marinho and Lars Schmidt-Thieme. Collaborative Tag Recommendations, Chapter 63, pp. 533-540. Springer Berlin Heidelberg, 2008.
[Wetzker et al. 2009] Robert Wetzker, Alan Said and Carsten Zimmermann. Understanding the User: Personomy Translation for Tag Recommendation. In ECML PKDD Discovery Challenge, pp. 275-285, 2009.
[Lee 1997] Lillian Lee. Similarity-Based Approaches to Natural Language Processing. Ph.D. Thesis, Harvard University, Cambridge, MA, 1997. Chapter Four.