Social/Collaborative Filtering - PowerPoint Presentation

Uploaded by berey on 2023-06-23





Presentation Transcript

1. Social/Collaborative Filtering

2. Outline
- Recap
- SVD vs PCA
- Collaborative filtering, aka social recommendation
- k-NN CF methods
- classification
- CF via MF
- MF vs SGD vs …

3. Dimensionality Reduction and Principal Components Analysis: Recap

4. More cartoons
(figure: first PC, the most variable direction; second PC)

5. PCA as matrices: the optimization problem

6. PCA as matrices
(figure: a 1000 x 10,000 matrix V, with V[i,j] = pixel j in image i (1000 images, 10,000 pixels each), approximated by the product of a 1000 x 2 matrix of mixing weights and a 2 x 10,000 matrix whose rows are PC1 and PC2)

7. Poll
True or false: the weights for an example should add up to 1.
True or false: the weights for a prototype should add up to 1.

8. Implementing PCA
Also: the principal components are orthogonal to each other!

9. PCA for Modeling Text (SVD = Singular Value Decomposition)

10. A Scalability Problem with PCA
- The covariance matrix is large in high dimensions: with d features it is d x d.
- SVD is a closely related method that can be implemented more efficiently in high dimensions.
- Don't explicitly compute the covariance matrix. Instead write the doc-term matrix X as X = U S V^T.
- S is k x k with k << d; S is diagonal, and S[i,i] = sqrt(λ_i), the square root of the i-th eigenvalue.
- Columns of V ~= principal components.
- Rows of U S ~= embeddings for the examples.
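A minimal numpy sketch of the idea above, PCA computed via SVD without forming the d x d covariance matrix; the function name and rank argument k are illustrative:

```python
import numpy as np

def pca_via_svd(X, k):
    """Top-k principal components and example embeddings of X.

    SVD of the mean-centered data gives Xc = U S Vt: rows of Vt are the
    principal components, and U*S gives the coordinates of the examples,
    with no explicit covariance matrix ever built.
    """
    Xc = X - X.mean(axis=0)                        # mean-center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                            # top-k principal directions
    embedding = U[:, :k] * S[:k]                   # rows of U S, per example
    return components, embedding
```

With k equal to the number of features this is an exact factorization of the centered data, which is a handy sanity check.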

11. Recovering latent factors in a matrix
(figure: the n x m doc-term matrix, with Docterm[i,j] = TF-IDF score of term j in doc i, factored as U S V^T, where S is diagonal with entries s1, s2, …)

12. SVD example

13.
- The Neatest Little Guide to Stock Market Investing
- Investing For Dummies, 4th Edition
- The Little Book of Common Sense Investing: The Only Way to Guarantee Your Fair Share of Stock Market Returns
- The Little Book of Value Investing
- Value Investing: From Graham to Buffett and Beyond
- Rich Dad's Guide to Investing: What the Rich Invest in, That the Poor and the Middle Class Do Not!
- Investing in Real Estate, 5th Edition
- Stock Investing For Dummies
- Rich Dad's Advisors: The ABC's of Real Estate Investing: The Secrets of Finding Hidden Profits Most Investors Miss
https://technowiki.wordpress.com/2011/08/27/latent-semantic-analysis-lsa-tutorial/

14. https://technowiki.wordpress.com/2011/08/27/latent-semantic-analysis-lsa-tutorial/

15. (figure: the factorization written out as a matrix equation)

16. Investing for real estate
Rich Dad's Advisor's: The ABCs of Real Estate Investment …

17. The little book of common sense investing: …
Neatest Little Guide to Stock Market Investing

18. My recap: SVD vs PCA
- Very closely related methods.
- As described here: the SVD decomposition doesn't require a square matrix, while the PCA decomposition is always applied to C_X, which is square and mean-centered.
- You can implement PCA using SVD as a substep.
- People sometimes use the terms interchangeably.

19. Outline
- What is CF?
- Nearest-neighbor methods for CF
  - One old-school paper: BellCore's movie recommender
  - Some general discussion
- CF reduced to classification
- CF reduced to matrix factoring
- Other uses of matrix factoring in ML

20. What is Collaborative Filtering?
AKA social filtering, recommendation systems, …

21. What is collaborative filtering?

22. What is collaborative filtering?

23. What is collaborative filtering?

24. What is collaborative filtering?

25.

26. What is collaborative filtering?

27. Other examples of social filtering….

28. Other examples of social filtering….

29. Other examples of social filtering….

30. Other examples of social filtering….

31. Everyday Examples of Collaborative Filtering...
- Bestseller lists
- Top 40 music lists
- The "recent returns" shelf at the library
- Unmarked but well-used paths through the woods
- The printer room at work
- "Read any good books lately?"
- ...
Common insight: personal tastes are correlated. If Alice and Bob both like X, and Alice likes Y, then Bob is more likely to like Y, especially (perhaps) if Bob knows Alice.

32. Social/Collaborative Filtering: Nearest-Neighbor Methods

33. BellCore's MovieRecommender
Recommending And Evaluating Choices In A Virtual Community Of Use. Will Hill, Larry Stead, Mark Rosenstein and George Furnas, Bellcore; CHI 1995.
By virtual community we mean "a group of people who share characteristics and interact in essence or effect only". In other words, people in a Virtual Community influence each other as though they interacted but they do not interact. Thus we ask: "Is it possible to arrange for people to share some of the personalized informational benefits of community involvement without the associated communications costs?"

34. MovieRecommender Goals
Recommendations should:
- simultaneously ease and encourage, rather than replace, social processes... should make it easy to participate while leaving in hooks for people to pursue more personal relationships if they wish.
- be for sets of people, not just individuals... multi-person recommending is often important, for example, when two or more people want to choose a video to watch together.
- be from people, not a black box machine or so-called "agent".
- tell how much confidence to place in them; in other words, they should include indications of how accurate they are.

35. BellCore's MovieRecommender
- Participants sent email to videos@bellcore.com
- System replied with a list of 500 movies to rate on a 1-10 scale (250 random, 250 popular); only a subset needed to be rated
- New participant P sends in rated movies via email
- System compares ratings for P to ratings of (a random sample of) previous users
- Most similar users are used to predict scores for unrated movies (more later)
- System returns recommendations in an email message.

36. Suggested Videos for: John A. Jamus.
Your must-see list with predicted ratings:
7.0 "Alien (1979)"
6.5 "Blade Runner"
6.2 "Close Encounters Of The Third Kind (1977)"
Your video categories with average ratings:
6.7 "Action/Adventure"
6.5 "Science Fiction/Fantasy"
6.3 "Children/Family"
6.0 "Mystery/Suspense"
5.9 "Comedy"
5.8 "Drama"

37. The viewing patterns of 243 viewers were consulted. Patterns of 7 viewers were found to be most similar.
Correlation with target viewer:
0.59 viewer-130 (unlisted@merl.com)
0.55 bullert, jane r (bullert@cc.bellcore.com)
0.51 jan_arst (jan_arst@khdld.decnet.philips.nl)
0.46 Ken Cross (moose@denali.EE.CORNELL.EDU)
0.42 rskt (rskt@cc.bellcore.com)
0.41 kkgg (kkgg@Athena.MIT.EDU)
0.41 bnn (bnn@cc.bellcore.com)
By category, their joint ratings recommend:
Action/Adventure: "Excalibur" 8.0, 4 viewers; "Apocalypse Now" 7.2, 4 viewers; "Platoon" 8.3, 3 viewers
Science Fiction/Fantasy: "Total Recall" 7.2, 5 viewers
Children/Family: "Wizard Of Oz, The" 8.5, 4 viewers; "Mary Poppins" 7.7, 3 viewers
Mystery/Suspense: "Silence Of The Lambs, The" 9.3, 3 viewers
Comedy: "National Lampoon's Animal House" 7.5, 4 viewers; "Driving Miss Daisy" 7.5, 4 viewers; "Hannah and Her Sisters" 8.0, 3 viewers
Drama: "It's A Wonderful Life" 8.0, 5 viewers; "Dead Poets Society" 7.0, 5 viewers; "Rain Man" 7.5, 4 viewers
Correlation of predicted ratings with your actual ratings is: 0.64. This number measures ability to evaluate movies accurately for you. 0.15 means low ability. 0.85 means very good ability. 0.50 means fair ability.

38. BellCore's MovieRecommender
Evaluation:
- Withhold 10% of the ratings of each user to use as a test set
- Measure correlation between predicted ratings and actual ratings for test-set movie/user pairs

39.

40. BellCore's MovieRecommender
- Participants sent email to videos@bellcore.com
- System replied with a list of 500 movies to rate
- New participant P sends in rated movies via email
- System compares ratings for P to ratings of (a random sample of) previous users
- Most similar users are used to predict scores for unrated movies (Empirical Analysis of Predictive Algorithms for Collaborative Filtering, Breese, Heckerman, Kadie, UAI98)
- System returns recommendations in an email message.

41. recap: k-nearest neighbor learning
Given a test example x:
- Find the k training-set examples (x1,y1),…,(xk,yk) that are closest to x.
- Predict the most frequent label in that set.

42. Breaking it down:
To train: save the data. Very fast!
To test, for each test example x:
- Find the k training-set examples (x1,y1),…,(xk,yk) that are closest to x.
- Predict the most frequent label in that set.
Prediction is relatively slow... but it doesn't depend on the number of classes, only the number of neighbors.

43. recap: k-nearest neighbor learning
Given a test example x:
- Find the k training-set examples (x1,y1),…,(xk,yk) that are closest to x.
- Predict the most frequent label in that set.
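The recap above fits in a few lines of Python; this is a minimal sketch with illustrative names, not a production classifier:

```python
from collections import Counter
import numpy as np

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote over the k training examples closest to x."""
    dists = np.linalg.norm(train_X - x, axis=1)  # distance to every training example
    nearest = np.argsort(dists)[:k]              # indices of the k closest
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]            # most frequent label among them
```

"Training" really is just keeping `train_X` and `train_y` around; all the work happens at prediction time.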

44. Algorithms for Collaborative Filtering 1: Memory-Based Algorithms (Breese et al, UAI98)
- v_{i,j} = vote of user i on item j
- I_i = items for which user i has voted
- Mean vote for i: v̄_i = (1 / |I_i|) Σ_{j ∈ I_i} v_{i,j}
- Predicted vote for "active user" a on item j is a weighted sum over the n most similar users: p_{a,j} = v̄_a + κ Σ_{i=1}^{n} w(a,i) (v_{i,j} - v̄_i), where the weight w(a,i) is based on the similarity between users a and i, and κ is a normalizer (e.g., 1 / Σ_i |w(a,i)|).

45. Algorithms for Collaborative Filtering 1: Memory-Based Algorithms (Breese et al, UAI98)
Choices for the similarity weight w(a,i):
- K-nearest neighbor
- Pearson correlation coefficient (Resnick '94, GroupLens): w(a,i) = Σ_j (v_{a,j} - v̄_a)(v_{i,j} - v̄_i) / sqrt( Σ_j (v_{a,j} - v̄_a)² · Σ_j (v_{i,j} - v̄_i)² ), where j ranges over items rated by both a and i
- Cosine distance, etc, …
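A minimal Python sketch of the memory-based prediction rule above, using Pearson correlation as the weight; the dict-of-dicts rating format and the function names are assumptions for illustration:

```python
import math

def mean_vote(votes):
    """Mean of one user's votes: v-bar over the items they rated."""
    return sum(votes.values()) / len(votes)

def pearson(a_votes, i_votes):
    """Pearson weight w(a,i), summing over the items both users rated."""
    common = set(a_votes) & set(i_votes)
    if not common:
        return 0.0
    va, vi = mean_vote(a_votes), mean_vote(i_votes)
    num = sum((a_votes[j] - va) * (i_votes[j] - vi) for j in common)
    da = math.sqrt(sum((a_votes[j] - va) ** 2 for j in common))
    di = math.sqrt(sum((i_votes[j] - vi) ** 2 for j in common))
    return num / (da * di) if da and di else 0.0

def predict(ratings, a, j):
    """p_{a,j} = v-bar_a + kappa * sum_i w(a,i) (v_{i,j} - v-bar_i)."""
    va = mean_vote(ratings[a])
    num = den = 0.0
    for i, votes in ratings.items():
        if i == a or j not in votes:
            continue
        w = pearson(ratings[a], votes)
        num += w * (votes[j] - mean_vote(votes))
        den += abs(w)                  # kappa = 1 / sum of |weights|
    return va + num / den if den else va
```

Note that an anti-correlated user (w < 0) still contributes usefully: their dislike of an item is evidence the active user will like it.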

46. Social/Collaborative Filtering: Traditional Classification

47. What are other ways to formulate the collaborative filtering problem?
Treat it like ordinary classification or regression.

48. Collaborative + Content Filtering (Basu et al, AAAI98; Condliff et al, AI-STATS99)

                     Airplane     Matrix        Room with a View  ...  Hidalgo
                     comedy, $2M  action, $70M  romance, $25M     ...  action, $30M
Joe    27, M, $70k   9            7             2                      7
Carol  53, F, $20k   8            9                               ...
Kumar  25, M, $22k   9            3                                    6
Ua     48, M, $81k   4            7             ?                 ?    ?

49. Collaborative + Content Filtering As Classification (Basu, Hirsh, Cohen, AAAI98)

                     Airplane  Matrix  Room with a View  ...  Hidalgo
                     comedy    action  romance           ...  action
Joe    27, M, 70k    1         1       0                      1
Carol  53, F, 20k    1         1       0                 ...
Kumar  25, M, 22k    1         0       0                      1
Ua     48, M, 81k    0         1       ?                 ?    ?

- Classification task: map a (user, movie) pair into {likes, dislikes}
- Training data: known likes/dislikes
- Test data: active users
- Features: any properties of the user/movie pair

50. Collaborative + Content Filtering As Classification (Basu et al, AAAI98)
Features: any properties of the user/movie pair (U,M). Examples: genre(U,M), age(U,M), income(U,M), ...
genre(Carol, Matrix) = action
income(Kumar, Hidalgo) = 22k/year

51. Collaborative + Content Filtering As Classification (Basu et al, AAAI98)
Features: any properties of the user/movie pair (U,M). Example: usersWhoLikedMovie(U,M):
usersWhoLikedMovie(Carol, Hidalgo) = {Joe, ..., Kumar}
usersWhoLikedMovie(Ua, Matrix) = {Joe, ...}

52. Collaborative + Content Filtering As Classification (Basu et al, AAAI98)
Features: any properties of the user/movie pair (U,M). Example: moviesLikedByUser(M,U):
moviesLikedByUser(*, Joe) = {Airplane, Matrix, ..., Hidalgo}
actionMoviesLikedByUser(*, Joe) = {Matrix, Hidalgo}

53. Collaborative + Content Filtering As Classification (Basu et al, AAAI98)
Features: any properties of the user/movie pair (U,M). For one test pair:
genre={romance}, age=48, sex=male, income=81k, usersWhoLikedMovie={Carol}, moviesLikedByUser={Matrix, Airplane}, ...

54. Collaborative + Content Filtering As Classification (Basu et al, AAAI98)
Feature vectors for two test pairs:
genre={romance}, age=48, sex=male, income=81k, usersWhoLikedMovie={Carol}, moviesLikedByUser={Matrix, Airplane}, ...
genre={action}, age=48, sex=male, income=81k, usersWhoLikedMovie={Joe, Kumar}, moviesLikedByUser={Matrix, Airplane}, ...

55. Collaborative + Content Filtering As Classification (Basu et al, AAAI98)
Classification learning algorithm: rule learning (RIPPER)
- If NakedGun33/13 ∈ moviesLikedByUser and Joe ∈ usersWhoLikedMovie and genre=comedy, then predict likes(U,M)
- If age>12 and age<17 and HolyGrail ∈ moviesLikedByUser and director=MelBrooks, then predict likes(U,M)
- If Ishtar ∈ moviesLikedByUser, then predict likes(U,M)
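The feature construction in slides 49-55 can be sketched as follows; the tiny likes/genre tables are illustrative stand-ins for the paper's data, and the feature names mirror the slides:

```python
# user -> set of movies that user liked (illustrative data, not from the paper)
likes = {
    "Joe":   {"Airplane", "Matrix", "Hidalgo"},
    "Carol": {"Airplane", "Matrix"},
    "Kumar": {"Airplane", "Hidalgo"},
}
genre = {"Airplane": "comedy", "Matrix": "action", "Hidalgo": "action"}

def features(user, movie):
    """Feature dict for one (user, movie) pair: content + collaborative sets."""
    users_who_liked = {u for u, ms in likes.items() if movie in ms and u != user}
    movies_liked = likes.get(user, set()) - {movie}   # exclude the pair's own movie
    return {
        "genre": genre[movie],
        "usersWhoLikedMovie": users_who_liked,
        "moviesLikedByUser": movies_liked,
        "actionMoviesLikedByUser": {m for m in movies_liked if genre[m] == "action"},
    }
```

A rule learner like RIPPER can then test set-valued conditions such as "Matrix ∈ moviesLikedByUser" directly on these features.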

56. Basu et al 98 - results
Evaluation: predict liked(U,M) = "M is in the top quartile of U's ranking" from features; evaluate recall and precision.
Features:
- Collaborative: UsersWhoLikedMovie, UsersWhoDislikedMovie, MoviesLikedByUser
- Content: Actors, Directors, Genre, MPAA rating, ...
- Hybrid: ComediesLikedByUser, DramasLikedByUser, UsersWhoLikedFewDramas, ...
Results, at the same level of recall (about 33%):
- RIPPER with collaborative features only is worse than the original MovieRecommender (by about 5 points precision: 73 vs 78)
- RIPPER with hybrid features is better than MovieRecommender (by about 5 points precision)

57. Matrix Factorization for Collaborative Filtering

58. Recovering latent factors in a matrix
(figure: V, an n x m matrix of n users by m movies, with V[i,j] = user i's rating of movie j)

59. Recovering latent factors in a matrix
(figure: the n x m ratings matrix V, with V[i,j] = user i's rating of movie j, approximated by the product of an n x 2 matrix of user factors and a 2 x m matrix of movie factors)

60. talk pilfered from  …..KDD 2011

61.

62. Recovering latent factors in a matrix
(figure: V ≈ W H, where W is n x r (user factors), H is r x m (movie factors), and V[i,j] = user i's rating of movie j)

63.

64.

65. Recovering latent factors in a matrix
(figure: V ≈ W H, where W is n x r (user factors), H is r x m (movie factors), and V[i,j] = user i's rating of movie j)

66. … is like Linear Regression …
(figure: Y ≈ W H, where W is the n x r data matrix (n instances, e.g. 150, with r features, e.g. 4: pl, pw, sl, sw), H is an r x 1 weight vector (m = 1 regressor), and Y[i,1] = instance i's prediction)

67. … for many outputs at once …
(figure: Y ≈ W H, where H is now an r x m weight matrix for m regression tasks, and Y[i,j] = instance i's prediction for regression task j)
… where we also have to find the dataset!

68. Matrix factorization as SGD
For a randomly chosen observed entry (i,j), with squared-error loss L_{i,j} = (V[i,j] - W[i,:] H[:,j])², take a gradient step with step size η:
W[i,:] ← W[i,:] + η (V[i,j] - W[i,:] H[:,j]) H[:,j]^T
H[:,j] ← H[:,j] + η (V[i,j] - W[i,:] H[:,j]) W[i,:]^T

69. Matrix factorization as SGD - why does this work?
(figure: the SGD update, annotated with the step size)

70. Matrix factorization as SGD - why does this work? Here’s the key claim:

71. Checking the claim
Think of SGD for logistic regression:
- the LR loss compares y and ŷ = dot(w, x)
- MF is similar, but now we update both w (the user weights) and x (the movie weights)
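A minimal sketch of matrix factorization trained by SGD in the spirit of the slides: for each observed rating, compare it with the dot product of the user and movie factors and nudge both along the gradient. The hyperparameters (rank, step size, epochs) and function name are illustrative:

```python
import numpy as np

def mf_sgd(ratings, n, m, rank=2, lr=0.05, epochs=1000, seed=0):
    """ratings: list of observed (i, j, v) entries of an n x m matrix.

    Returns W (n x rank) and H (rank x m) with W @ H approximating V on
    the observed entries. A real implementation would also shuffle the
    entries and add regularization.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(n, rank))
    H = rng.normal(scale=0.1, size=(rank, m))
    for _ in range(epochs):
        for i, j, v in ratings:
            w_i = W[i].copy()                 # old value, so both updates use it
            err = v - w_i @ H[:, j]           # prediction error on this entry
            W[i] += lr * err * H[:, j]        # gradient step on user factors
            H[:, j] += lr * err * w_i         # gradient step on movie factors
    return W, H
```

Exactly as slide 71 says: each step looks like a linear-regression update, except that the "weights" and the "features" are both being learned.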

72. What loss functions are possible?
"generalized" KL-divergence

73. What loss functions are possible?

74. What loss functions are possible?

75. ALS = alternating least squares
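A minimal sketch of the alternation behind ALS, on a fully observed matrix: hold H fixed and solve a least-squares problem for W, then hold W fixed and solve for H. (Recommender-style ALS solves per-row least-squares over only the observed entries; this dense version just shows the alternation.)

```python
import numpy as np

def als(V, rank=2, iters=20, seed=0):
    """Alternating least squares: V (n x m) ~ W (n x rank) @ H (rank x m)."""
    n, m = V.shape
    rng = np.random.default_rng(seed)
    H = rng.normal(size=(rank, m))
    for _ in range(iters):
        # argmin_W ||V - W H||_F: solve H^T W^T = V^T for W^T
        W = np.linalg.lstsq(H.T, V.T, rcond=None)[0].T
        # argmin_H ||V - W H||_F: solve W H = V for H
        H = np.linalg.lstsq(W, V, rcond=None)[0]
    return W, H
```

Each half-step is an ordinary least-squares problem with a closed-form solution, which is why ALS needs no step size, unlike SGD.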

76. Matrix Multiplications in Machine Learning: MF vs PCA vs SGD vs …

77. Recovering latent factors in a matrix
(figure: V ≈ W H, where W is n x r (user factors), H is r x m (movie factors), and V[i,j] = user i's rating of movie j)

78. … vs k-means (1)
(figure: X ≈ Z M, where X is the original data set of n examples, Z is an n x r 0/1 indicator matrix assigning each example to one of r clusters, and M holds the r cluster means)
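The k-means-as-factorization view above can be sketched directly; the initialization and iteration count are illustrative:

```python
import numpy as np

def kmeans_factor(X, r, iters=10, seed=0):
    """k-means written as X ~ Z @ M: Z is a 0/1 indicator matrix over r
    clusters, M holds the cluster means."""
    rng = np.random.default_rng(seed)
    M = X[rng.choice(len(X), size=r, replace=False)]   # init means from the data
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - M[None, :, :], axis=2)
        assign = d.argmin(axis=1)                      # nearest mean per example
        Z = np.eye(r)[assign]                          # one-hot indicator matrix
        M = (Z.T @ X) / np.maximum(Z.sum(axis=0)[:, None], 1)  # recompute means
    return Z, M
```

Here Z @ M replaces each example by its cluster mean, so k-means is a matrix factorization where one factor is constrained to be an indicator matrix.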

79. Matrix multiplication - 1 (Gram matrix)
(figure: V = X X^T, where X is n x r (n instances, e.g. 150, with r features, e.g. 2) and V[i,j] = inner product <x_i, x_j> of instances i and j: the Gram matrix)

80. Matrix multiplication - 2 (Covariance matrix)
(figure: C_X = X^T X, an r x r matrix (r features, e.g. 2), assuming mean(X) = 0)

81. Matrix multiplication - 2 (PCA)
- V = C_X = X^T X … variances/covariances
- E = eigenvectors(V), the eigenvectors of C_X
- X E^T = PCA(X) = Z, where Z[i,j] is the similarity of example i to eigenvector j
I think of these as a fixed point of a process where we predict each feature value from the others and C_X.

82. Matrix multiplication - 2 (PCA)
- V = C_X = X^T X … variances/covariances
- E = eigenvectors(V)
- X E_K^T = PCA(X) = Z_K … using E_K = E(1:K, :), the top K eigenvectors of C_X, instead of E

83. Matrix multiplication - 3 (SVD)
- V = C_X = X^T X … variances/covariances
- E = eigenvectors(V), the eigenvectors of C_X
- X E^T = PCA(X) = Z
- X = Z E = Z Σ^{-1} Σ E = U Σ E … where U = Z Σ^{-1} … i.e., a factored version of X
- Usually written as X = U Σ V^T

84. Matrix multiplication - 3 (SVD)
- V = C_X = X^T X … variances/covariances
- E = eigenvectors(V)
- X E_K^T = PCA(X) = Z_K … with E_K = E(1:K, :) instead of E
- X ≈ Z_K E_K = Z_K Σ^{-1} Σ E_K ≈ U_K Σ E_K … where U_K = Z_K Σ^{-1} … i.e., a factored version of X
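A short numpy sketch of the truncated SVD in slides 82-84: keep only the top-K singular values and vectors to get a rank-K factorization of X. The function name is illustrative:

```python
import numpy as np

def truncated_svd(X, K):
    """Rank-K factorization of X from its SVD.

    Returns (U_K * Sigma_K, V_K^T): the product of the two pieces is the
    best rank-K approximation of X in the Frobenius norm.
    """
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :K] * S[:K], Vt[:K]    # keep top-K singular values/vectors
```

When X already has rank K the truncation loses nothing and the product reconstructs X exactly.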

85. Matrix multiplication - 3 (SVD)
(figure: the original matrix X ≈ U_K Σ E, where U_K gives each of the n instances K features with zero covariance and unit variances, Σ is diagonal with entries σ1, σ2, and E holds the eigenvectors of C_X)

86. Recovering latent factors in a matrix
(figure: V ≈ W H, where W is n x r (user factors), H is r x m (movie factors), and V[i,j] = user i's rating of movie j)