Slide 1: Matrix Factorization
Slide 2: Recovering latent factors in a matrix

[Figure: an n-row by m-column matrix V with entries v11 … vij … vnm]
Slide 3: Recovering latent factors in a matrix

[Figure: V (n x m) ≈ W (n x K) times H (K x m), where W has a row (xi, yi) per row of V and H has a column (aj, bj) per column of V]
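The factorization pictured on this slide, V ≈ W H with K much smaller than n and m, can be written as a minimal numpy sketch (toy sizes and variable names are assumptions):

```python
import numpy as np

# A rank-K matrix built from its factors: V is n x m, W is n x K, H is K x m.
rng = np.random.default_rng(0)
n, m, K = 6, 5, 2

W = rng.normal(size=(n, K))   # one K-dim vector per row of V
H = rng.normal(size=(K, m))   # one K-dim vector per column of V
V = W @ H                     # a matrix that is exactly rank K

# Any entry v_ij is the dot product of row factor i and column factor j.
i, j = 3, 4
assert np.isclose(V[i, j], W[i] @ H[:, j])

# The factors store (n + m) * K numbers instead of n * m.
print((n + m) * K, "numbers instead of", n * m)
```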
Slide 4: What is this for?

[Figure: the same factorization V (n x m) ≈ W (n x K) times H (K x m)]
Slide 5: MF for collaborative filtering
Slides 6-9: What is collaborative filtering?

[Figures illustrating collaborative filtering]
Slide 11: Recovering latent factors in a matrix

[Figure: V is an n-user by m-movie matrix with entries v11 … vij … vnm]

V[i,j] = user i's rating of movie j
Slide 12: Recovering latent factors in a matrix

[Figure: the n-user by m-movie matrix V ≈ W times H, where W has a row (xi, yi) per user and H has a column (aj, bj) per movie]

V[i,j] = user i's rating of movie j
Slide 14: MF for image modeling
Slide 16: MF for images

[Figure: V is a 1000-image by 10,000-pixel matrix (1000 x 10,000 = 10,000,000 entries), factored as V ≈ W times H with 2 prototype rows in H and 2 weights per image in W]

V[i,j] = pixel j in image i
2 prototypes: PC1, PC2
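One way to recover two prototypes like PC1 and PC2 is a truncated SVD. A toy sketch, far smaller than the 1000 x 10,000 matrix on the slide, with synthetic "images" made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_images, n_pixels, r = 50, 40, 2

# Synthetic stand-in for the image matrix: every image is a mixture of
# two prototype "images" plus a little noise (all made up for the sketch).
prototypes = rng.normal(size=(r, n_pixels))
weights = rng.normal(size=(n_images, r))
V = weights @ prototypes + 0.01 * rng.normal(size=(n_images, n_pixels))

# A truncated SVD recovers a rank-2 factorization V ~ W H.
U, s, Vt = np.linalg.svd(V, full_matrices=False)
W = U[:, :r] * s[:r]   # 2 coordinates per image
H = Vt[:r]             # 2 prototype rows, playing the role of PC1 and PC2

rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("relative reconstruction error:", rel_err)
```

Because V is nearly rank 2 by construction, two prototypes explain almost all of it.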
Slide 17: MF for modeling text

Slide 18: Nine book titles about investing
The Neatest Little Guide to Stock Market Investing
Investing For Dummies, 4th Edition
The Little Book of Common Sense Investing: The Only Way to Guarantee Your Fair Share of Stock Market Returns
The Little Book of Value Investing
Value Investing: From Graham to Buffett and Beyond
Rich Dad’s Guide to Investing: What the Rich Invest in, That the Poor and the Middle Class Do Not!
Investing in Real Estate, 5th Edition
Stock Investing For Dummies
Rich Dad’s Advisors: The ABC’s of Real Estate Investing: The Secrets of Finding Hidden Profits Most Investors Miss
https://technowiki.wordpress.com/2011/08/27/latent-semantic-analysis-lsa-tutorial/

Slide 19: TF-IDF counts would be better
Slide 20: Recovering latent factors in a matrix

[Figure: V is an n-document by m-term matrix (a doc-term matrix), factored as V ≈ W times H with a row (xi, yi) per document and a column (aj, bj) per term]

V[i,j] = TF-IDF score of term j in doc i
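A toy sketch of this slide's pipeline, in the spirit of LSA: term counts, a simple TF-IDF reweighting, then a rank-2 SVD to embed documents. The mini-corpus and the whitespace tokenization are made up for illustration:

```python
import numpy as np

docs = [
    "stock market investing",
    "stock market investing guide",
    "real estate investing",
    "real estate investing secrets",
]
vocab = sorted({w for d in docs for w in d.split()})
counts = np.array([[d.split().count(w) for w in vocab] for d in docs], float)

# Simple TF-IDF: "investing" appears in every doc, so its IDF is zero.
df = (counts > 0).sum(axis=0)
tfidf = counts * np.log(len(docs) / df)

# A rank-2 SVD gives 2-dim document embeddings.
U, s, Vt = np.linalg.svd(tfidf, full_matrices=False)
doc_vecs = U[:, :2] * s[:2]

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

print("stock vs stock:", cos(doc_vecs[0], doc_vecs[1]))
print("stock vs real estate:", cos(doc_vecs[0], doc_vecs[2]))
```

In the embedded space the two stock-market documents sit together, and the real-estate documents sit apart from them.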
Slide 22: Investing for real estate
Rich Dad’s Advisor’s: The ABCs of Real Estate Investment …
Slide 23:
The little book of common sense investing: …
Neatest Little Guide to Stock Market Investing
Slide 24: MF is like clustering
Slide 25: k-means as MF

[Figure: the original data set X (n examples) ≈ Z times M, where Z is an n x r matrix of 0/1 indicators for r clusters and M holds the cluster means]
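The slide's claim can be checked directly: given cluster assignments, X ≈ Z M with Z the 0/1 indicator matrix and M the cluster means. A sketch with made-up data (names Z, M, X follow the slide):

```python
import numpy as np

# X ~ Z M: Z is the n x r 0/1 cluster-indicator matrix, M the r x d matrix
# of cluster means.
rng = np.random.default_rng(2)
n, d, r = 12, 2, 3

centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
assign = np.arange(n) % r                       # round-robin assignments
X = centers[assign] + 0.1 * rng.normal(size=(n, d))

Z = np.zeros((n, r))
Z[np.arange(n), assign] = 1.0                   # one 1 per row of Z
M = (Z.T @ X) / Z.sum(axis=0, keepdims=True).T  # row k = mean of cluster k

# Z M replaces each point by its cluster mean, so ||X - Z M||^2 is
# exactly the k-means objective for these assignments.
err = np.linalg.norm(X - Z @ M) ** 2
print("k-means objective:", err)
```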
Slide 26: How do you do it?

[Figure: the same factorization V (n x m) ≈ W (n x K) times H (K x m)]
Slide 27: talk pilfered from ….. KDD 2011
Slide 29: Recovering latent factors in a matrix

[Figure: V (n users x m movies) ≈ W times H, where W is n x r and H is r x m]

V[i,j] = user i's rating of movie j
Slide 33: for image denoising
Slide 34: Matrix factorization as SGD

[Figure: the SGD update equations, with the step size highlighted; annotation: why does this work?]
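A minimal sketch of the SGD scheme this slide describes: visit entries of V one at a time and nudge the matching user row and movie column along the squared-error gradient. The sizes, step size, and epoch count are assumptions for a toy run, not values from the talk:

```python
import numpy as np

# SGD for MF: for each entry v_ij, update W[i] and H[:, j] using the
# gradient of the squared error (v_ij - W[i] . H[:, j])^2.
rng = np.random.default_rng(3)
n, m, K = 20, 15, 3
V = rng.normal(size=(n, K)) @ rng.normal(size=(K, m))  # toy "ratings"

W = 0.1 * rng.normal(size=(n, K))   # initialize randomly, not at zero
H = 0.1 * rng.normal(size=(K, m))
step = 0.02                          # the step size in the update rule

entries = [(i, j) for i in range(n) for j in range(m)]
for epoch in range(200):
    rng.shuffle(entries)
    for i, j in entries:
        wi = W[i].copy()             # use the old value in both updates
        err = V[i, j] - wi @ H[:, j]
        W[i] += step * err * H[:, j]
        H[:, j] += step * err * wi

rmse = np.sqrt(np.mean((V - W @ H) ** 2))
print("train RMSE:", rmse)
```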
Slide 35: Matrix factorization as SGD - why does this work? Here’s the key claim:

[Figure: the key claim]
Slide 36: Checking the claim

Think of SGD for logistic regression: the LR loss compares y and ŷ = dot(w, x). MF is similar, but now we update both w (the user weights) and x (the movie weights).
Slide 37: What loss functions are possible?

N1, N2 - diagonal matrices, sort of like IDF factors for the users/movies
“generalized” KL-divergence
Slides 38-39: What loss functions are possible? [continued, figures]
Slide 40: ALS = alternating least squares
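An ALS sketch: with H fixed, every row of W is an ordinary least-squares solve, and symmetrically for H with W fixed; alternate until converged. The tiny ridge term is an assumption added for numerical stability, not part of the slide:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, K = 20, 15, 3
V = rng.normal(size=(n, K)) @ rng.normal(size=(K, m))  # toy rank-K data

W = rng.normal(size=(n, K))
H = rng.normal(size=(K, m))
lam = 1e-6                                  # tiny ridge term for stability

for it in range(30):
    # With H fixed: W = V H^T (H H^T)^-1, all rows solved at once.
    W = np.linalg.solve(H @ H.T + lam * np.eye(K), H @ V.T).T
    # With W fixed: H = (W^T W)^-1 W^T V.
    H = np.linalg.solve(W.T @ W + lam * np.eye(K), W.T @ V)

rmse = np.sqrt(np.mean((V - W @ H) ** 2))
print("ALS train RMSE:", rmse)
```

Each half-step solves its subproblem exactly, which is why ALS converges in far fewer epochs than SGD on this toy problem.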
Slide 41: talk pilfered from ….. KDD 2011
Slide 45: Similar to McDonnell et al with perceptron learning
Slide 46: Slow convergence…..
Slide 53: More detail….

- Randomly permute the rows and columns of the matrix.
- Chop V, W, H into blocks of size d x d: m/d blocks in W, n/d blocks in H.
- Group the data: pick a set of blocks with no overlapping rows or columns (a stratum); repeat until all blocks of V are covered.
- Train the SGD: process strata in series; process the blocks within a stratum in parallel.
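The stratum construction above can be illustrated with cyclic shifts, one valid way to pick sets of non-overlapping blocks (the talk may enumerate strata differently, e.g. via random permutations):

```python
# V is chopped into a d x d grid of blocks; a stratum is a set of d blocks
# sharing no rows and no columns, so they can be trained in parallel.
# Cyclic shifts give d strata that together cover every block.
d = 3  # a 3 x 3 grid of blocks, an assumption for illustration

strata = []
for k in range(d):
    stratum = [(i, (i + k) % d) for i in range(d)]
    rows = {i for i, j in stratum}
    cols = {j for i, j in stratum}
    assert len(rows) == d and len(cols) == d  # no shared rows or columns
    strata.append(stratum)

covered = {block for stratum in strata for block in stratum}
print(len(covered), "of", d * d, "blocks covered")
```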
Slide 54: More detail….

[Figure of the blocked matrix; note: Z here is the matrix we have been calling V]
Slide 55: More detail….

- Initialize W, H randomly, not at zero.
- Choose a random ordering (random sort) of the points in a stratum in each “sub-epoch”.
- Pick the strata sequence by permuting the rows and columns of M, and using M’[k,i] as the column index of row i in sub-epoch k.
- Use “bold driver” to set the step size: increase the step size when the loss decreases (over an epoch); decrease it when the loss increases.
- Implemented in Hadoop and R/Snowfall.

[Figure: the matrix M]
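The "bold driver" rule from this slide as a tiny sketch; the grow and shrink factors (1.05 and 0.5) are common choices, not values taken from the talk:

```python
# Bold driver: grow the step a little after an epoch where the loss went
# down, shrink it sharply after an epoch where the loss went up.
def bold_driver(step, prev_loss, curr_loss, grow=1.05, shrink=0.5):
    return step * grow if curr_loss < prev_loss else step * shrink

step = 0.1
losses = [10.0, 8.0, 6.5, 7.0, 5.9]      # made-up per-epoch losses
for prev, curr in zip(losses, losses[1:]):
    step = bold_driver(step, prev, curr)
print("final step size:", step)
```

The asymmetry (gentle growth, sharp shrinkage) makes recovery from a too-large step fast while still probing for larger steps when training is stable.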
Slide 57: Wall Clock Time

8 nodes, 64 cores, R/snow
Slide 62: Number of Epochs
Slide 67: Varying rank

100 epochs for all
Slide 68: Hadoop scalability

Hadoop process setup time starts to dominate
Slide 69: Hadoop scalability