Scaling Factorization Machines to Relational Data
Steffen Rendle
University of Konstanz



78457 Konstanz, Germany
steffen.rendle@uni-konstanz.de

ABSTRACT

The most common approach in predictive modeling is to describe cases with feature vectors (aka design matrix). Many machine learning methods such as linear regression or support vector machines rely on this representation. However, when the underlying data has strong relational patterns, especially relations with high cardinality, the design matrix can get very large which can make learning and prediction slow or even infeasible.

This work solves this issue by making use of repeating patterns in the design matrix which stem from the underlying relational structure of the data. It is shown how coordinate descent learning and Bayesian Markov Chain Monte Carlo inference can be scaled for linear regression and factorization machine models. Empirically, it is shown on two large scale and very competitive datasets (Netflix prize, KDDCup 2012), that (1) standard learning algorithms based on the design matrix representation cannot scale to relational predictor variables, (2) the proposed new algorithms scale and (3) the predictive quality of the proposed generic feature-based approach is as good as the best specialized models that have been tailored to the respective tasks.

1. INTRODUCTION

Predictive analytics is an important technique with applications in many fields ranging from business to science. Typically, a predictive model is defined as a function of predictor variables to some target. E.g. how much (target) does a customer (first predictor variable) like a product (second predictor variable). The most common approach is that a data analyst selects some predictor variables (aka feature engineering) and applies a machine learning (ML) method to learn the target function from observations of the past. The ML model is then a function of the predictor variables (aka feature vector). Many important ML methods are based on this principle incl. linear regression (LR), support vector machines (SVM), decision trees, etc. The runtime of learning and prediction depends on the (sparse) size of the feature vector and is typically linear at best.

Nowadays, feature engineering based ML is the dominant technique in predictive analytics. However, if it is applied to relational data, especially involving relations of high cardinality, the feature vectors can grow very large which can make learning and prediction very slow or even infeasible. E.g. to follow the example from above, the friends of a customer might be predictive for his/her taste. Using the variable "friends of a customer" in the feature vector (e.g. for a SVM, LR, etc.) can result in a very long feature vector because all friends (i.e. their IDs) are included in the feature vector.

In this paper, it is shown how prediction and learning algorithms for linear regression and factorization machines can be scaled to predictor variables generated from relational data involving relations of high cardinality. The idea is to make use of repeating patterns over a set of feature vectors. No change is made on the predictive modeling approach and also not on the underlying statistical model. Thus the proposed algorithms learn the same parameters and make the same predictions but with a much lower runtime complexity.

The paper starts with linear regression as it is one of the best-known ML models and still achieves high prediction accuracy in competitive problems (e.g. KDDCup 2010 [23]). Moreover the idea of scaling is easier to understand for this basic model first. The main contribution is scaling factorization machines [12] which is a generic factorization model including among others matrix factorization [17], SVD++ [3], PITF [15], timeSVD++ [5], etc. Factorization models have shown great predictive performance in very competitive machine learning problems including the Netflix prize¹, recent KDDCups² (2010, 2011, 2012) as well as other prediction challenges (e.g. `What Do You Know?' Challenge³, EMI Music Hackathon⁴). For both models, scaling is shown for coordinate descent (CD) learning and for a Markov Chain Monte Carlo (MCMC) Gibbs sampler. CD is one of the most effective point estimators [2] and MCMC a state-of-the-art Bayesian inference method.

From a practical point of view, the proposed algorithms allow to handle predictive modeling as usual: defining predictor variables (also variables from relations of high cardinality) by feature engineering and applying a feature-vector-based ML algorithm. Internally, the algorithms make use of the repeating patterns stemming from the relational structure of the data to largely speed up computation.

¹ http://www.netflixprize.com/
² http://www.sigkdd.org/kddcup/
³ http://www.kaggle.com/c/WhatDoYouKnow
⁴ http://www.kaggle.com/c/MusicHackathon

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Articles from this volume were invited to present their results at The 39th International Conference on Very Large Data Bases, August 26th-30th 2013, Riva del Garda, Trento, Italy.
Proceedings of the VLDB Endowment, Vol. 6, No. 5
Copyright 2013 VLDB Endowment 2150-8097/13/03... $10.00.
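
To make the introduction's central idea concrete, the following is a minimal sketch, not the algorithm developed in the paper, of how repeating patterns over a set of feature vectors can be reused for a plain linear model. It assumes that every case is a (customer, product) pair and that the customer part of the feature vector, including the "friends of a customer" variables, is identical for all cases of the same customer. All names, features, and weights (`customer_feats`, `product_feats`, `w`, and so on) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: all names, features, and weights are hypothetical.

def score(w, feats):
    """Weighted sum of one sparse feature block {feature_name: value}."""
    return sum(w[f] * v for f, v in feats.items())

def predict_naive(w, rows):
    """Linear model evaluated on fully materialized feature vectors."""
    return [score(w, row) for row in rows]

def predict_shared(w, cases, customer_feats, product_feats):
    """Same linear model, but each repeated block is scored only once."""
    cust = {c: score(w, f) for c, f in customer_feats.items()}
    prod = {p: score(w, f) for p, f in product_feats.items()}
    return [cust[c] + prod[p] for c, p in cases]

# Per-customer block: the customer ID plus the (possibly many) friend IDs.
customer_feats = {
    "alice": {"cust=alice": 1.0, "friend=bob": 0.5, "friend=carol": 0.5},
    "bob": {"cust=bob": 1.0, "friend=alice": 1.0},
}
# Per-product block: here just the product ID.
product_feats = {
    "tv": {"prod=tv": 1.0},
    "radio": {"prod=radio": 1.0},
}
cases = [("alice", "tv"), ("alice", "radio"), ("bob", "tv")]

w = {
    "cust=alice": 0.5, "cust=bob": -0.25,
    "friend=bob": 1.0, "friend=carol": 2.0, "friend=alice": 0.75,
    "prod=tv": 1.5, "prod=radio": -1.0,
}

# The design-matrix view repeats the customer block in every row of that customer.
rows = [{**customer_feats[c], **product_feats[p]} for c, p in cases]

assert predict_naive(w, rows) == predict_shared(w, cases, customer_feats, product_feats)
print(predict_shared(w, cases, customer_feats, product_feats))  # [3.5, 1.0, 2.0]
```

Both functions return the same predictions, in line with the statement that the model itself is unchanged. The difference is only in the work done: the naive version touches every friend ID once per case, whereas the shared version touches it once per customer, which is where the saving comes from when a relation has high cardinality.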