Document on Subject : "Modeling Contextual Agreement in Preferences" (Loc Do, School of Information Systems) — Transcript:

Modeling Contextual Agreement in Preferences
Loc Do, School of Information Systems, Singapore Management University, hldo.2012@phdis.smu.edu.sg
Hady W. Lauw, School of Information Systems, Singapore Management University, hadywlauw@smu.edu.sg

DPMF, with a novel objective function that minimizes errors in rating differences, and describe its gradient descent learning algorithm. Fourth, in Section 6, we validate these models comprehensively on three real-life, publicly available rating datasets, showing how well the model parameters are learned, and how they improve upon shared preference models in a neighborhood-based rating prediction task.

2. RELATED WORK
In the following, we survey related work on modeling preferences, first focusing on individual users, then on similarities between users, and finally on the role of context.

Individual preference. Most works on modeling individual preference are found in model-based recommender systems [1]. The main step is to construct a preference model for each user, which is then used to derive predictions. Here, we review three popular modeling choices. The first is the aspect model [8, 9]. A user u's preference is modeled as a probability distribution {P(z_k | u)} for k = 1..K over K latent aspects. Each aspect z_k is associated with a distribution over items i to be adopted, i.e., P(i | z_k), or over ratings r, i.e., P(r | z_k, i). The second is the matrix factorization-based model [15]. User u's preference is modeled as a column vector S_u in a K-dimensional latent space. Each item i is also associated with a rank-K column vector Q_i. The rating prediction r̂_ui by u on i is given by S_u^T Q_i. There are different factorization methods [18, 39, 17, 25], which vary in their objective functions, including several probabilistic variants [29, 34]. The third is the content-based model [1, 31, 21]. User u's preference is modeled as a content vector whose dimensionality is the vocabulary size (e.g., a tf-idf vector), derived from the content (e.g., meta-data, text) of items that u likes.

Shared preference. Modeling sharing of preferences is mostly found in neighborhood-based recommender systems [11]. One approach is based on similarity. For user-based collaborative filtering (CF) [12], the similarity w_uv is between a pair of users u and v. The higher w_uv is, the more u and v share their preferences. The most common similarity measures in the literature are Pearson's correlation coefficient [33] and vector space or Cosine similarity [5]. Given that r_u and r_v represent vectors of ratings, {r_ui} by u and {r_vi} by v, on a set of items {i}, Pearson is determined as in Equation 1 (where r̄_u and r̄_v are average ratings), and Cosine as in Equation 2. Correspondingly, for item-based CF [35, 19], the similarity is between a pair of items.

w_uv^pearson = Σ_i (r_ui − r̄_u)(r_vi − r̄_v) / ( √(Σ_i (r_ui − r̄_u)²) · √(Σ_i (r_vi − r̄_v)²) )   (1)

w_uv^cosine = (r_u · r_v) / ( ||r_u|| · ||r_v|| )   (2)

Another approach to model sharing of preferences is to exploit existing structures. For example, in a social network, each relationship (e.g., friends or follower-followee) is seen as inducing sharing of preferences between the two users [24, 22]. Some exploit the taxonomy structure to induce sharing between items in the same category [36, 2, 13, 27, 14].

Context. Most of the work discussed above bases its approach on the dyad of the user-item pair. In some cases, additional information or "context" may be available, i.e., rather than pairs ⟨u, i⟩, we observe triplets ⟨u, i, c⟩ where c refers to some context. There are different approaches to dealing with triplets. One approach is to break a triplet into multiple binary relations, e.g., friend-user-item into user-friend and user-item, as done in [23, 38, 37] for rating-cum-link prediction. [41, 20] suggest partitioning dyads into clusters based on context, and then learning a separate model for each cluster. Another approach is tensor factorization, as done in [10] for cross-domain rating prediction. Yet another approach, such as ours, is to model triplets directly. Differently from [32, 30, 16], which target user-item-item triplets for personalized ranking of items (asymmetric), we target user-user-item triplets to model agreement.

3. OVERVIEW
Notations. The universal set of users is denoted as U, and we use u or v to refer to a user in U. In turn, we use i or j to refer to an item in the universal set of items I. The rating by u on i is denoted as r_ui. The set of all ratings observed in the data is denoted R. We seek to model user-user-item triplets ⟨u, v, i⟩. The universal set of triplets comprises U × U × I, excluding triplets involving the same users, e.g., ⟨u, u, i⟩. Each triplet ⟨u, v, i⟩ is associated with two quantities (modeled as random variables): x_uvi and y_uvi, which are essential to our probabilistic modeling.

The variable x_uvi ∈ ℝ is real-valued. It represents the indicator of agreement between u and v on i, some of which are observed in the data. The closer x_uvi is to 0, the more likely it is that u and v agree on i. If x_uvi ≫ 0 or x_uvi ≪ 0, then disagreement is more likely. x_uvi can be expressed as a function of ratings, i.e., x_uvi = F(r_ui, r_vi). While there are many possible definitions of F, in this paper we simply use the rating difference between two users on the same item, as shown in Equation 3. This choice of function also implies the symmetry x_uvi = −x_vui.

x_uvi = r_ui − r_vi   (3)

The second variable y_uvi ∈ Y = {0, 1} is binary. y_uvi = 1 represents the event of agreement between u and v on their preference for i. y_uvi = 0 is the event of disagreement. These events are latent, and never observed. They are to be estimated from the observed x_uvi's. The closer x_uvi is to 0, the more likely we expect y_uvi = 1. The further x_uvi is away from 0, the more likely we expect y_uvi = 0.

Problem Formulation. Given ratings data R, and the above x_uvi definition, we seek to estimate the probability P(y_uvi | x_uvi) for all triplets. Not all x_uvi's can be observed. x_uvi is not observed if either r_ui ∉ R or r_vi ∉ R. This gives rise to two sub-problems. The first is how to estimate P(y_uvi | x_uvi) given the observed x_uvi values. The second sub-problem is how to predict the unobserved x̂_uvi values.

For the first sub-problem, we propose the probabilistic CAM model in Section 4. Since y_uvi is latent, it is not possible to employ discriminative modeling. We therefore turn to generative modeling, by representing x_uvi as a random variable whose generative process is related to y_uvi. Our approach is thus to model the joint probability P(y_uvi, x_uvi).
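The observed agreement indicators of Equation 3 are straightforward to compute from raw ratings. A minimal sketch in Python (the ratings dictionary and the user/item names are illustrative, not from the paper's datasets):

```python
def observed_differences(ratings):
    """Build x_uvi = r_ui - r_vi (Equation 3) for every ordered pair of
    distinct users u, v and every item i that both have rated.
    `ratings` maps each user to an {item: rating} dictionary."""
    triplets = {}
    for u, r_u in ratings.items():
        for v, r_v in ratings.items():
            if u == v:
                continue  # the universal set of triplets excludes <u, u, i>
            for i in r_u:
                if i in r_v:
                    triplets[(u, v, i)] = r_u[i] - r_v[i]
    return triplets

# Hypothetical ratings on a 1-5 scale.
ratings = {"u": {"a": 5, "b": 3}, "v": {"a": 4, "b": 1}}
x = observed_differences(ratings)
```

Note that the symmetry x_uvi = −x_vui falls out directly: the triplet ("v", "u", "a") carries the negated difference of ("u", "v", "a").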

The conditional probability P(y_uvi | x_uvi) can afterwards be estimated from the joint probabilities as follows:

P(y_uvi | x_uvi) = P(y_uvi, x_uvi) / Σ_{y′_uvi ∈ Y} P(y′_uvi, x_uvi)   (4)

The second sub-problem is how to predict the unseen x̂_uvi. We will then use the predicted x̂_uvi with the parameters of CAM to estimate the agreement probability.

The generative process for x in a triplet is as follows:
(a) In the event of agreement, i.e., y = 1: x ∼ N(μ₁, σ₁²).
(b) Else, in the event of disagreement, i.e., y = 0: x ∼ (1/2)N(μ₀, σ₀²) + (1/2)N(−μ₀, σ₀²).

Based on this generative process, the distribution of x can be expressed as a mixture of three Gaussians with weights α, (1−α)/2, and (1−α)/2 respectively, as shown in Equation 7.

x ∼ α N(μ₁, σ₁²) + ((1−α)/2) N(μ₀, σ₀²) + ((1−α)/2) N(−μ₀, σ₀²)   (7)

Parameters. For the above generative process, the set of parameters can be encapsulated by Θ = ⟨α, μ₁, σ₁, μ₀, σ₀⟩. The question arises whether there is a unique Θ for every triplet ⟨u, v, i⟩. Because Θ is a distributional parameter, it is not feasible to estimate Θ from a single observation of x. Another approach is to tie together the parameters of a group of triplets. In this paper, we propose to tie the parameters of triplets corresponding to each pair of users. In other words, there is a specific Θ_uv for each pair of users u and v that applies to all items. As shown in the plate diagram in Figure 2, α_uv and Θ_uv are within the plate of each pair of users. For clarity, we draw α_uv separately to show that y_uvi only depends on α_uv, although α_uv ∈ Θ_uv. x_uvi is shaded, because it is observed.

4.2 Monotonicity Property
We would like to model P(y = 1 | x) such that it increases as x → 0, and decreases as x → ∞ or x → −∞. We refer to this as the monotonicity property of the conditional probability of agreement. This monotonicity property does not always hold for any or all parameter settings. There are errant parameter settings that may cause this property to be violated. As an example, in Figure 1(b), we show a case where P(y = 1 | x) (the green curve) initially decreases as x goes away from zero, but as x continues moving away, it starts to increase again. This is not intuitive, as it suggests that the probability of agreement is very high even as x → ∞.

To enforce the monotonicity property, we propose introducing a constraint on the parameters of the Gaussian mixtures. By expanding Equation 6 according to the generative process (with μ₁ = 0: under agreement, rating differences are centered at zero), we can express the p.d.f. of P(y = 1 | x) as in Equation 8. Here, N(x; μ, σ²) denotes the p.d.f. of the Normal distribution, i.e., (1/√(2πσ²)) exp{−(x−μ)²/(2σ²)}.

G(x) = α N(x; 0, σ₁²) / [ α N(x; 0, σ₁²) + ((1−α)/2) N(x; μ₀, σ₀²) + ((1−α)/2) N(x; −μ₀, σ₀²) ]   (8)

Because the p.d.f. G(x) is continuous and differentiable, one way to ensure that monotonicity holds is to constrain the gradient of G(x) to be non-positive for all x > 0, as shown in Equation 9. Note that due to the symmetric property of the Gaussian mixtures, it is sufficient to enforce this monotonicity for x > 0, as the other case x < 0 is met by default.

∂G(x)/∂x ≤ 0, for all x > 0   (9)

By taking the derivative of G(x) with respect to x, Equation 9 can be reduced to the inequality in Equation 10.

exp{4xμ₀/(2σ₀²)} · ( x/σ₁² − (x−μ₀)/σ₀² ) + ( x/σ₁² − (x+μ₀)/σ₀² ) ≥ 0   (10)

This inequality still contains the variable x. We need to reduce it to an inequality involving only the parameters. We discover a simple constraint that meets that objective.

Proposition 1. The constraint σ₁ ≤ σ₀ ensures that Equation 10 always holds for any x > 0.

Proof. Let us first consider the first additive term in the LHS of Equation 10, i.e., exp{4xμ₀/(2σ₀²)} · (x/σ₁² − (x−μ₀)/σ₀²). Because x, μ₀, and σ₀ are all positive, we have 4xμ₀/(2σ₀²) > 0. In turn, we have exp{4xμ₀/(2σ₀²)} > 1. Because σ₁ ≤ σ₀, we also have (x/σ₁² − (x−μ₀)/σ₀²) > 0. We can therefore take Step 1 in Equation 11. From Step 1, we can go to Step 2 by a simple addition of the terms. Finally, because x > 0 and σ₁ ≤ σ₀, we have 2x(1/σ₁² − 1/σ₀²) ≥ 0 in Step 3, which concludes the proof.

exp{4xμ₀/(2σ₀²)} · ( x/σ₁² − (x−μ₀)/σ₀² ) + ( x/σ₁² − (x+μ₀)/σ₀² )   (11)
  ≥ ( x/σ₁² − (x−μ₀)/σ₀² ) + ( x/σ₁² − (x+μ₀)/σ₀² )   (Step 1)
  = 2x ( 1/σ₁² − 1/σ₀² )   (Step 2)
  ≥ 0   (Step 3)

We have shown that with the constraint σ₁ ≤ σ₀, Equation 9 holds, guaranteeing the monotonicity property for x > 0 (and simultaneously for x < 0). This constraint σ₁ ≤ σ₀ is also intuitive: when two users are agreeing, their rating difference is likely to be small and not to vary as widely as when they are disagreeing.

4.3 Parameter Estimation
Parameter estimation deals with learning the parameters Θ that best "describe" the observed data X = {x}. Because every x is assumed to have been generated independently in the generative process, the likelihood can be expressed as the joint probability shown in Equation 12.

P(X | Θ) = Π_{x ∈ X} P(x | Θ)   (12)

The strategy employed in this paper is to find the parameters Θ that maximize the likelihood of observing X. Due to the presence of constraints, the objective is to also find Θ that meets the constraints, as shown in Equation 13. The first constraint ensures the mixture weights of the Gaussians sum to 1, by setting the mixture weights to α₁ = α and α₀ = 1 − α respectively. The second constraint ensures the monotonicity of P(y = 1 | x) by setting σ₁ ≤ σ₀.

argmax_Θ P(X | Θ), subject to: α₀ + α₁ = 1, and σ₁ ≤ σ₀   (13)

To maximize the likelihood, we can equivalently maximize the log-likelihood. As it is a constrained optimization problem, we employ Lagrangian multipliers [4] to enforce the constraints. In Equation 14, we show the updated log-likelihood function L. Both λ and η are Lagrangian multipliers.

Figure 6: PMF vs. PPMF vs. DPMF (RMSEdiff) over the whole training set. For all, the error goes down with the epochs, and eventually converges. DPMF performs the best in two respects. First, its converged error is the lowest of the three, followed by PMF, and PPMF (worst). Second, it achieves convergence much faster (by 30 epochs). Although by 100 epochs PMF narrows down the error gap somewhat, it converges very slowly, requiring more epochs.

We hypothesize that this is due to the differences in the objective functions. PMF tries to make its prediction as close to the observed rating as possible, without consideration of the level of difference between ratings. For example, suppose users u and v give ratings of 4 and 1 respectively to the same item in the test set. If the predicted ratings are 4.5 and 0.5, these are close enough to the actual ratings (4 and 1). However, in terms of the rating difference, it has widened from 4 − 1 = 3 to 4.5 − 0.5 = 4. In contrast, DPMF tries to fit the rating difference directly, for instance by predicting 4.5 and 1.5, which has the same error in terms of rating, but zero error in terms of rating difference. We perform a one-tailed t-test at the 0.01 significance level on the RMSEdiff values of PMF and DPMF over different epochs. The result confirms that the outperformance by DPMF over PMF is statistically significant.

Vary Latent Factors. We conduct a separate experiment on DPMF with different numbers of latent factors K. The RMSEdiff at 100 epochs are shown in Table 3. It shows that by around K = 30, the errors have converged. There is no significant gain from running higher latent factors (which would make the learning algorithms slower). Subsequently, we will use DPMF in conjunction with CAM with the same parameter settings (K = 30, 100 epochs). The gradient descent learning algorithms are also efficient. For all three methods, the parameters can be learned within 1 minute for each fold on the same Intel(R) Xeon(R) Processor E5-2667 2.90GHz machine.

Table 3: DPMF: Vary Latent Factors (RMSEdiff)
Dataset    | K=10 | K=20 | K=30 | K=40 | K=50 | K=100
Ciao       | 0.87 | 0.43 | 0.36 | 0.36 | 0.35 | 0.34
Epinions   | 0.77 | 0.45 | 0.35 | 0.34 | 0.33 | 0.32
Flixster06 | 0.78 | 0.55 | 0.41 | 0.33 | 0.29 | 0.23
Flixster07 | 0.65 | 0.47 | 0.40 | 0.38 | 0.37 | 0.35
Flixster08 | 0.62 | 0.42 | 0.35 | 0.34 | 0.33 | 0.32
Flixster09 | 0.58 | 0.35 | 0.30 | 0.29 | 0.28 | 0.28

6.4 Application: Collaborative Filtering
Here, we use the model parameters of CAM, combined with the rating difference predictions by DPMF, to generate contextual agreement probabilities w_uvi = P(y_uvi = 1 | x̂_uvi). These probabilities are used as similarity in neighborhood-based collaborative filtering, as outlined in Section 3. In the rating prediction task, for every rating r_ui ∈ R_test, we predict r̂_ui as a weighted average of neighbors' ratings in R_train. The accuracy of rating prediction is measured by RMSErating, defined in Equation 29.

RMSErating = √( Σ_{r_ui ∈ R_test} (r̂_ui − r_ui)² / |R_test| )   (29)

Contextual vs. Shared. First, we compare the efficacy of item-specific contextual agreement (labeled CAM-DPMF) against baselines relying on a shared preference that applies to all items of the same user pair, as measured by the Pearson and Cosine functions (see Section 2). The prediction accuracies in terms of RMSErating are listed in Table 4. For each dataset, we indicate with an '*' the best method with the lowest error, which is significantly different from the second-best (using a t-test at the 0.01 significance level). For all of the datasets, CAM-DPMF has the lowest errors. For Ciao, CAM-DPMF has a lower error than Cosine or Pearson, but not statistically significant at p = 0.01. As all the comparative methods work with exactly the same set of ratings, the only difference is how each method weighs the contribution of each rating. This result shows that paying attention to context, as CAM-DPMF does, helps to attain a lower prediction error.

Combination vs. Components. CAM-DPMF uses a combination of CAM's model parameters and DPMF's predicted rating differences. To show that this joining of the two components is really necessary, it is instructive to see how each respective component performs on the same task. We therefore construct two more baselines based on each component respectively. The first, called CAM-α, uses the α_uv of each pair as a non-contextual similarity value in Equation 5. The second is the factorization model DPMF described in Section 5. We also include PMF for completeness. We use the users' and items' parameters S_u and Q_i to predict unobserved ratings r̂_ui. Table 5 shows a comparison between the combined approach CAM-DPMF and the two components, CAM-α and the factorization models, on the rating prediction task. In five out of six datasets, CAM-DPMF has a lower error than both components. One exception is Flixster06, where PMF performs slightly better. Interestingly, DPMF performs very badly on its own. This
is because it is optimized for predicting rating differences, and not ratings. The results emphasize that the improvement of CAM-DPMF over shared preference comes from the complementary combination of both components, and not from the sole contribution of either one.

Table 4: Versus Shared Preference (RMSErating; statistically significant best-performing entries are asterisked)
Dataset    | CAM-DPMF | Cosine | Pearson
Ciao       | 1.110    | 1.119  | 1.118
Epinions   | 1.141*   | 1.180  | 1.180
Flixster06 | 1.084*   | 1.144  | 1.143
Flixster07 | 1.011*   | 1.060  | 1.058
Flixster08 | 1.051*   | 1.081  | 1.079
Flixster09 | 1.087*   | 1.148  | 1.146

Table 5: Versus Model Components (RMSErating)
Dataset    | CAM-DPMF | CAM-α | DPMF  | PMF
Ciao       | 1.110    | 1.129 | 4.181 | 1.183
Epinions   | 1.141    | 1.198 | 4.075 | 1.194
Flixster06 | 1.084    | 1.150 | 3.446 | 1.046
Flixster07 | 1.011    | 1.073 | 3.532 | 1.073
Flixster08 | 1.051    | 1.095 | 3.617 | 1.095
Flixster09 | 1.087    | 1.152 | 3.595 | 1.152

6.5 Case Study
To illustrate the workings of CAM, we now show a case study drawn from the Epinions dataset, involving the same pair of users as in Section 1. Table 6 shows the ratings of users talyseon and youngchinq on twenty movies. Based on these ratings, CAM learns the parameters for this pair. The relatively low α suggests that this pair does not always agree. That μ₀ = 2.9 suggests that when they disagree, their rating difference is around 3. This is evident from the fourth column, labeled |x_uvi|, which tracks their rating differences. The lower half of the table shows rating differences around 3, suggesting that these are movies the pair disagrees on.

CAM uses these parameters to estimate the contextual probability of agreement shown in the fifth column. As expected, the contextual probability of agreement is high (close to 1) for the movies in the upper half of the table (where rating differences are low), and is low (close to 0) for the movies in the lower half. In contrast to the item-specific agreement produced by CAM, the baselines Pearson and Cosine each assign a single similarity value that applies to all items, inadequately describing the nature of agreement between users.

To see that such cases of varying rating differences are common, we employ the concept of entropy from information theory. For each pair, we count the frequencies of rating differences, and measure the entropy, i.e., −Σ_d p(d) ln p(d), where p(d) is the normalized frequency of each rating difference value d. If the entropy is high, the pair has rating differences that are varied, rather than uniform (if entropy is low). For instance, the user pair in the case study above has an entropy of 2.3. Figure 7 plots a histogram of user pairs binned by their entropies. There is a significant proportion of the population with high entropies. In fact, the low entropies are the exception, rather than the norm.

9. REFERENCES
[1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. TKDE, 17(6), 2005.
[2] A. Ahmed, B. Kanagal, S. Pandey, V. Josifovski, L. G. Pueyo, and J. Yuan. Latent factor models with additive and hierarchically-smoothed user preferences. In WSDM, 2013.
[3] C. M. Bishop and N. M. Nasrabadi. Pattern Recognition and Machine Learning. Springer, 2006.
[4] S. P. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[5] J. S. Breese, D. Heckerman, and C. Kadie. Empirical analysis of predictive algorithms for collaborative filtering. In UAI, 1998.
[6] H. Fang, Y. Bao, and J. Zhang. Misleading opinions provided by advisors: Dishonesty or subjectivity. In IJCAI, 2013.
[7] T. J. Hastie, R. J. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2011.
[8] T. Hofmann. Collaborative filtering via Gaussian probabilistic latent semantic analysis. In SIGIR, 2003.
[9] T. Hofmann. Latent semantic models for collaborative filtering. TOIS, 22(1), 2004.
[10] L. Hu, J. Cao, G. Xu, L. Cao, Z. Gu, and C. Zhu. Personalized recommendation via cross-domain triadic factorization. In WWW, 2013.
[11] D. Jannach, M. Zanker, A. Felfernig, and G. Friedrich. Recommender Systems: An Introduction. Cambridge University Press, 2010.
[12] R. Jin, J. Y. Chai, and L. Si. An automatic weighting scheme for collaborative filtering. In SIGIR, 2004.
[13] B. Kanagal, A. Ahmed, S. Pandey, V. Josifovski, J. Yuan, and L. Garcia-Pueyo. Supercharging recommender systems using taxonomies for learning user purchase behavior. PVLDB, 5(10), 2012.
[14] N. Koenigstein, G. Dror, and Y. Koren. Yahoo! music recommendations: modeling music ratings with temporal dynamics and item taxonomy. In RecSys, 2011.
[15] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8), 2009.
[16] Y. Koren and J. Sill. OrdRec: An ordinal model for predicting personalized item rating distributions. In RecSys, 2011.
[17] N. D. Lawrence and R. Urtasun. Non-linear matrix factorization with Gaussian processes. In ICML, 2009.
[18] D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 1999.
[19] G. Linden, B. Smith, and J. York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1), 2003.
[20] X. Liu and K. Aberer. SoCo: a social network aided context-aware recommender system. In WWW, 2013.
[21] P. Lops, M. de Gemmis, and G. Semeraro. Content-based recommender systems: State of the art and trends. In Recommender Systems Handbook, pages 73-105. Springer, 2011.
[22] H. Ma, I. King, and M. R. Lyu. Learning to recommend with social trust ensemble. In SIGIR, 2009.
[23] H. Ma, H. Yang, M. R. Lyu, and I. King. SoRec: Social recommendation using probabilistic matrix factorization. In CIKM, 2008.
[24] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King. Recommender systems with social regularization. In WSDM, 2011.
[25] L. W. Mackey, D. Weiss, and M. I. Jordan. Mixed membership matrix factorization. In ICML, 2010.
[26] L. B. Marinho, A. Nanopoulos, L. Schmidt-Thieme, R. Jaschke, A. Hotho, G. Stumme, and P. Symeonidis. Social tagging recommender systems. In Recommender Systems Handbook, pages 615-644. Springer, 2011.
[27] A. K. Menon, K.-P. Chitrapura, S. Garg, D. Agarwal, and N. Kota. Response prediction using collaborative filtering with hierarchies and side-information. In KDD, 2011.
[28] R. Missaoui, P. Valtchev, C. Djeraba, and M. Adda. Toward recommendation based on ontology-powered web-usage mining. IEEE Internet Computing, 11(4), 2007.
[29] A. Mnih and R. Salakhutdinov. Probabilistic matrix factorization. In NIPS, 2007.
[30] W. Pan and L. Chen. GBPR: Group preference based Bayesian personalized ranking for one-class collaborative filtering. In IJCAI, 2013.
[31] M. J. Pazzani and D. Billsus. Content-based recommendation systems. In The Adaptive Web, pages 325-341. Springer, 2007.
[32] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In UAI, 2009.
[33] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. GroupLens: an open architecture for collaborative filtering of netnews. In CSCW, 1994.
[34] R. Salakhutdinov and A. Mnih. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In ICML, 2008.
[35] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In WWW, 2001.
[36] H. Shan, J. Kattge, P. B. Reich, A. Banerjee, F. Schrodt, and M. Reichstein. Gap filling in the plant kingdom: trait prediction using hierarchical probabilistic matrix factorization. In ICML, 2012.
[37] Y. Shen and R. Jin. Learning personal + social latent factor model for social recommendation. In KDD, 2012.
[38] A. P. Singh and G. J. Gordon. Relational learning via collective matrix factorization. In KDD, 2008.
[39] N. Srebro, J. Rennie, and T. S. Jaakkola. Maximum-margin matrix factorization. In NIPS, 2004.
[40] J. Wang, Y. Zhang, C. Posse, and A. Bhasin. Is it time for a career switch? In WWW, 2013.
[41] E. Zhong, W. Fan, and Q. Yang. Contextual collaborative filtering via hierarchical matrix factorization. In SDM, 2012.
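The entropy of rating differences used in the case study (Section 6.5) can be computed directly from a pair's observed differences. A small sketch (the difference lists are made up for illustration):

```python
import math
from collections import Counter

def rating_difference_entropy(diffs):
    """Entropy -sum_d p(d) * ln p(d), where p(d) is the normalized
    frequency of each observed rating-difference value d (Section 6.5)."""
    counts = Counter(diffs)
    total = len(diffs)
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# A pair whose differences never vary has zero entropy; varied
# differences, as in the paper's case-study pair, score high.
uniform_pair = [0, 0, 0, 0, 0]
varied_pair = [0, 0, 1, 1, 2, 2, 3, 3]
```

With four equally frequent difference values, the varied pair's entropy is ln 4 ≈ 1.39, whereas the uniform pair's is exactly 0.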
Movie | r_ui | r_vi | |x_uvi| | P(y_uvi = 1 | x_uvi)
Paranormal Activity | 5 | 5 | 0 | 1.00
Payback | 3 | 3 | 0 | 1.00
Coraline | 5 | 5 | 0 | 1.00
Pan's Labyrinth | 5 | 5 | 0 | 1.00
Memento | 5 | 4 | 1 | 0.89
Gran Torino | 5 | 4 | 1 | 0.89
The Hurt Locker | 5 | 4 | 1 | 0.89
Jurassic Park III | 3 | 2 | 1 | 0.89
Twilight | 3 | 1 | 2 | 0.10
Inception | 5 | 3 | 2 | 0.10
Daredevil | 3 | 1 | 2 | 0.10
I Am Legend | 4 | 2 | 2 | 0.10
Rosemary's Baby | 5 | 2 | 3 | 0.00
The Day After Tomorrow | 4 | 1 | 3 | 0.00
… | 4 | 1 | 3 | 0.00
Moulin Rouge | 5 | 2 | 3 | 0.00
Seven Pounds | 4 | 1 | 3 | 0.00
The Dark Knight | 5 | 1 | 4 | 0.00
The Last Samurai | 5 | 1 | 4 | 0.00
Star Wars Episode III: Revenge of the Sith | 5 | 1 | 4 | 0.00

Table 6: Epinions Case Study (the pair's single Pearson similarity is 0.53 and Cosine similarity is 0.88, applying to all items)

Figure 7: Entropy of Rating Differences in Epinions

7. CONCLUSION
We address the novel problem of estimating the contextual agreement between two users in the context of one item, by probabilistic modeling with two major components. The first, called CAM, models contextual agreement in generative form, as a mixture of Gaussians. To ensure monotonic behavior of the agreement probability, we propose a specific constraint, and describe how the constrained parameters can be learned through EM. To extend the use of CAM to unseen triplets, the second component predicts rating differences between two users on the same item. We outline three different matrix factorization approaches, including a proposed model called DPMF with a novel objective function. The models are shown to be effective through experiments on real-life rating datasets. As future work, we plan to investigate how the two components of our model can be joined more tightly together, such that the learning for one can help reinforce the other. In addition, just as we could apply CAM-DPMF in similarity-based collaborative filtering, it may be feasible to apply it in matrix factorization for rating prediction as well, which requires further investigation.

8. ACKNOWLEDGMENTS
This research is supported by the Singapore National Research Foundation under its International Research Centre @ Singapore Funding Initiative and administered by the IDM Programme Office, Media Development Authority (MDA).
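The neighborhood-based prediction task of Section 6.4 reduces to a weighted average of neighbors' ratings, scored by RMSErating (Equation 29). A minimal sketch, with hypothetical ratings and agreement weights:

```python
import math

def predict_rating(neighbor_ratings, weights):
    """Predict r_ui as the weighted average of neighbors' ratings r_vi,
    weighted by contextual agreement probabilities w_uvi (Section 6.4)."""
    den = sum(weights)
    if den <= 0:
        return None  # no informative neighbors
    return sum(w * r for r, w in zip(neighbor_ratings, weights)) / den

def rmse(predicted, actual):
    """RMSE_rating of Equation 29 over a held-out test set."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))
```

For instance, two neighbors rating 5 and 1 with agreement weights 0.9 and 0.1 yield a prediction of 4.6: the high-agreement neighbor dominates, which is exactly the effect the contextual weights are meant to produce.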
Table 2: Datasets (some values have leading digits cut off in the source)
Dataset    | Users  | Items  | Ratings  | User pairs | Items | Rating differences
Ciao       | 0,980  | 12,832 | 01,534   | ,312       | ,425  | 1,277
Epinions   | 27,771 | 31,642 | ,185,975 | 0,997      | 4,453 | 69,998
Flixster   | 47,612 | 8,794  | ,196,077 | -          | -     | -
Flixster06 |        |        |          | ,682       | ,421  | 07,044
Flixster07 |        |        |          | ,682       | ,642  | 06,312
Flixster08 |        |        |          | ,682       | ,018  | 5,210
Flixster09 |        |        |          | ,682       | ,127  | 4,863

Figure 4: Perplexity of CAM on Testing Set

For each X_train, we "decompose" each x_uvi into the original r_ui and r_vi. Similarly, R_test is created from X_test, but with an additional step of removing any rating that also exists in R_train. Since there are 30 samples for X_train and X_test, correspondingly there are 30 samples for R_train and R_test.

6.2 Contextual Agreement Model
Perplexity. First, we study the parameter learning for CAM. As mentioned in Section 4, there is a set of parameters Θ_uv for every pair of users. One measure of effectiveness for a probabilistic model is perplexity, or the ability of model parameters learned from the training data (X_train) to fit the testing data (X_test). Perplexity is measured as exp(−(1/N) Σ_x ln P(x | Θ)), where N is the number of triplets in the held-out testing data (X_test), and P(x | Θ) is the likelihood of observing the value x of a triplet based on the parameters Θ. If a model is well-trained, the perplexity will be lower as it gets better at generalizing over the held-out data. To investigate if this is the case, in Figure 4, we plot these perplexity values (averaged over 30 folds each). For each dataset, we measure the perplexity of the learned model parameters after every iteration of the EM algorithm. The perplexity decreases quickly in the first few iterations, and then stabilizes. As the EM algorithm converges quickly in improving the fitness of the model parameters to the training data, it also improves the fit with the held-out data.

Distribution of Agreement Prior. To get some sense of the learned parameters, we also inspect the distribution of the parameter α (for different user pairs). This parameter is the prior probability of agreement P(y_uvi = 1) for a pair of users u and v. We show the distribution as a series of white boxplots in Figure 5. It shows that in all six datasets, there are diverse types of users. Some user pairs tend to agree (α ≈ 1) while others tend to disagree (α ≈ 0). Most users are somewhere in between. The median hovers around 0.6. In most datasets, the inter-quartile range around the median is 0.3 to 0.4. This result supports our intuition that
Figure 5: Distribution of P(y_uvi = 1) for user pairs

user pairs do not agree all the time. Most will have some disagreements, and therefore it is important to contextualize their agreement on a per-item basis. Note that this, as well as the earlier conclusion, generally holds for all the annual subsets of the Flixster datasets.

Friendship. Since the datasets also contain the social network links among users, we also test the frequently made hypothesis that friendship or trust relationships can help in learning the preferences of users [24, 22]. In the same Figure 5, we draw the distributions of α_uv, narrowing down the population to only those user pairs sharing a friendship or trustor-trustee relationship. These are drawn as red box plots. One observation is that friendship does contain some information. The comparison of every pair of white (all pairs) vs. red (friends-only) box plots shows that friends have greater agreement in general. This is especially evident in the Flixster datasets. However, another interesting observation is that even some friends disagree a lot, as shown by the lower whiskers of the box plots. Hence, just because a pair of users are friends, it does not mean they always agree. Therefore, it is helpful to know the context of agreement.

The EM learning algorithms are relatively efficient. For each fold, the parameters for all user pairs can be learned in 1 to 4 minutes on an Intel(R) Xeon(R) Processor E5-2667 2.90GHz machine.

6.3 Rating Difference Prediction

We study the efficacy of the different matrix factorization methods outlined in Section 5 (PMF, PPMF, and DPMF) in deriving good predictions for unseen triplets. PPMF and DPMF are trained on the rating-difference training set, while PMF is trained on the corresponding rating training set, all using the same parameter choices as in the original paper for PMF [29] (learning rate = 0.005, number of latent factors = 30, regularization coefficient = 0.002). All three are tested on the same testing set. For every triplet (u, v, i) in the test set, we derive a prediction x̂_uvi using each method, and compare the accuracy of their predictions in terms of the root mean squared error commonly used in matrix factorization. RMSE_diff is defined in Equation 28; a lower value indicates better performance.

RMSE_diff = sqrt( (1/N) Σ_{(u,v,i)} ( x̂_uvi − x_uvi )² ),   (28)

where N is the number of triplets in the test set.

Vary Epochs. In Figure 6, we plot the RMSE_diff at different epochs. One epoch corresponds to a full iteration
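Equation 28 amounts to a few lines of code; the triplet keys and values below are illustrative:

```python
import math

def rmse_diff(predicted, observed):
    """Equation 28: root mean squared error over rating-difference
    triplets. Both arguments map a triplet (u, v, i) to a value."""
    sq_err = sum((predicted[t] - observed[t]) ** 2 for t in observed)
    return math.sqrt(sq_err / len(observed))

observed = {("u1", "u2", "i1"): 1.0, ("u1", "u2", "i2"): -2.0}
predicted = {("u1", "u2", "i1"): 0.5, ("u1", "u2", "i2"): -1.0}
print(rmse_diff(predicted, observed))  # sqrt((0.25 + 1.0) / 2) ~= 0.79
```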
Figure 3: Plate Diagrams: Matrix Factorization Models for Rating Difference Prediction

The estimation is done using gradient descent, with the following gradients.

∂E/∂S_uv = Σ_i ( S_uv^T Q_i − x_uvi ) Q_i + λ_S S_uv   (21)

∂E/∂Q_i = Σ_{(u,v)} ( S_uv^T Q_i − x_uvi ) S_uv + λ_Q Q_i   (22)

Once the parameters are learned, we then predict each x̂_uvi as S_uv^T Q_i.

5.3 Differential PMF (DPMF)

While PPMF estimates x̂_uvi directly, it suffers from two design issues. First, it blows up the number of parameters, as we now have to learn a vector S_uv for every pair of users, instead of for every user. Second, it assumes that the pair vectors are independent, even when they share the same user.

To address these deficiencies, we propose a new factorization model, which we call Differential Probabilistic Matrix Factorization or DPMF. The plate diagram is shown in Figure 3(c). In this approach, we still associate each user u with a latent vector S_u, and each item i with Q_i. The key distinction is that we consider ratings to be latent, and fit the rating difference x_uvi directly. In other words, x̂_uvi is a draw from the following Normal distribution (Equation 23).

x_uvi ~ N( (S_u − S_v)^T Q_i, γ² )   (23)

The objective function of DPMF in Equation 24 shows that we fit the prediction x̂_uvi to the observation x_uvi.

E = (1/2) Σ_{(u,v,i)} ( x_uvi − (S_u − S_v)^T Q_i )² + (λ_S/2) Σ_{u∈U} ||S_u||² + (λ_Q/2) Σ_{i∈I} ||Q_i||²   (24)

Estimation by gradient descent uses the gradients below.

∂E/∂S_u = Σ_{(v,i)} ( (S_u − S_v)^T Q_i − x_uvi ) Q_i + λ_S S_u   (25)

∂E/∂S_v = −Σ_{(u,i)} ( (S_u − S_v)^T Q_i − x_uvi ) Q_i + λ_S S_v   (26)

∂E/∂Q_i = Σ_{(u,v)} ( (S_u − S_v)^T Q_i − x_uvi ) ( S_u − S_v ) + λ_Q Q_i   (27)

6. EXPERIMENTS

Our objectives in the experiments are three-fold. First, we investigate the learning of CAM. Second, we study the effectiveness of different methods in predicting rating differences. Third, we test the combined model against baselines on an evaluative rating prediction task. In addition, we include a case study to better illustrate the workings of CAM. Our focus here is on effectiveness, rather than on computational efficiency. We will briefly comment on the runtime of the learning algorithms in the appropriate sections.

6.1 Experimental Setup

Datasets. We conduct experiments on three real-life, publicly available rating datasets, namely: Ciao, Epinions, and
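The DPMF updates in Equations 25-27 can be sketched as follows. This is a stochastic per-triplet variant written for brevity; the latent dimension, learning rate, regularization, and epoch count below are illustrative, not the experimental settings:

```python
import random

def dpmf_train(triplets, users, items, k=4, lr=0.05, lam=0.001, epochs=2000, seed=0):
    """Fit x_uvi with (S_u - S_v)^T Q_i by minimizing the DPMF
    objective (Equation 24) with per-triplet gradient steps."""
    rng = random.Random(seed)
    S = {u: [rng.gauss(0.0, 0.1) for _ in range(k)] for u in users}
    Q = {i: [rng.gauss(0.0, 0.1) for _ in range(k)] for i in items}
    for _ in range(epochs):
        for (u, v, i), x in triplets.items():
            err = sum((S[u][f] - S[v][f]) * Q[i][f] for f in range(k)) - x
            for f in range(k):
                gu = err * Q[i][f] + lam * S[u][f]              # Equation 25
                gv = -err * Q[i][f] + lam * S[v][f]             # Equation 26
                gq = err * (S[u][f] - S[v][f]) + lam * Q[i][f]  # Equation 27
                S[u][f] -= lr * gu
                S[v][f] -= lr * gv
                Q[i][f] -= lr * gq
    return S, Q

def dpmf_predict(S, Q, u, v, i):
    """Predicted rating difference x_uvi (the mean in Equation 23)."""
    return sum((su - sv) * q for su, sv, q in zip(S[u], S[v], Q[i]))

# Illustrative toy data: three observed rating-difference triplets.
triplets = {("a", "b", "x"): 1.0, ("a", "b", "y"): -1.0, ("a", "c", "x"): 2.0}
S, Q = dpmf_train(triplets, users=["a", "b", "c"], items=["x", "y"])
print(dpmf_predict(S, Q, "a", "b", "x"))  # close to the observed 1.0
```

Note how each observation updates both users of the pair in opposite directions, which is what distinguishes DPMF from fitting a single pair vector as in PPMF.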

Flixster. Flixster contains ratings on movies. Ciao and Epinions both contain ratings on various categories such as books, electronics, movies, etc. We deliberately do not split the ratings by category, to see if the model can contextualize the ratings on a per-item basis without this information. Ratings are normalized onto a 5-point scale. In all cases, only ratings (and no other information) are used in learning.

We pre-process the raw data as follows. First, we retain only pairs of users who have co-rated at least 20 items. This is to ensure that there is sufficient data to learn the model parameters reasonably accurately. For each co-rated item, we derive x_uvi from r_ui and r_vi. In addition, since Flixster has timestamps, we decide to split the ratings into four annual subsets: 2006-2009, and retain only user pairs who exist in all four subsets. This is to see if the results are consistent across subsets of the data. The data sizes are shown in Table 2. After pre-processing, all the datasets are still sizeable, with thousands of users/items, and tens to hundreds of thousands of rating differences.

Training vs. Testing. For each dataset we create two types of training/testing data. For Sections 6.2 and 6.3, we work with rating-difference triplets x_uvi. We split the observed triplets into two subsets: an 80% training set and a 20% testing set. We average all the experimental results across 30 such folds (created by random sampling). For Section 6.4, we work with user-item ratings r_ui. To form the corresponding training set for ratings for

http://www.public.asu.edu/~jtang20/datasetcode/truststudy.htm
http://www.cs.ubc.ca/~jamalim/datasets/

multipliers for the constraints. We also introduce a slack variable ε, whose positive value ensures that σ₀ < σ₁.

ℒ(Θ) = ln P(X|Θ) + λ₁( α + β − 1 ) + λ₂( σ₁ − σ₀ − ε² )   (14)

To learn the parameters that maximize the log-likelihood function, we turn to the Expectation Maximization (EM) algorithm [3]. It can be shown that the derivation of ℒ with respect to each parameter leads to the following computations. In the E-step, we compute the posterior responsibility of each mixture component for every observation x (to be used in the next M-step):

e₀(x) = α N(x; 0, σ₀²) / P(x|Θ),
e₁(x) = (β/2) N(x; −μ, σ₁²) / P(x|Θ),
e₂(x) = (β/2) N(x; μ, σ₁²) / P(x|Θ).

In the M-step, we update the parameters using these responsibilities:

α = (1/|X|) Σ_{x∈X} e₀(x),   β = 1 − α,
σ₀² = Σ_{x∈X} e₀(x) x² / Σ_{x∈X} e₀(x),
μ = Σ_{x∈X} ( e₂(x) − e₁(x) ) x / Σ_{x∈X} ( e₁(x) + e₂(x) ),
σ₁² = Σ_{x∈X} ( e₁(x)(x + μ)² + e₂(x)(x − μ)² ) / Σ_{x∈X} ( e₁(x) + e₂(x) ),

with the Lagrangian terms in Equation 14 enforcing the constraints α + β = 1 and σ₀ ≤ σ₁.

Once the parameters are learned, we can make inferences for the posterior probability of agreement P(y = 1 | x), based on Equation 6, by substituting the learned parameters.

5. RATING DIFFERENCE PREDICTION

While CAM can explain the distributive properties of the x_uvi's and provide an estimation of the contextual agreement probability P(y_uvi | x_uvi), it assumes that x_uvi is known. This is true only for a relatively small subset of triplets. In order to extend the model to unseen triplets, we need to estimate the unseen x̂_uvi from ratings data. Inspired by previous work on recommender systems, we adopt an approach based on matrix factorization. While related, our problem is different from traditional recommender systems in two ways. First, the object of interest is a triplet (u, v, i), instead of a pair (u, i). Second, the value to be estimated, x_uvi, is a rating difference (see Equation 3), instead of a rating.

We outline three matrix factorization approaches to solve this problem. The first, PMF, is an existing approach re-purposed for our problem. The second, PPMF, is a modification. The third, DPMF, is a newly proposed method.

5.1 Probabilistic Matrix Factorization (PMF)

One way to predict x̂_uvi is to first predict r̂_ui and r̂_vi and subsequently take their difference. As a representative of this approach, we employ Probabilistic Matrix Factorization or PMF [29]. The set of ratings can be represented as a matrix of size |U| × |I|, where each element corresponds to a rating. This matrix is incomplete, and the goal is to fill up the missing entries with predictions r̂_ui. The approximation uses two rank-K matrices, S of size K × |U| and Q of size K × |I|. Let S_u be a column vector in S for user u, and Q_i be a column vector in Q for item i. PMF places zero-mean spherical Gaussian prior distributions on S and Q (with standard deviations σ_S and σ_Q) to control the complexity of the parameters, i.e., S_u ~ N(0, σ_S²) and Q_i ~ N(0, σ_Q²). The plate diagram of PMF is shown in Figure 3(a). It shows how ratings are generated by the parameters S and Q. Each r_ui is assumed to be drawn from a Gaussian distribution centered at S_u^T Q_i with variance γ² (Equation 15).

r_ui ~ N( S_u^T Q_i, γ² )   (15)

Parameter estimation is by maximizing the log-posterior distribution over item and user vectors with hyper-parameters, which is equivalent to minimizing the sum-of-squared-errors function in Equation 16. I(u, i) is an indicator function of whether u has rated i. Equation 16 contains two components. The first summand is the fitting constraint, while the rest constitutes the regularization. The fitting constraint keeps the model parameters fit to the training data, whereas the regularizers avoid overfitting, making the model generalize better [7]. λ_S and λ_Q are the regularization parameters.

E = (1/2) Σ_{u∈U} Σ_{i∈I} I(u, i) ( r_ui − S_u^T Q_i )² + (λ_S/2) Σ_{u∈U} ||S_u||² + (λ_Q/2) Σ_{i∈I} ||Q_i||²   (16)

The estimation is done using gradient descent [29], with the following gradients.

∂E/∂S_u = Σ_i I(u, i) ( S_u^T Q_i − r_ui ) Q_i + λ_S S_u   (17)

∂E/∂Q_i = Σ_u I(u, i) ( S_u^T Q_i − r_ui ) S_u + λ_Q Q_i   (18)

Once the parameters are learned, we then predict each x̂_uvi as r̂_ui − r̂_vi = ( S_u − S_v )^T Q_i.

5.2 Pairwise PMF (PPMF)

One potential issue with the previous approach using PMF is the indirection of going through ratings, instead of predicting x̂_uvi directly. The second approach is to instead fit another matrix of size |U × U| × |I|. Each row corresponds to a pair of users. Each column relates to an item. Each element x_uvi is the rating difference r_ui − r_vi. To approximate this matrix, we associate each user pair (u, v) with a rank-K vector S_uv, and each item i with Q_i. To generate x̂_uvi, we draw it from a Normal distribution, as in Equation 19.

x_uvi ~ N( S_uv^T Q_i, γ² )   (19)

We call this approach Pairwise PMF or PPMF. The plate diagram is shown in Figure 3(b), which clearly illustrates the difference from PMF. In PPMF, the observations (shaded) are the x_uvi's, instead of ratings. The objective function of PPMF is specified in Equation 20.

E = (1/2) Σ_{(u,v)∈U×U} Σ_{i∈I} I(u, v, i) ( x_uvi − S_uv^T Q_i )² + (λ_S/2) Σ_{(u,v)∈U×U} ||S_uv||² + (λ_Q/2) Σ_{i∈I} ||Q_i||²   (20)

(a) In the event of agreement, i.e., y = 1:
x ~ N( 0, σ₀² )
(b) Else, in the event of disagreement, i.e., y = 0:
x ~ (1/2) N( −μ, σ₁² ) + (1/2) N( μ, σ₁² )

Based on this generative process, the distribution of x can be expressed as a mixture of three Gaussians with weights α, (1−α)/2, and (1−α)/2 respectively, as shown in Equation 7.

P(x|Θ) = α N(x; 0, σ₀²) + ((1−α)/2) N(x; −μ, σ₁²) + ((1−α)/2) N(x; μ, σ₁²)   (7)

Parameters. For the above generative process, the set of parameters can be encapsulated by Θ = { α, μ, σ₀, σ₁ }. The question arises whether there is a unique Θ for every triplet (u, v, i). Because Θ is a distributional parameter, it is not feasible to estimate it from a single observation of x. Another approach is to tie together the parameters of a group of triplets. In this paper, we propose to tie the parameters of the triplets corresponding to each pair of users. In other words, there is a specific Θ_uv for each pair of users u and v that applies to all items. As shown in the plate diagram in Figure 2, the parameters are within the plate of each pair of users. For clarity, we draw x_uvi separately to show that y_uvi only depends on Θ_uv; x_uvi is shaded, because it is observed.

4.2 Monotonicity Property

We would like to model P(y = 1 | x) so that it increases as x → 0, and decreases as x → +∞ or x → −∞. We refer to this as the monotonicity property of the conditional probability of agreement. This monotonicity property does not always hold for any or all parameter settings. There are errant parameter settings that may cause this property to be violated. As an example, in Figure 1(b), we show a case where P(y = 1 | x) (the green curve) initially decreases as x goes away from zero, but as x continues moving away, it starts to increase again. This is not intuitive, as it suggests that the probability of agreement is very high even as x → ∞.

To enforce the monotonicity property, we propose introducing a constraint on the parameters of the Gaussian mixtures. By expanding Equation 6 according to the generative process, we can express the p.d.f. of P(y = 1 | x) as in Equation 8. Here, N(x; μ, σ²) denotes the p.d.f. of the Normal distribution, i.e., N(x; μ, σ²) = ( 1/(√(2π) σ) ) exp( −(x − μ)²/(2σ²) ).

f(x) = P(y = 1 | x) = α N(x; 0, σ₀²) / ( α N(x; 0, σ₀²) + ((1−α)/2) N(x; −μ, σ₁²) + ((1−α)/2) N(x; μ, σ₁²) )   (8)

Because the p.d.f. f(x) is continuous and differentiable, one way to ensure that monotonicity holds is to constrain the gradient of f(x) to be negative for all x > 0, as shown in Equation 9. Note that due to the symmetric property of the Gaussian mixtures, it is sufficient to enforce this monotonicity for x > 0, as the other case x < 0 is met by symmetry.

∂f(x)/∂x < 0 for all x > 0   (9)

By taking the derivative of f(x) with respect to x, Equation 9 can be reduced to the inequality in Equation 10.

exp( 2xμ/σ₁² ) ( x/σ₀² − (x − μ)/σ₁² ) + ( x/σ₀² − (x + μ)/σ₁² ) > 0   (10)

This inequality still contains the variable x. We need to reduce it to an inequality involving only the parameters. We discover a simple constraint that meets that objective.

Proposition 1. The constraint σ₀ ≤ σ₁ ensures that Equation 10 always holds for any x > 0.

Proof. Let us first consider the first additive term on the LHS of Equation 10, i.e., exp( 2xμ/σ₁² ) ( x/σ₀² − (x − μ)/σ₁² ). Because μ, x, and σ₁ are all positive, we have 2xμ/σ₁² ≥ 0. In turn, we have exp( 2xμ/σ₁² ) ≥ 1. Because σ₀ ≤ σ₁, we also have x/σ₀² − (x − μ)/σ₁² ≥ 0. We can therefore take Step 1 in Equation 11. From Step 1, we can go to Step 2 by a simple addition of the terms. Finally, because x > 0 and σ₀ ≤ σ₁, we have 2x( 1/σ₀² − 1/σ₁² ) ≥ 0 in Step 3, which concludes the proof.

exp( 2xμ/σ₁² ) ( x/σ₀² − (x − μ)/σ₁² ) + ( x/σ₀² − (x + μ)/σ₁² )   (11)
≥ ( x/σ₀² − (x − μ)/σ₁² ) + ( x/σ₀² − (x + μ)/σ₁² )   (Step 1)
= 2x ( 1/σ₀² − 1/σ₁² )   (Step 2)
≥ 0   (Step 3)

We have shown that with the constraint σ₀ ≤ σ₁, Equation 9 holds, guaranteeing the monotonicity property for x > 0 (and simultaneously for x < 0). This constraint is also intuitive: when two users are agreeing, their rating difference is likely to be small and not vary as widely as when they are disagreeing.

4.3 Parameter Estimation

Parameter estimation deals with learning the parameters Θ that best "describe" the observed data X. Because every x is assumed to have been generated independently in the generative process, the likelihood can be expressed as the joint probability shown in Equation 12.

L(Θ) = P(X|Θ) = Π_{x∈X} P(x|Θ)   (12)

The strategy employed in this paper is to find the parameters Θ that maximize the likelihood of observing X. Due to the presence of constraints, the objective is to also find Θ that meets the constraints, as shown in Equation 13. The first constraint ensures the mixture weights of the Gaussians sum to 1, by setting the mixture weights to α and β = 1 − α respectively. The second constraint ensures the monotonicity of P(y = 1 | x) by setting σ₀ ≤ σ₁.

Θ* = argmax_Θ L(Θ)   subject to: α + β = 1 and σ₀ ≤ σ₁   (13)

To maximize the likelihood, we can equivalently maximize the log-likelihood. As it is a constrained optimization problem, we employ Lagrangian multipliers [4] to enforce the constraints. In Equation 14, we show the updated log-likelihood function. Both λ₁ and λ₂ are Lagrangian
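The constrained estimation and the subsequent inference of P(y = 1 | x) can be sketched as follows. This is a simplified instance of the procedure above: the disagreement mean μ is held fixed, and the σ₀ ≤ σ₁ constraint is imposed by clamping after each M-step rather than through the Lagrangian of Equation 14; the data and starting values are illustrative.

```python
import math

def npdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_cam_pair(xs, mu=2.0, iters=50):
    """EM for the three-component mixture of Equation 7 over the
    rating differences xs of one user pair."""
    alpha, var0, var1 = 0.5, 1.0, 1.0
    for _ in range(iters):
        # E-step: responsibilities of the agreement component (e0)
        # and the two disagreement modes (e1 at -mu, e2 at +mu).
        e0, e1, e2 = [], [], []
        for x in xs:
            a = alpha * npdf(x, 0.0, var0)
            d1 = (1.0 - alpha) / 2.0 * npdf(x, -mu, var1)
            d2 = (1.0 - alpha) / 2.0 * npdf(x, mu, var1)
            z = a + d1 + d2
            e0.append(a / z); e1.append(d1 / z); e2.append(d2 / z)
        # M-step: responsibility-weighted closed-form updates.
        alpha = sum(e0) / len(xs)
        var0 = sum(w * x * x for w, x in zip(e0, xs)) / max(sum(e0), 1e-12)
        wd = sum(e1) + sum(e2)
        var1 = sum(w1 * (x + mu) ** 2 + w2 * (x - mu) ** 2
                   for w1, w2, x in zip(e1, e2, xs)) / max(wd, 1e-12)
        var0 = min(var0, var1)  # monotonicity constraint: sigma0 <= sigma1
    return alpha, var0, var1

def agreement_posterior(x, alpha, mu, var0, var1):
    """P(y = 1 | x) by Bayes' rule (Equation 6 with Equation 7)."""
    agree = alpha * npdf(x, 0.0, var0)
    disagree = (1.0 - alpha) / 2.0 * (npdf(x, -mu, var1) + npdf(x, mu, var1))
    return agree / (agree + disagree)

# Illustrative rating differences for one user pair.
xs = [0.0, 0.3, -0.4, 0.2, 2.2, 1.7, -2.1, -1.8, 0.1]
alpha, var0, var1 = fit_cam_pair(xs)
print(agreement_posterior(0.0, alpha, 2.0, var0, var1))  # high: difference near 0
print(agreement_posterior(2.0, alpha, 2.0, var0, var1))  # low: difference near +mu
```

Because the clamp keeps σ₀ ≤ σ₁, the fitted posterior is guaranteed to decrease away from 0, as Proposition 1 requires.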
Figure 1: Distributions of P(x | y = 1) and P(x | y = 0)

learned in the first sub-problem, to estimate P(y_uvi | x_uvi). Our key insight is that the x̂_uvi's are not independent from one another. All triplets involving the same item i, or the same user pair (u, v), will share some dependency. Furthermore, the triplet should model the interaction of users and items. Our approach is to model the generation of x_uvi based on user- or item-specific parameters, so as to generate/predict unseen x̂_uvi's through matrix factorization in Section 5. The framework can accommodate different predictive methods. Indeed, we outline several potential methods, including a newly proposed method called DPMF.

Application. One application of the agreement probabilities is as a similarity value in neighborhood-based collaborative filtering (CF). User-based CF [11] exploits the similarities between users to predict unseen ratings. Adopting the same rating prediction framework, we can use the contextual agreement to weigh the contributions of neighbors. To predict an unseen rating r̂_ui, we use Equation 5, which is the weighted average of ratings on i by u's neighbors. A neighbor v can be any user who has rated i, weighted by w_uvi.

r̂_ui = Σ_v w_uvi r_vi / Σ_v w_uvi   (5)

In our case, we use w_uvi = P(y_uvi = 1 | x_uvi), which is specific to every item. In Section 6, we will compare this to the traditional case of shared preference, where the weight w_uvi is set to the similarity between u and v, which is then applied to all items. The most popular similarity functions are Pearson's coefficient [33] and Cosine similarity [5]. This comparison is fair, as both approaches are given exactly the same set of ratings to use, but differ only in the relative weights of the ratings. Note that in this application, our objective is not to propose a new rating prediction algorithm, but rather to illustrate the utility of contextual agreement, and to enable comparison to appropriate baselines.

4. CONTEXTUAL AGREEMENT MODEL

4.1 Generative Model

Given the observed x_uvi's, we want to estimate the probability distribution of contextual agreement P(y_uvi | x_uvi). When the context is clear, we simplify the notations for y_uvi and x_uvi to y and x respectively. Because y is latent, we estimate the conditional probability P(y | x) from the joint probability P(y, x). In a generative modeling framework, we decompose P(y, x) into P(y) P(x | y). P(y) corresponds to the

Figure 2: Plate Diagram for CAM

prior probability of agreement between u and v on i. P(x | y) is the likelihood that x has been generated from y. The prior of agreement P(y) is the base level of agreement between u and v before seeing the item. Given that there are two probable events, i.e., agreement (y = 1) and disagreement (y = 0), we model this as a Bernoulli process with a parameter α. In other words, the prior of agreement is P(y = 1) = α, and of disagreement is P(y = 0) = 1 − α.

In the event of agreement (y = 1), x will be generated according to a probability P(x | y = 1). Because x is real-valued, and we expect that its values in the event of agreement will cluster together, we model the generation of x as a Gaussian, with an underlying mean and variance. As mentioned in Section 3, the closer x_uvi is to 0, the more likely it is that u and v agree on i. Therefore, we make a simplifying step, and set the mean to 0. We learn the variance σ₀² from data. The blue curve in Figure 1(a) illustrates the probability density function (p.d.f.) of P(x | y = 1), which is a Normal distribution centered at 0 (in this example, σ₀ = 0.9).

In the event of disagreement (y = 0), x will be generated according to a probability P(x | y = 0). Since x ≫ 0 or x ≪ 0 indicates disagreement, the mean of this Gaussian should be away from 0. Due to the symmetric property x_uvi = −x_vui, we model this as a bimodal distribution, namely an equally-weighted mixture of two Gaussians with a positive mean μ and a negative mean −μ, and a variance of σ₁². The red curve in Figure 1(a) illustrates the bimodal p.d.f. of P(x | y = 0) (in this example, μ = 2 and σ₁ = 1).

P(y = 1 | x) can therefore be expressed in terms of these components, as shown in Equation 6. The green curve in Figure 1(a) illustrates the "decision function" or the p.d.f. of P(y = 1 | x), estimated from the respective prior P(y) and likelihood P(x | y). As expected, P(y = 1 | x) is highest when x = 0. As x moves away from 0, the probability of agreement decreases, which fits the modeling objective.

P(y = 1 | x) = P(y = 1) P(x | y = 1) / ( P(y = 1) P(x | y = 1) + P(y = 0) P(x | y = 0) )   (6)

Generative Process. We now describe the full generative process for a set of observed triplets X. For every triplet (u, v, i):
1. Draw an outcome for y: y ~ Bernoulli(α).
2. Draw an outcome for x:

DPMF, with a novel objective function that minimizes errors in rating differences, and describe its gradient descent learning algorithm. Fourth, in Section 6, we validate these models comprehensively on three real-life, publicly available rating datasets, showing how well the model parameters are learned, and how they improve upon shared preference models in a neighborhood-based rating prediction task.

2. RELATED WORK

In the following, we survey related work on modeling preferences, first focusing on individual users, then on similarities between users, and finally on the role of context.

Individual preference. Most works on modeling individual preference are found in model-based recommender systems [1]. The main step is to construct a preference model for each user, which is then used to derive predictions. Here, we review three popular modeling choices. The first is the aspect model [8, 9]. A user u's preference is modeled as a probability distribution over latent aspects. Each aspect is associated with a distribution over items to be adopted, i.e., P(i | z), or over ratings, i.e., P(r | z, i).

The second is the matrix factorization-based model [15]. User u's preference is modeled as a column vector S_u in a K-dimensional latent space. Each item i is also associated with a rank-K column vector Q_i. The rating prediction r̂_ui by u on i is given by S_u^T Q_i. There are different factorization methods [18, 39, 17, 25], which vary in their objective functions, including several probabilistic variants [29, 34].

The third is the content-based model [1, 31, 21]. User u's preference is modeled as a content vector whose dimensionality is the vocabulary size (e.g., a tf-idf vector), derived from the content (e.g., meta-data, text) of items that u likes.

Shared preference. Modeling the sharing of preferences is mostly found in neighborhood-based recommender systems [11]. One approach is based on similarity. For user-based collaborative filtering (CF) [12], the similarity w_uv is between a pair of users u and v. The higher w_uv is, the more u and v share their preferences. The most common similarity measures in the literature are Pearson's correlation coefficient [33], and vector space or Cosine similarity [5]. Given that r_u and r_v represent the vectors of ratings by u and by v on a set of co-rated items, Pearson is determined as in Equation 1 (where r̄_u and r̄_v are average ratings), and Cosine as in Equation 2. Correspondingly, for item-based CF [35, 19], the similarity is between a pair of items.

pearson(u, v) = Σ_i ( r_ui − r̄_u )( r_vi − r̄_v ) / ( sqrt( Σ_i ( r_ui − r̄_u )² ) sqrt( Σ_i ( r_vi − r̄_v )² ) )   (1)

cosine(u, v) = r_u · r_v / ( ||r_u|| ||r_v|| )   (2)

Another approach to model the sharing of preferences is to exploit existing structures. For example, in a social network, each relationship (e.g., friends or follower-followee) is seen as inducing sharing of preferences between the two users [24, 22]. Some exploit the taxonomy structure to induce sharing between items in the same category [36, 2, 13, 27, 14].

Context. Most of the work discussed above bases its approach on the dyad of the user-item pair. In some cases, additional information or "context" may be available, i.e., rather than pairs (u, i), we observe triplets (u, i, c) where c refers to some context. There are different approaches to dealing with triplets. One approach is to break a triplet into multiple binary relations, e.g., friend-user-item into user-friend and user-item, such as done in [23, 38, 37] for rating-cum-link prediction. [41, 20] suggest partitioning dyads into clusters based on context, and then learning a separate model for each cluster. Another approach is tensor factorization, such as done in [10] for cross-domain rating prediction. Yet another approach, such as ours, is to model triplets directly. Differently from [32, 30, 16], which target user-item-item triplets for personalized ranking of items (asymmetric), we target user-user-item triplets to model agreement.

3. OVERVIEW

Notations. The universal set of users is denoted as U, and we use u or v to refer to a user in U. In turn, we use i to refer to an item in the universal set of items I. The rating by u on i is denoted as r_ui. The set of all ratings observed in the data is denoted R. We seek to model user-user-item triplets (u, v, i). The universal set of triplets comprises U × U × I, excluding triplets involving the same users, e.g., (u, u, i). Each triplet (u, v, i) is associated with two quantities (modeled as random variables): x_uvi and y_uvi, which are essential to our probabilistic modeling.

The variable x_uvi is real-valued. It represents the indicator of agreement between u and v on i, some of which are observed in the data. The closer x_uvi is to 0, the more likely it is that u and v agree on i. If x_uvi ≫ 0 or x_uvi ≪ 0, then disagreement is more likely. x_uvi can be expressed as a function of ratings, i.e., x_uvi = f(r_ui, r_vi). While there are many possible definitions of f, in this paper, we simply use the rating difference between two users on the same item, as shown in Equation 3. This choice of function also implies the symmetry x_uvi = −x_vui.

x_uvi = r_ui − r_vi   (3)

The second variable y_uvi ∈ Y is binary. y_uvi = 1 represents the event of agreement between u and v on their preference for i. y_uvi = 0 is the event of disagreement. These events are latent, and never observed. They are to be estimated from the observed x_uvi's. The closer x_uvi is to 0, the more likely we expect y_uvi = 1. The further x_uvi is away from 0, the more likely we expect y_uvi = 0.

Problem Formulation. Given ratings data, and the above x_uvi definition, we seek to estimate the probability P(y_uvi | x_uvi) for all triplets. Not all x_uvi's can be observed. x_uvi is not observed if either r_ui or r_vi is unobserved. This gives rise to two sub-problems. The first is how to estimate P(y_uvi | x_uvi) given the observed x_uvi values. The second sub-problem is how to predict the unobserved x̂_uvi values.

For the first sub-problem, we propose the probabilistic CAM model in Section 4. Since y_uvi is latent, it is not possible to employ discriminative modeling. We therefore turn to generative modeling, by representing x_uvi as a random variable whose generative process is related to y_uvi. Our approach is thus to model the joint probability P(y_uvi, x_uvi). The conditional probability P(y_uvi | x_uvi) can afterwards be estimated from the joint probabilities as follows:

P(y_uvi | x_uvi) = P(y_uvi, x_uvi) / Σ_{y_uvi} P(y_uvi, x_uvi)   (4)

The second sub-problem is how to predict the unseen x̂_uvi's. We will then use the predicted x̂_uvi with the parameters

Movie                                        | Rating by talyseon | Rating by youngchinq
Paranormal Activity                          | 5 | 5
Payback                                      | 3 | 3
Coraline                                     | 5 | 5
Pan's Labyrinth                              | 5 | 5
Memento                                      | 5 | 4
Gran Torino                                  | 5 | 4
The Hurt Locker                              | 5 | 4
Jurassic Park III                            | 3 | 2
Twilight                                     | 3 | 1
Inception                                    | 5 | 3
Daredevil                                    | 3 | 1
I Am Legend                                  | 4 | 2
Rosemary's Baby                              | 5 | 2
The Day After Tomorrow                       | 4 | 1
300                                          | 4 | 1
Moulin Rouge                                 | 5 | 2
Seven Pounds                                 | 4 | 1
The Dark Knight                              | 5 | 1
The Last Samurai                             | 5 | 1
Star Wars Episode III: Revenge of the Sith   | 5 | 1

Table 1: Epinions users talyseon and youngchinq

need to pay attention to "context", arguing that while a pair of users may agree in their preferences in one "context", they may disagree in a different "context". There are many ways to define "context". For instance, the product category or the time of day could each be a specific context. However, these definitions assume the presence of additional information in the data. To retain the most common framework in the literature, which is to rely on rating data alone, in this paper, by "context", we refer to each specific item. In other words, we are interested in the contextual agreement of preferences between two users in the context of one item.

To illustrate this more clearly, we use a real-life example from Epinions.com, an online rating site for various products, e.g., movies (used in this example). In Table 1, we show the rating profiles of two users, talyseon and youngchinq, on twenty movies that both of them had rated. The ratings are from 1 (low) to 5 (high). The traditional approach of shared preference is to use these ratings to measure the overall similarity between the two users. Using Pearson's correlation, their similarity is 0.53. Using Cosine similarity, their similarity is 0.88. (See Section 2 for the definitions of these measures.) These similarities are considered high, as Pearson ranges from -1 to 1, and Cosine ranges from 0 to 1.

On one hand, the two users do share some preferences. The top few movies in the list are movies that both assign high ratings to, which tend to be dramas and thrillers. On the other hand, a single similarity value cannot reveal the full picture of their preference sharing. The last few movies in the list are those that talyseon likes but youngchinq dislikes. These tend to be fantasy types (e.g., Star Wars III, Dark Knight). Therefore, agreement on preference should be seen in the context of individual items. For instance, we say that talyseon and youngchinq agree on their preference in the context of the "Paranormal Activity" movie. However, they disagree in the context of "The Last Samurai" movie.

Problem. Given a set of users (e.g., u, v), a set of items (e.g., i), and some ratings by users on items (e.g., r_ui), we seek to model the contextual agreement between a pair of users u and v on a specific item i (collectively denoted as a triplet (u, v, i)). Instead of just another similarity measure, we model this contextual agreement as a probability measure, with a binary random variable y_uvi with two outcomes: agreement (y_uvi = 1) or disagreement (y_uvi = 0).

One key observation is that the observed rating values, e.g., r_ui and r_vi, provide signals of the agreement or disagreement between u and v on item i. To represent this more succinctly, we derive a quantity x_uvi, which is a function of r_ui and r_vi, i.e., x_uvi = f(r_ui, r_vi), through some function f (to be defined later). Our problem can thus be restated as estimating the probability of agreement P(y_uvi = 1 | x_uvi).

This gives rise to two sub-problems. The first is the probabilistic modeling of P(y_uvi = 1 | x_uvi). This is akin to probabilistic clustering, whereby we seek to decide the latent "class" y_uvi using the "feature" x_uvi. Therefore, we adopt generative modeling based on Gaussian mixtures, which have been applied to other unsupervised clustering problems. We call this model the Contextual Agreement Model or CAM.

The second sub-problem is that not all x_uvi's are observed, arising directly from not having observed all possible ratings. For "unseen" triplets, where either r_ui or r_vi is unobserved, we need to predict x̂_uvi (the hat indicates predicted, rather than observed). For this, we adopt the framework of matrix factorization. The key to our approach is the minimization of a novel objective function based on optimizing for agreement, rather than rating, prediction. We call this method Differential Probabilistic Matrix Factorization or DPMF.

Application. The probability of contextual agreement allows for a better estimation of the contextual similarity between a pair of users on a specific item. This will come in useful in several potential applications. First, as we will explore in Section 3, the agreement probability can be used in calibrating the similarities between neighbors in an item-specific manner for a neighborhood-based recommender system to derive a rating prediction. Second, it can support a more targeted social recommendation. When a user wants to recommend an item to her friends, instead of sharing with all friends, the contextual agreement probability can identify the subset of friends most in agreement on the item.

Third, the model may be useful in a study of the prevalence of agreement in different communities, product categories, etc.

Scope. While our work is related to recommender systems, our focus is on modeling preferences, and not on rating prediction. The reader may also surmise that a similar contextual agreement framework may apply to triplets involving a user and two items. This is indeed the case, but to maintain focus, we will discuss only triplets involving two users and an item. As input, we assume only ratings data, and no other information such as categories or ontologies [28]. We also assume that ratings are truthful and reflective of user preferences (and not artefacts of dishonesty or fraud [6]), which we believe is true for a vast majority of users.

Contributions. First, as far as we know, we are the first to propose modeling item-specific context in estimating the agreement between a pair of users on an item. Second, to realize this modeling, in Section 4, we develop a probabilistic generative model, called CAM, based on Gaussian mixtures. We enforce a monotonicity property that results in a specific parameter constraint, and describe how to learn the constrained parameters with Expectation Maximization. Third, to extend this model to unseen triplets, in Section 5, we outline how several matrix factorization methods can be applied. We also propose a new method, called
Modeling Contextual Agreement in Preferences

Loc Do
School of Information Systems
Singapore Management University
hldo.2012@phdis.smu.edu.sg

Hady W. Lauw
School of Information Systems
Singapore Management University
hadywlauw@smu.edu.sg

ABSTRACT

Personalization, or customizing the experience of each individual user, is seen as a useful way to navigate the huge variety of choices on the Web today. A key tenet of personalization is the capacity to model user preferences. The paradigm has shifted from that of individual preferences, whereby we look at a user's past activities alone, to that of shared preferences, whereby we model the similarities in preferences between pairs of users (e.g., friends, people with similar interests). However, shared preferences are still too granular, because they assume that a pair of users would share preferences across all items. We therefore postulate the need to pay attention to "context", which refers to the specific item on which the preferences between two users are to be estimated. In this paper, we propose a generative model for contextual agreement in preferences. For every triplet consisting of two users and an item, the model estimates both the prior probability of agreement between the two users, as well as the posterior probability of agreement with respect to the item at hand. The model parameters are estimated from ratings data. To extend the model to unseen ratings, we further propose several matrix factorization techniques focused on predicting agreement, rather than ratings. Experiments on real-life data show that our model yields context-specific similarity values that perform better on a prediction task than models relying on shared preferences.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Information filtering; H.2.8 [Database Applications]: Data Mining

Keywords
user preference; contextual agreement; generative model

1. INTRODUCTION

Users face a dizzying array of choices for almost any decision they make on the Web today, e.g., which movie to see,

Copyright is held by the International World Wide Web Conference Committee (IW3C2). IW3C2 reserves the right to provide a hyperlink to the author's site if the Material is used in electronic media.
April 7-11, 2014, Seoul, Korea.
ACM 978-1-4503-2744-2/14/04.
http://dx.doi.org/10.1145/2566486.2568006.

which book to purchase, which job to pursue next and when [40], which tag to use [26]. In exploring options, the limiting factor is often not affordability or availability, but rather the user's time and attention. Many Web platforms deal with this scarce resource through personalization, by focusing the user's attention on things most likely to be of interest.

In order to provide a personalized experience to each user, we first need to know the user's preferences. To some extent, this is provided by the user's own past activities, such as which Facebook posts she liked or disliked, or which products on Amazon she purchased, or which movies on Netflix she rated high or low. However, these preference signals are too sparse. They are expressed over a very limited number of items. For instance, most Netflix users would have assigned ratings to only tens of movies, as compared to the thousands of available movies. Hence, there is a need to extrapolate from these signals to build a more general preference model.

Most of the previous work in this area focuses on modeling individual preferences. The aim is to derive user-specific models from preference data, e.g., ratings, which will help predict future adoptions or ratings by the user. There are several well-known classes of methods, such as the aspect model [8, 9], matrix factorization [15], and the content-based model [1, 31, 21] (see Section 2 for a more expansive review). These methods are still actively being researched, and their use is prevalent in industrial recommender systems.

Beyond individual preferences, there is also a significant body of work on the complementary issue of shared preference between pairs of users. For instance, neighborhood-based collaborative filtering systems [12, 33, 5] predict a user's rating on an item as the weighted average of ratings by her neighbors. Here, neighbors are other users with high similarity in preferences, and the weights are proportional to the degrees of similarity between a pair of users. Shared preference is useful, because some individuals may not have established a sufficiently long record of activities (e.g., ratings) for a reasonably accurate individual model to be built. However, the limited record may already be sufficient to infer her similarity to another user with a longer record or more accurate model, which can then be "borrowed" to help in the predictions for the former user. Alternatively, there may be extra information, e.g., a social network, to induce preference sharing between friends [24, 22].

While shared preference is helpful, it also makes the implicit assumption that the similarity between two users applies equally to all items under consideration. More realistically, users have diverse preferences. Even a similar pair of users do not agree at all times. We therefore postul