Finding Progression Stages in Time-evolving Event Sequences
Jaewon Yang, Julian McAuley

Biomedical Informatics, Stanford University ({plependu, nigam}@stanford.edu)

ABSTRACT
Event sequences, such as patients' medical histories or users' sequences of product reviews, trace how individuals progress over time. Identifying common patterns …


Thus, in order to understand the temporal dynamics of event sequences it is crucial to solve two tasks. The first task requires us to identify different stages through which sequences progress and then segment individual sequences according to the discovered stages. The second task requires us to model different categories or classes of sequences, that is, to model classes of sequences that evolve according to different patterns of events. However, we have to consider both tasks simultaneously in order to better capture the diversity present in real sequence data.

Models for identifying progression stages are useful when solving a variety of tasks. Firstly, with richer models of temporal dynamics, we are better able to predict future events, such as the next product a person will consume, or the next symptom a patient will exhibit. Secondly, the stages and categories that we discover may themselves be meaningful. For instance, such models can help us to predict a patient's disease stage more accurately than is possible by examining their symptoms in isolation.

Modeling these types of temporal dynamics is a fundamentally difficult problem for a variety of reasons. Firstly, not every individual sequence evolves at the same rate. In addition, not every sequence will follow the same progression path or even progress through the same set of stages. Moreover, data may only be partially observed, e.g., our first observation of a patient's symptoms may occur only after they already exhibit advanced symptoms. Finally, as there may be many different types of progression, sequences have to be both individually classified as well as segmented.

A variety of methods exist to model event sequences, though the above complications present issues for many existing models. Approaches based on mining frequently occurring subsequences [2, 18, 26] are not appropriate for this task, as the level of noise and large state-spaces mean that any specific pattern is extremely unlikely to appear repeatedly. Hidden Markov Models capture how latent states change [5, 7, 19, 22], though they typically assume that all sequences share the same set of latent states and thus progress in the same way. Other models of time-varying data, which are state-of-the-art for tasks such as movie recommendation on Netflix [9, 15], generally assume that users evolve according to a "global clock," i.e., their progression is tied to the calendar date. In contrast, modeling patient records (for example) requires that each individual progresses according to his or her own personal timescale. It is because of these above difficulties that a new model is required.

Present work: Finding progression stages. In this paper, we consider a broad definition of categorical event sequences: at a basic level, we model ordered sets of events drawn from a finite vocabulary. This level of generality allows us to model data from a variety of sources, including product reviews, browsing logs, media streams, and medical records.

We develop scalable methods to discover natural patterns of progression in time-evolving categorical sequence data. We achieve this by grouping sequences into different classes, based on common temporal patterns of events, while also individually segmenting sequences into automatically discovered progression stages. Both of these tasks are performed as part of a single optimization procedure, so that we simultaneously learn the categories and identify the progression stages of individual sequences. Our model is highly flexible in terms of how individual sequences progress; for instance, not every sequence needs to progress through every stage, and each individual sequence may progress through stages at a different rate. This flexibility is essential to capture the noise and variability present in real data.

For example, Figure 1 illustrates the output of our algorithm when applied to human browsing traces on Wikipedia. We discover two sequence classes: people trying to reach pages about astronomy and people navigating towards U.S.-related pages.
Simultaneously, we discover four stages of browsing behavior through which users progress when navigating towards a particular web page.

More broadly, we apply our models to real-world event sequences from a variety of sources. We model people's consumption patterns on product review websites, such as RateBeer.com; we model people's browsing behavior using log data from Wikispeedia (a game that requires users to navigate Wikipedia pages [25]); and we apply our models to medical data of patients with chronic kidney disease.

In terms of experiments, we focus on three different aspects: predicting individual events, inferring progression stages, and grouping sequences into classes. For each aspect, we add qualitative analysis to show that our models help us to better understand and reason about the temporal dynamics of sequence data. First, we evaluate the ability of our method to predict future events in sequence data, e.g., the next product a person will consume, the next page that she will navigate to, or the next symptom that she will exhibit. We observe that our method achieves a 30% gain in accuracy compared to existing methods for future-event prediction.

Second, we evaluate the accuracy and usefulness of the stages themselves, which we do by comparing them to known progression stages of patients with chronic kidney disease. The evaluation shows that our method can correctly estimate at what stage a symptom will appear, with a rank correlation higher than 0.8. We also analyze the stages that we infer in other datasets and observe that the speed at which a sequence progresses between stages signals the longevity of the sequence. For example, reviewers who advance too quickly or too slowly tend to produce fewer reviews in total compared to those who advance moderately.

Third, in our qualitative analysis, we find that the classes of sequences that we discover from navigation trajectories on Wikipedia correspond to different navigation strategies. We also discover that new users on product review websites initially consume similar products, before gradually "fanning out" and developing their own tastes, and then finally converging upon common subsets of products favored by "experts" in the community.

The remainder of this paper is organized as follows. In Section 2 we propose our model. Section 3 describes the data. Section 4 shows our experiments on event prediction, Section 5 discusses experiments on progression stages, and Section 6 presents experiments on classes of sequences. We discuss related work and conclude in Sections 7 and 8.

[Figure 2: Problem definition: Given input event sequences (a), we aim to categorize sequences into classes based on how they evolve, and we divide each sequence into progression stages (b).]

2. PROPOSED METHOD

Our goal in this paper is to discover the stages of progression that are common to a given set of event sequences. To achieve this goal, we develop a method based on a conceptual probabilistic model, which specifies how observed event sequences are generated from latent stages. We formulate the problem, develop the generative model, and then show how the latent stages of this model can be efficiently learned.
2.1 Problem Definition

We begin by defining the problem of finding progression stages. We assume that we are given a set of event sequences of different lengths, and we aim to infer their progression stages and classes. Our problem formulation is based on the following intuitions.

First, each event sequence progresses through a set of latent, discrete-valued "stages" over time, and observed events are generated depending on the sequence's current stage. Second, not only does each sequence have a different length, but the duration of progression stages for each sequence can be substantially different; some stages progress slowly, while others do so more quickly. Moreover, sequences may not progress through all stages, i.e., they may start and finish at some intermediate stage.

The final intuition in our problem is that for any set of event sequences, there are multiple possible patterns of progression. To model different types of progression, we assume that there are latent classes or categories of event sequences, where sequences belonging to the same class progress through events in a similar way. We then aim to automatically "cluster" or "group" sequences to identify such common patterns of progression. In this way, we develop an unsupervised approach to clustering sequence progression data, by identifying sets of event sequences that follow common trajectories.

We formulate the problem of event sequence segmentation and classification as follows:

PROBLEM 1. Given a set of event sequences, the problem of sequence segmentation and classification is to: find the class that each sequence belongs to; and assign each event to a stage, with stage assignments being non-decreasing over time.

We illustrate the process in Fig. 2, and describe it in detail below.

2.2 Model Description

Here, we describe the generative process that we develop for modeling how observed event sequences are generated from a set of underlying latent progression stages.

We denote each event sequence by x_i, i = 1, ..., N, and the j-th event of x_i (j ordered by time) by a categorical-valued x_ij ∈ {1, ..., M}, where N is the number of sequences and M is the number of possible events. Each sequence may have a different length; we denote by L the sum of the lengths of all sequences (i.e., L = Σ_i |x_i|). We also assume that there are C classes of sequences and that each class is divided into K stages. We assume for simplicity that all classes have K stages, though our model can easily be adapted to accommodate a different number of stages per class.

Each sequence x_i belongs to a single class c_i ∈ {1, ..., C}. For each event x_ij ∈ x_i, we define s_ij ∈ {1, ..., K} to be the stage of the sequence x_i at time j. Stages s_ij are a non-decreasing function of time, i.e., a sequence never progresses "backward":

    ∀ i, j, k:  j ≤ k  ⇒  s_ij ≤ s_ik.    (1)

From a modeling perspective, this constraint means that we capture patterns of temporal evolution that relate to the sequences of events, but are not tied to the exact time, or overall trends in the dataset. Also note that we do not require that any sequence x_i should progress through all stages: some sequences may begin from intermediate stages, while some other sequences may end without reaching the last stage.

Given class c_i and stages s_ij for sequence x_i, we now specify how individual events x_ij are generated. We employ a very simple generative process where x_ij is independently drawn from a multinomial distribution with parameter θ(c_i, s_ij) ∈ R^M:

    x_ij ∼ Multinomial(θ(c_i, s_ij)).

Here, each θ(c_i, s_ij) represents a separate distribution of events for a given stage s_ij and class c_i. This way, we can ensure that sequences from the same class should have similar sets of events during the same stage. Last, we also assume that θ(c_i, s_ij) is drawn from a uniform Dirichlet distribution with a hyperparameter α:

    θ(c_i, s_ij) ∼ Dirichlet(α).

Note that our approach can be generalized for generating x_ij with more sophisticated models, e.g., we could model x_ij in each stage and each class using Latent Dirichlet Allocation (LDA) [4]. However, we found that our simple multinomial process works reliably in practice and allows us to fit the model very efficiently.
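To make the generative story above concrete, the following is a minimal sketch of a sampler for it. The function name, sequence-length range, and the choice to draw classes and non-decreasing stage paths uniformly at random are our own illustrative assumptions; the paper treats c_i and s_ij as latent variables to be inferred rather than quantities with a fixed prior. Stage indices are 0-based here, whereas the paper writes them 1, ..., K.

    import numpy as np

    def sample_sequences(N, M, C, K, alpha=1.0, min_len=20, max_len=60, seed=0):
        """Illustrative sampler for the (class, stage) multinomial model above.

        theta[c, k] is the per-(class, stage) event distribution over the M
        event types, drawn from a symmetric Dirichlet(alpha)."""
        rng = np.random.default_rng(seed)
        theta = rng.dirichlet(alpha * np.ones(M), size=(C, K))  # shape (C, K, M)
        sequences, classes, stages = [], [], []
        for _ in range(N):
            c = rng.integers(C)                        # latent class of the sequence
            length = rng.integers(min_len, max_len)
            # sorted stage draws give a non-decreasing path, satisfying Eq. (1)
            s = np.sort(rng.integers(0, K, size=length))
            x = np.array([rng.choice(M, p=theta[c, k]) for k in s])
            sequences.append(x); classes.append(c); stages.append(s)
        return sequences, classes, stages, theta

Sorting the stage draws is just one convenient way to obtain a path that respects the monotonicity constraint of Eq. (1).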
2.3 Inferring Progression Stages

We now explain how we can learn the stages of progression in event sequences based on our model. We are given a set of event sequences {x_i}. We also assume that we are given the number of classes C and the number of stages K. (We will explain later how to estimate values for C and K.) Our goal is to learn for each sequence x_i the class membership c_i of the sequence and the stage assignments s_ij for each event x_ij in the sequence. We achieve this by fitting the model, i.e., we find classes c_i, stages s_ij, and multinomial distributions Θ = {θ(p, q) | p = 1, ..., C; q = 1, ..., K} by maximizing the log likelihood:

    l(Θ; {c_i}, {s_ij}, {x_i}) = log P({x_i} | Θ, {c_i}, {s_ij}).

Because variables x_ij are conditionally independent of each other given {c_i}, {s_ij}, the log-likelihood becomes

    log P({x_i} | Θ, {c_i}, {s_ij}) = Σ_{i,j} log P(x_ij | θ(c_i, s_ij)).

Thus, we aim to solve the following optimization problem:

    argmax over Θ, {c_i}, and {s_ij} non-decreasing in j of  Σ_{i,j} log P(x_ij | θ(c_i, s_ij))    (2)

where the requirement that s_ij be non-decreasing in j is the monotonicity constraint in Eq. 1.

Optimizing Eq. 2 jointly for all sets of variables is highly challenging, since the problem is combinatorial and non-convex. We note that our formulation can be naturally cast in the framework of Expectation-Maximization (EM), where we compute soft assignments of the stages and the classes at one step, and then update Θ using these soft assignments. We note that we have experimented with the EM algorithm and found that EM converges prohibitively slowly. Thus we employ a coordinate ascent strategy, which is 1,000 times faster than EM in our experience, while yielding results of similar quality. Our coordinate ascent strategy is described below.

As illustrated in Figure 3, we iteratively update subsets of variables. First, we update Θ while keeping {c_i} and {s_ij} fixed (Fig. 3(a)). Second, we update {c_i} and {s_ij} with Θ fixed (Fig. 3(b)). We iterate these two steps until convergence, i.e., until the classes and the stages that we learn do not change between successive iterations.

Updating Θ. With stages {s_ij} and classes {c_i} fixed, we aim to find parameters Θ that maximize the log-likelihood l(Θ) = …

[Figure 5: Cross-validation likelihood (and standard deviation) versus the number of stages K in the medical history of the patients with chronic kidney disease, when we fix the number of classes C = 2. The likelihood indicates that K = 5 is optimal.]

… set, i.e., we measure the following cross-validation likelihood:

    Σ_{i,j ∈ I_t} log P(x_ij | θ(c_i, ŝ_ij))

where ŝ_ij is the stage of the training event closest to x_ij.

Algorithm initialization. Before executing our algorithm, we must choose initial values for stage and class assignments. To initialize {c_i}, we divide sequences x_i into C different classes uniformly at random. To initialize {s_ij}, we split each sequence x_i into K stages at uniform intervals, i.e., for each sequence x_i, we set s_ij = 1 for the first |x_i|/K events, s_ij = 2 for the next |x_i|/K events, and so on. Our method also includes a single hyperparameter α. We considered α ∈ {1, 0.1, 0.01, 0.001} and found that α = 1 gives reliable performance across every dataset that we tried.

We note that our fitting procedure can be easily parallelized. Updating Θ can be done in parallel for each class and stage, and updating stages and classes can be parallelized for each sequence. Using parallelization with 20 threads, our model could be fit on our largest dataset (RateBeer) of 2 million total events within two minutes.
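The paper's exact update rules for Θ, the classes, and the stages are only partially reproduced above, so the following is just one plausible realization of the alternating scheme, not the authors' algorithm verbatim: with hard assignments, the Θ update is a smoothed count (mirroring the closed form given for EM below), the class update tries every class per sequence, and the monotone stage assignment is found exactly with a small dynamic program. Function names and the 0-based stage indexing are ours.

    import numpy as np

    def update_theta(seqs, classes, stages, C, K, M, alpha=1.0):
        """Hard-assignment analogue of the closed-form multinomial update:
        alpha-smoothed event counts per (class, stage), normalized over M events."""
        counts = np.full((C, K, M), alpha)
        for x, c, s in zip(seqs, classes, stages):
            for event, stage in zip(x, s):
                counts[c, stage, event] += 1.0
        return counts / counts.sum(axis=2, keepdims=True)

    def best_monotone_stages(x, log_theta_c):
        """Dynamic program: maximize sum_j log theta[c, s_j, x_j] subject to
        s_1 <= s_2 <= ... <= s_n (Eq. 1). Returns (score, stage path)."""
        n, K = len(x), log_theta_c.shape[0]
        ll = log_theta_c[:, x].T                     # ll[j, k] = log P(x_j | stage k)
        best = np.full((n, K), -np.inf)
        back = np.zeros((n, K), dtype=int)
        best[0] = ll[0]
        for j in range(1, n):
            run = np.maximum.accumulate(best[j - 1])  # best score over stages <= k
            back[j] = np.array([int(np.argmax(best[j - 1][: k + 1])) for k in range(K)])
            best[j] = run + ll[j]
        path = [int(np.argmax(best[-1]))]
        for j in range(n - 1, 0, -1):
            path.append(back[j][path[-1]])
        return best[-1].max(), np.array(path[::-1])

    def update_assignments(seqs, theta):
        """For each sequence, pick the class and monotone stage path with the
        highest log-likelihood under the current parameters."""
        log_theta = np.log(theta)
        classes, stages = [], []
        for x in seqs:
            scored = [best_monotone_stages(x, log_theta[c]) for c in range(theta.shape[0])]
            c = int(np.argmax([score for score, _ in scored]))
            classes.append(c); stages.append(scored[c][1])
        return classes, stages

Iterating update_theta and update_assignments until the assignments stop changing mirrors the two alternating steps of Fig. 3.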
EM algorithm. Last, we briefly mention that we also experimented with an Expectation-Maximization (EM) procedure [20] to learn Θ. Because x_ij is generated from a multinomial distribution, the maximum likelihood estimate for Θ can be computed in closed form:

    θ(p, q)_r = ( α + Σ_{i,j} Q_ij(p, q) · 1{c_i = p ∧ s_ij = q ∧ x_ij = r} ) / ( Mα + Σ_{i,j} Q_ij(p, q) · 1{c_i = p ∧ s_ij = q} )

where Q_ij(p, q) is a posterior probability P(c_i = p, s_ij = q | x_i) that x_ij would belong to class p and stage q. This posterior probability P(c_i = p, s_ij = q | x_i) can be computed efficiently using the Forward-Backward algorithm [20].

We implemented the EM algorithm and compared it to our coordinate ascent method. The EM algorithm requires longer to converge, but it ultimately yields results similar to our coordinate ascent method. EM takes more than 1,000 times as long to execute. For example, it takes two days for EM to finish for the RateBeer dataset, whereas our method takes just two minutes. Thus, we focus on the coordinate ascent approach for the remainder of this paper.

3. DATASET DESCRIPTION

For our experiments, we consider five different time-evolving event sequences ranging from electronic medical records to online product reviews. We describe the datasets we consider and the definition of event sequences in each dataset. Table 1 provides the summary of our datasets.

Product reviews. First, we consider online product reviews from two large beer-reviewing communities (BeerAdvocate and RateBeer) [16]. These datasets contain all reviews from the inception of the sites (1998 and 2000, respectively) until 2011, containing 1,586,614 reviews from 33,387 users (BeerAdvocate), and 2,924,127 reviews from 29,265 users (RateBeer). We construct an event sequence for each user from the list of beers that they reviewed in chronological order. In this way, a sequence represents how users choose products (beers) as they develop their own taste and gain more experience. Since it is unlikely to be fruitful to model the progression of users who have rated only a few products, we discard users who have written fewer than 50 reviews. For a similar reason, we discard beers (which are individual events in our setting) that have been reviewed by fewer than 50 users. Overall, we consider 1,084,816 reviews from 4,432 users in BeerAdvocate, and 2,016,861 reviews from 4,584 users in RateBeer.

Textual memes. Our second dataset consists of quoted phrases in news articles and blog posts, provided to us by a system called NIFTY [23]. For each quoted phrase, NIFTY tracks which website posted an article quoting the phrase at what time. We take the quoted phrases from 2012, amounting to 2 million quoted phrases from 170,997 websites. The key idea in NIFTY is that a quoted phrase is a textual "meme", which represents the propagation of a very specific piece of information. We define a sequence to be a chronological list of the online media sources that mentioned a specific phrase, which represents how the meme spreads in online media space. In order to focus on memes that drew global attention and the role of important media sites, we only consider websites that have mentioned at least 0.5% of all phrases (10,000 phrases) and phrases that have been mentioned by at least 200 websites. This means that we consider 1,578,853 mentions for 4,866 phrases.

Medical records. Third, we consider electronic medical records of patients from Stanford Hospitals and Clinics, accessed via the Stanford Translational Research Integrated Database Environment repository [14]. The dataset spans 17 years with data on 1.8 million patients including 10.5 million clinical notes. We process the documents using methods described in [12] to create tuples of (medical term, patient, time offset). We consider patients who have been diagnosed with chronic kidney disease at least once. From medical terms corresponding to other disorders or symptoms mentioned in the records of these patients, we construct an event sequence of symptoms for each individual with a diagnosis of CKD. We focus on patients who have at least 50 medical terms in their history. Overall, we consider 393,334 terms from 1,835 patients.

Web navigation traces. Last, we consider Web navigation traces from the online game Wikispeedia [25], where players are given two random Wikipedia pages and must navigate from one to the other by clicking as few hyperlinks as possible. We regard each trace of a game (the Wikipedia pages that the player visited) as an individual sequence. In this way, sequences represent how a Web surfer navigates to reach a particular destination. We focus on game traces that have at least four pages and on pages that appear in at least 50 game traces, which results in a total of 164,308 page visits from 29,012 games.
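The sequence construction used in each dataset follows the same recipe: order each entity's events chronologically, then drop rare entities and rare events. A minimal sketch of that preprocessing, assuming the raw data is available as (entity, event, timestamp) tuples; the field names, thresholds, and filtering order shown are illustrative assumptions rather than the authors' exact pipeline:

    from collections import Counter, defaultdict

    def build_sequences(records, min_seq_events=50, min_event_count=50):
        """records: iterable of (entity_id, event_id, timestamp) tuples.
        Returns {entity_id: [event_id, ...]} in chronological order, keeping
        only entities with enough events and events that occur in enough
        sequences, in the spirit of the filtering described above."""
        per_entity = defaultdict(list)
        for entity, event, ts in records:
            per_entity[entity].append((ts, event))
        # count in how many distinct sequences each event appears
        support = Counter(e for evs in per_entity.values() for e in {ev for _, ev in evs})
        sequences = {}
        for entity, evs in per_entity.items():
            evs.sort()  # chronological order by timestamp
            seq = [ev for _, ev in evs if support[ev] >= min_event_count]
            if len(seq) >= min_seq_events:
                sequences[entity] = seq
        return sequences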
Note that the progression stages in these datasets have different implications. In beer reviews, progression represents users gaining experience and developing their own taste [16]; in NIFTY, progression represents how information grows popular and then fades; and in patient data, it represents the development of diseases. Finally, in Wikispeedia, progression represents how the players deploy different navigation strategies during different stages of browsing.

    Dataset        Seq.     Event     N        L      E(|x_i|)   M
    BeerAdvocate   User     Product   4,432    1.1m   244.8      5,161
    RateBeer       User     Product   4,584    2.0m   440.0      9,459
    NIFTY          Phrase   Media     4,866    1.6m   349.4      605
    Patients       Patient  Symptom   1,835    0.4m   214.3      124
    Wikispeedia    Player   Webpage   29,012   0.2m   5.7        1,048

Table 1: The definition of sequences and events in the datasets and the data statistics. N: number of sequences, L: total number of events, E(|x_i|): average length (the number of events) of each sequence, M: number of distinct events. m denotes a million.

4. EXPERIMENTS ON EVENTS

Given sequences of events, our model can infer their underlying classes and the stages of progression. An example of our results is shown in Fig. 8, where we show the most frequent events at each progression stage for two classes. Here, our model provides a summary of the progression of two classes of beer reviewers during three stages on BeerAdvocate.

In the next three sections, we perform experiments with our model. The three sections focus on three different aspects: individual events in the sequences, the progression stages that we learn, and the classes that we learn. In each of our experiments, we provide quantitative evaluation first and then analyze the results qualitatively. We will show that our model allows us to discover patterns and classes of temporal progression of online reviewers, information diffusion, Web navigation, and diseases.

The first experiment focuses on predicting missing events using our method. We formulate the task of predicting missing events in event sequences and evaluate the performance of our model quantitatively.

Experimental setup. To measure the accuracy of predicting missing events, we split each sequence into a training and a test set. We then fit the model using events from the training set and measure how accurately the method can predict the events that appear in the test set. Note that this can be seen as a multi-label prediction problem where M distinct labels exist. We focus on the accuracy of predictions when we consider the k most probable outcomes O_ij (|O_ij| = k) for each missing event x_ij in the test set T:

    (1 / |T|) Σ_{i,j ∈ T} 1(x_ij ∈ O_ij)

where 1 is an indicator function. We employ two schemes to build our test sets. The first scheme is to consider the final (most recent) few events; this scheme evaluates the ability to predict future events in the sequences given events up to the present. The second scheme is to select a random sample of events from each sequence; this setting corresponds to the task of recovering missing events that may have happened in the past.

Predicting events with our model. We describe how we can predict the events in the test set, i.e., how to recommend the top k items using our model. The idea is to infer the stage and class for each test event and then find the k most likely items according to the corresponding multinomial distribution. Inferring the class is done using other training events in the same sequence. However, we cannot infer stages for events in the test set. Thus, for each test event, we assign it the stage of its chronologically nearest training event.
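A sketch of this prediction rule and of the top-k accuracy defined above, assuming the fitted theta, per-sequence class assignments, and training stage indices produced by the earlier sketches; the helper names and the stage_lookup callable are ours:

    import numpy as np

    def top_k_events(theta, c, stage, k=10):
        """The k most probable events under the multinomial for (class c, stage)."""
        return np.argsort(-theta[c, stage])[:k]

    def nearest_training_stage(test_pos, train_positions, train_stages):
        """Assign a held-out event the stage of its chronologically nearest
        training event, as described above."""
        idx = int(np.argmin(np.abs(np.asarray(train_positions) - test_pos)))
        return train_stages[idx]

    def hit_at_k(test_events, theta, classes, stage_lookup, k=10):
        """Fraction of held-out events appearing among the top-k predictions.
        test_events: list of (sequence index i, position j, true event id)."""
        hits = 0
        for i, j, true_event in test_events:
            stage = stage_lookup(i, j)          # e.g. via nearest_training_stage
            if true_event in top_k_events(theta, classes[i], stage, k):
                hits += 1
        return hits / len(test_events)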
Baselines. We consider three baseline methods for multi-class prediction where we aim to predict the events in the test set given the training events.

First, we consider multi-class logistic regression, which aims to predict the label of missing events using the observed events as features. Whereas the training events in this problem contain just lists of events, logistic regression is a supervised method that requires training examples that have a "response variable" (label) and features. Thus, we divide the training events into "feature events" and "response events" so that logistic regression learns to predict the response events given the feature events. Among training events, we treat events adjacent to the test events as response events, and we treat the other training events as feature events. We then construct a feature vector f_i ∈ R^M using the feature events, and we learn M logistic regression classifiers, one for each of the M distinct events:

    f_im = |{ x_ij : x_ij = m, x_ij ∈ feature events }|.

After learning the logistic regression classifiers, we aim to predict the test events. In this case, we treat all training events as feature events. We then pick the top k labels based on the probability returned by logistic regression.

[Figure 6: As a baseline, we train a logistic classifier using training events. We split the training events into "feature events" and "response events" so that logistic regression learns to predict the response events given the feature events.]
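A compact version of this baseline, assuming scikit-learn is available; the function names are ours, and the split into feature and response events is supplied by the caller as described above:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def count_features(event_lists, M):
        """f_im = number of times event m occurs among sequence i's feature events."""
        F = np.zeros((len(event_lists), M))
        for i, events in enumerate(event_lists):
            for m in events:
                F[i, m] += 1
        return F

    def fit_event_classifiers(feature_events, response_events, M):
        """One-vs-rest logistic regression: one binary classifier per event type,
        predicting whether event m appears among a sequence's response events."""
        X = count_features(feature_events, M)
        classifiers = {}
        for m in range(M):
            y = np.array([int(m in set(resp)) for resp in response_events])
            if y.min() == y.max():          # skip events never (or always) observed
                continue
            classifiers[m] = LogisticRegression(max_iter=1000).fit(X, y)
        return classifiers

    def predict_top_k(classifiers, all_training_events, M, k=10):
        """At test time all training events are features; rank events by probability."""
        x = count_features([all_training_events], M)
        scores = {m: clf.predict_proba(x)[0, 1] for m, clf in classifiers.items()}
        return sorted(scores, key=scores.get, reverse=True)[:k]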
Our second baseline is a Hidden Markov Model (HMM), as an exemplar of models based on Dynamic Bayesian Networks (DBNs). After training the model, we estimate the latent state of the test event by choosing the state of the chronologically closest training event. Then, we generate O_ij using the k most probable events from that estimated state. Comparisons against this method show how much benefit is obtained by modeling classes of sequences and by assuming that stages increase monotonically; without these additions, our model would be equivalent to an HMM.

Our third baseline method is a simpler version of our model where users progress at the same rate through the same set of stages. We call this baseline Model-U. We assume that all sequences belong to the same class, and for each sequence x_i, we set s_ij = 1 for the first |x_i|/K events, s_ij = 2 for the next |x_i|/K events, and so on. Using these stage assignments, we fit the parameter for the multinomial distribution for each stage. Comparison against this model captures the effect of learning progression stages individually for each event sequence.

Experimental results. Table 2 shows the accuracy when predicting the final events in a sequence, where we output the k = 10 most probable events for each test event (i.e., |O_ij| = 10). Among our three baselines, logistic regression consistently outperforms the other baselines (Model-U and HMM) in all datasets. Thus, to conserve space, we show the performance of our method and logistic regression. The left two columns show absolute accuracies, while the third and fourth columns show the relative improvement when we divide by the accuracy of randomly guessing one of M values (1/M). The intuition behind relative improvement is that the overall difficulty of prediction depends on M, the number of possible event values. Even though the methods achieve low absolute accuracies in the beer datasets, our results here are significant as our method performs 100 times better than random guessing. Our method outperforms logistic regression on four datasets and achieves a relative gain of 130.7 on average, which is 32.4% higher than logistic regression, whose average relative performance is 102.6. Note that unlike logistic regression, our method is not specifically designed for classification or prediction; nevertheless, the progression pattern learned by our model can provide a way to predict the future events (or missing past events) of the sequences reliably.

    Method         Absolute Acc.       Relative to random guessing   Gain over
                   Ours     Baseline   Ours      Baseline            baseline (%)
    BeerAdvocate   0.022    0.013      113.5     67.1                69.2
    RateBeer       0.013    0.008      124.1     76.4                62.5
    Nifty          0.338    0.297      204.5     179.7               13.8
    Patients       0.563    0.608      69.8      75.4                -7.4
    Wikispeedia    0.135    0.109      141.5     114.2               23.9

Table 2: Performance when predicting the most recent events of event sequences. Methods output the 10 most probable events. We compare to the performance of the best baseline (logistic regression).

The only dataset where our model does not outperform all baselines is the Patients dataset. A possible explanation is that some common symptoms, such as "effusion", appear very frequently across all patients, and learning progression patterns would be less helpful for predicting such frequent symptoms.

Table 3 shows the performance when predicting a random sample of events. Again, our model outperforms the best baseline (logistic regression) in four datasets. For patient data, our method is on par with logistic regression. On average, our model yields a relative improvement of 179.0, which is 58% higher than what logistic regression achieves (128.7). Since our accuracy measure ignores how accurately we rank the top k predictions, we tried other values of k (k ∈ {1, 5, 20}) for evaluation, yet we found qualitatively similar results in comparing our method against baselines.

    Method         Absolute Acc.       Relative to random guessing   Gain over
                   Ours     Baseline   Ours      Baseline            baseline (%)
    BeerAdvocate   0.030    0.014      154.8     72.3                114.3
    RateBeer       0.022    0.009      210.1     85.9                144.4
    Nifty          0.293    0.224      177.3     135.5               30.8
    Patients       0.672    0.676      83.3      83.8                -0.6
    Wikispeedia    0.257    0.254      269.3     266.2               1.2

Table 3: Performance of predicting a random set of missing events from event sequences. Methods output the 10 most probable events. We compare to the performance of the best baseline (logistic regression).

5. EXPERIMENTS ON STAGES

Our second set of experiments focuses on the stages of events that we infer from given event sequences.

5.1 Accuracy of Learning Stages

We begin by evaluating how accurately we can infer progression stages. For a set of events, we extract "ground-truth" labels for stages of particular events. We then measure how well the stages that we infer correspond to these ground-truth stages. Gathering information for such ground-truth stages is, in general, a challenging task; however, such information is available for medical events related to chronic kidney disease, which we study in this paper.

Experimental setup. Chronic kidney disease (CKD) has five stages, which are explicitly defined by the level of glomerular filtration. Our data contain explicit events about the CKD stage of patients, such as "chronic kidney disease stage k" (k ∈ {1, ..., 5}). Using such explicit events, we can estimate the ground-truth stage of other medical events (symptoms) by looking at the co-occurrence between the event and the "CKD stage k" events. For each symptom e in our dataset, we measure the posterior probability P_e(k) that the event "CKD stage k" happens with the event at the same visit.
Then, we estimate the ground-truth stage s_e of event e by the posterior average value of k (i.e., s_e = Σ_k k · P_e(k)). After estimating s_e, two researchers with a medical degree validated the values by manual inspection. The first two columns in Table 5 show a sample of four symptoms and their ground-truth stages.

Given the training data, our model assigns each event to a specific stage. Thus, we compute the average value of stage assignments ŝ_e for event e (i.e., ŝ_e = E[s_ij | x_ij = e]). We then compute the correspondence between ground-truth stage s_e and the learned stage ŝ_e using two standard metrics: Kendall's τ and the Pearson correlation coefficient.

Baselines. As a baseline, we consider Model-U, introduced in Section 4, where we segment each sequence into K stages with the same duration. To the best of our knowledge, we are not aware of existing methods that discover such integer-valued progression stages, which allow us to estimate at what point of progression a specific event would occur. Existing methods for learning latent states [8, 19] estimate categorical-valued stages of events where no order between the stages exists.

Experimental results. Table 4 shows the performance of the methods.

    Score                 Ours     Baseline
    Kendall's τ           0.810    0.659
    Pearson correlation   0.447    -0.007

Table 4: Performance on learning the progression stages of chronic kidney disease.

In both metrics, we show that our model outperforms the baseline Model-U, which shows that learning individual progression stages boosts the accuracy of inferring stages of events. Our method achieves a Kendall's τ (i.e., rank correlation) of 0.8, which means that the stages learned by our model preserve the correct order for more than 80% of the symptom pairs. Given that Model-U achieves τ = 0.659, we achieve a relative improvement of 23%. In terms of Pearson correlation, the improvement over the baseline is even larger, as the stages learned by the baseline are negatively correlated with the true stages.

We further investigate the results of our model and Model-U. Table 5 gives a few examples of symptoms, their ground-truth stages, and the estimates by our model and the baseline. Note that our model's estimates match ground-truth stages much better than the baseline's. For example, for secondary pulmonary hypertension, which, in practice, tends to happen at an early stage (stage 2), our model estimates a stage of 2.65 on average, whereas Model-U estimates a higher stage of 3.45. For hyperphosphatemia and acidosis, which happen at a later stage (stage 4), our model estimates the correct stage very closely (3.99 and 3.97 respectively), while the baseline estimates different stages, namely 3.71 and 3.21 (respectively).

Given the poor performance of the baseline, we note that sequences can have a very different number of events at each stage, because diseases progress at different rates and, within a disease, individual patients progress at different rates. Our results show that our model can successfully learn the natural history of chronic kidney disease by correcting for such factors.
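A sketch of this evaluation, assuming per-symptom co-occurrence counts with the explicit "CKD stage k" events and the model's per-event stage assignments are already available; SciPy supplies Kendall's τ and the Pearson correlation, and the function names are ours:

    import numpy as np
    from scipy.stats import kendalltau, pearsonr

    def ground_truth_stage(cooccurrence_counts):
        """s_e = sum_k k * P_e(k), where P_e(k) is the posterior probability that
        "CKD stage k" occurs at the same visit as symptom e.
        cooccurrence_counts[e] is a length-5 array of co-occurrence counts."""
        return {e: float(np.dot(np.arange(1, 6), c / c.sum()))
                for e, c in cooccurrence_counts.items()}

    def learned_stage(stage_assignments):
        """s_hat_e = E[s_ij | x_ij = e]: average inferred stage of symptom e.
        stage_assignments[e] lists the stages assigned to occurrences of e."""
        return {e: float(np.mean(s)) for e, s in stage_assignments.items()}

    def stage_agreement(s_true, s_learned):
        """Rank and linear correlation between ground-truth and learned stages."""
        events = sorted(set(s_true) & set(s_learned))
        a = [s_true[e] for e in events]
        b = [s_learned[e] for e in events]
        return kendalltau(a, b)[0], pearsonr(a, b)[0]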
5.2 Relation between Stage and Sequence Length

We aim to gain some insight on how sequences evolve by analyzing the stages we learned qualitatively. In particular, we examine the relation between how quickly a sequence progresses and the length of the sequence. In other words, we ask the following question: …

… of sequences progress. We now investigate the classes of progression that we learn in more detail.

Stage-wise similarity between classes. We focus on the similarity between classes as stages progress. That is, we ask the following question: as sequences evolve, do the classes converge and have more homogeneous events? Or, do they diversify? To measure this, we conduct the following experiment. For each stage s = 1, ..., K, we measure the similarity between two classes c1, c2 by using the symmetrized cross entropy H_s(c1, c2) [6] for events belonging to stage s:

    H_s(c1, c2) = H′_s(c1, c2) + H′_s(c2, c1)

where H′_s(c1, c2) is the asymmetric cross entropy:

    H′_s(c1, c2) = E_{x_ij | c_i = c2, s_ij = s} [ −log P(x_ij | θ(c1, s)) ].

The cross entropy H′_s(c1, c2) quantifies the uncertainty if we describe the events at stage s in class c2 using the multinomial distribution for class c1 at the same stage s. The smaller it is, the more similar the two classes are to each other at stage s.

Fig. 9 shows the average cross entropy between classes at each stage. Fig. 9(a) shows that the entropy forms a bell-shaped curve whose maximum is at stage 3. Product reviewers begin with similar products, and then diverge from each other during stages 2, 3, and 4, where users develop their own taste. Finally, they arrive at similar sets of products that are favored by experts. Fig. 8 shows two classes that we learn in BeerAdvocate that follow this pattern. Fig. 8 shows the top seven most frequent products that we learn at stages 1, 3, and 5, where the classes have some overlap at stage 1, diverge at stage 3, and finally converge at stage 5.

On Wikispeedia, the cross entropy yields a minimum at stage 2 and then increases. High entropy at stage 1 is natural, as games begin from random starting points. The minimum cross entropy at stage 2 corresponds very well to the observation from existing literature [25] that players tend to navigate to a few "hubs" in their first move (i.e., at their second page), before moving to more specific pages depending on their destination. Since players converge to hubs as their second page, stage 2 exhibits the minimum cross entropy. Then, game trajectories diversify depending on the topics of the destination pages. The cross entropy pattern on Wikispeedia is clearly shown by the two classes in Fig. 1 (Sec. 1), which shows the five most frequent pages at each stage for two classes. Frequent pages at stage 1 are not similar to each other, as the games begin from a random page. At the second stage, players tend to move to "hubs", such as "North America" and "Europe." Then, red players move to "astronomy" pages, while blue players move toward "American" pages.

In the chronic kidney disease (CKD) patients data, the cross entropy tends to decrease as the stage increases. This suggests that patients show diverse symptoms other than CKD during its initial stages. However, patients tend to share similar CKD-specific symptoms as the disease develops. In NIFTY, the classes stay in parallel without converging or diverging.

Classes in online media. So far, we showed the classes that we learn on Wikispeedia (Fig. 1) and on product reviews (Fig. 8). We now investigate the classes of progression that we learn from the phrases quoted by online media in NIFTY. We examined the sequences (phrases) belonging to the same class and observed that the classes that we learn correspond to different topics, such as Politics, International, or Entertainment. Table 7 shows the top five most popular (frequently quoted) phrases in two of the classes that we learned in NIFTY. We can observe a clear distinction between phrases about entertainment (top) and political phrases (bottom).

[Figure 9: Average cross entropy between the classes at each stage, for (a) beer reviews, (b) NIFTY, (c) Patients, and (d) Wikispeedia. The cross entropy shows stage-wise dissimilarity between the classes.]
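The stage-wise dissimilarity plotted in Figure 9 can be computed directly from the fitted multinomials. A sketch, assuming the expectation over x_ij is taken as an average over the observed events of each class at stage s (function names and data layout are ours):

    import numpy as np

    def asymmetric_cross_entropy(events_c2_s, theta, c1, s):
        """H'_s(c1, c2): average of -log P(x | theta(c1, s)) over the observed
        events of class c2 at stage s (events_c2_s is a list of event ids)."""
        return float(np.mean(-np.log(theta[c1, s][np.asarray(events_c2_s)])))

    def symmetrized_cross_entropy(events_by_class_stage, theta, c1, c2, s):
        """H_s(c1, c2) = H'_s(c1, c2) + H'_s(c2, c1).
        events_by_class_stage[(c, s)] holds the events observed in class c at stage s."""
        return (asymmetric_cross_entropy(events_by_class_stage[(c2, s)], theta, c1, s)
                + asymmetric_cross_entropy(events_by_class_stage[(c1, s)], theta, c2, s))

    def stagewise_dissimilarity(events_by_class_stage, theta, c1, c2, K):
        """The per-stage curve of Figure 9 for one pair of classes."""
        return [symmetrized_cross_entropy(events_by_class_stage, theta, c1, c2, s)
                for s in range(K)]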
    Class 1
    so devastating. we will always love you whitney, r.i.p
    joker scene in dark knight rises
    the daily show with jon stewart
    the girl with the dragon tattoo
    snow white and the huntsman

    Class 2
    this is one small step for a man, one giant leap for mankind
    they brought us whole binders full of women
    the ecb is ready to do whatever it takes to preserve the euro
    unchain wall street! they're gonna put y'all back in chains
    evidence of calculation and deliberation

Table 7: Top five most popular phrases in two classes that we learn on the NIFTY dataset. The top class corresponds to phrases about Entertainment, and the bottom one contains political phrases.

Our discovery in Table 7 suggests that political topics and cultural topics are mentioned by different media sites in a different order. We further examine this by finding the top five most frequent events (media sites) at each stage. Table 8 shows the results for stages 1, 3, and 5, and also shows that phrases about entertainment are first mentioned by independent media, then by broadcasting stations, and then finally by newspapers. On the other hand, political phrases are first quoted by newspapers, then by broadcasting stations, and lastly by forums.

Classes of patients with chronic kidney disease. We finally examine the classes that we learn from patients with chronic kidney disease (CKD). We identified two classes in this dataset, with the primary difference being the rate of occurrence of albuminuria. In one class that we learn, albuminuria occurs extremely rarely, with probability 0.01% in any stage, while in the other class albuminuria happens much more often (0.05%, which is five times higher). Our findings correspond well with recent findings [10], which note that about 30% of CKD patients do not suffer from albuminuria, contrary to the common belief that albuminuria is the hallmark of screening for and early identification of CKD. In our analysis, the fraction of patients without albuminuria is 663 patients out of 1,835 (36%), which is similar to that reported in [10]. The natural history of CKD progression without albuminuria is relatively unknown, and is of active interest in nephrology because it comprises an injury pattern without classic glomerulosclerosis and affects an estimated 0.3 million people [10]. We intend to analyze this patient group further in order to elucidate alternative markers that indicate progression via a non-albuminuria path.

[Figure 8: Top 7 most frequent products in two classes at progression stages 1, 3, and 5 from product reviews on BeerAdvocate.]
    Initial stage         Middle (third) stage     Final stage
    Class 1:
    promiflash.de         news.yahoo.com           startribune.com
    thewrap.com           nbc.com                  examiner.com
    huffingtonpost.com    abc.com                  latimes.com
    examiner.com          entertainment.msn.com    news.yahoo.com
    wonderwall.msn.com    fox.com                  entertainment.msn.com
    Class 2:
    news.yahoo.com        abc.com                  townhall.com
    bing.com              nbc.com                  freerepublic.com
    reuters.com           cbs.com                  conservapedia.com
    guardian.co.uk        kwes.com                 salon.com
    washingtonpost.com    fox.com                  breitbart.com

Table 8: Top five most frequent sites at the initial, middle, and the final stages in the two classes of NIFTY quoted phrases in Table 7. Entertainment phrases (Class 1) tend to get mentioned by independent media sites during the initial stage, by TV stations during stage 3, before being mentioned by newspapers. On the other hand, political phrases are mentioned by newspapers during the first stage, then by TV during the middle stage, and finally by forums during the final stage.

7. RELATED WORK

Analyzing the progression of event sequences has been attempted in several different settings. One of the most notable approaches is that of episode mining [2, 11, 17, 18, 26], where one aims to find subsequences of events (episodes) that many sequences have in common. Because simply counting occurrences of subsequences may favor the most redundant ones, the task requires pruning techniques [2, 18], measures of subsequence importance [13, 26], or probabilistic modeling [11]. However, there are two drawbacks in these approaches [8, 24]. First, frequent subsequences focus on a very limited part of sequence data, as they do not tell us which events tend to happen after or before the chosen subsequences. Second, counting subsequences is susceptible to observation noise, which may result in partially observed or slightly permuted sequences. Rather than relying on counting, we apply a statistical approach to model whole event sequences, which allows us to summarize the global picture of sequences [8] while being robust to random permutations in the data.

The statistical model used in our approach is related to Hidden Markov Models (HMMs) [5, 7, 19, 22], which assume that observed sequential data arises due to a sequence of underlying latent states. HMMs have proven to be effective in a variety of applications, including time series clustering [22], event prediction [11], and speech recognition [19]. Whereas HMMs assume that any transition between latent states is possible, we enforce a specific structure on the transitions in which states are constrained to advance sequentially. We also introduce "classes" of stages so that sequences from different classes may evolve differently. We note that enforcing such a structure in state transitions is key to successfully capturing the progression of event sequences, and that HMMs without such structural constraints fail to model the types of data we consider in our evaluation (Section 4).

Further related work includes models of temporally varying matrices [9, 15]. For example, [9] considered time-varying bias terms to improve the accuracy of predicting movie ratings. In addition, [15] developed a multi-level tensor factorization approach to capture periodic trends in users' Web-click behavior. However, these methods do not focus on the individual development of sequences (i.e., users), that is, how individuals evolve as they become more mature and gain more experience. In this work, we aim to consider such temporal aspects individually for each sequence.

8. CONCLUSION

In this paper, we developed a model to learn patterns of progression in time-evolving event sequences by grouping them based on how they evolve and by segmenting them into progression stages. Our method can process sequences with millions of events within a matter of minutes. Experiments show that our method can reliably predict the future events of sequences, accurately segment sequences into progression stages, and group sequences with similar properties into the same class.
The progression stages and classes that we learn in real-world sequential data provide new insights on how product reviewers develop their own tastes when choosing products, how users navigate web pages, and how various topics are covered by different sources of online media.

There are also several avenues for future work. For example, it would be interesting to consider more sophisticated generative models of events [16]. On a similar note, allowing sequences to belong to multiple classes would be an interesting extension. Finally, our method discovers no structure among the stages of different classes, i.e., each class evolves independently of the others; it would be interesting to explore whether the method can automatically find the overlaps between the stages of different classes. Adapting approaches recently proposed for extracting structure from online news might be particularly promising [21].

Acknowledgements. This research has been supported in part by NSF IIS-1016909, CNS-1010921, IIS-1149837, IIS-1159679, ARO MURI, ARL AHPCRC, Okawa Foundation, PayPal, Docomo, Boeing, Allyes, Volkswagen, and the Alfred P. Sloan Foundation.

9. REFERENCES

[1] Y. Ahn, J. Bagrow, and S. Lehmann. Link communities reveal multi-scale complexity in networks. Nature, 2010.
[2] I. Batal, D. Fradkin, J. Harrison, F. Moerchen, and M. Hauskrecht. Mining recent temporal patterns for event detection in multivariate time series data. In KDD, 2012.
[3] L. Bergroth, H. Hakonen, and T. Raita. A survey of longest common subsequence algorithms. In String Processing and Information Retrieval, 2000.
[4] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 2003.
[5] E. Coviello, A. B. Chan, and G. R. G. Lanckriet. The variational hierarchical EM algorithm for clustering hidden Markov models. In NIPS, 2012.
[6] C. Danescu-Niculescu-Mizil, R. West, D. Jurafsky, J. Leskovec, and C. Potts. No country for old members: user lifecycle and linguistic change in online communities. In WWW, 2013.
[7] P. Felzenszwalb, D. Huttenlocher, and J. Kleinberg. Fast algorithms for large state space HMMs with applications to web usage analysis. In NIPS, 2003.
[8] J. Kiernan and E. Terzi. Constructing comprehensive summaries of large event sequences. ACM Transactions on Knowledge Discovery from Data, 2009.
[9] Y. Koren. Collaborative filtering with temporal dynamics. Communications of the ACM, 2010.
[10] H. Kramer, Q. Nguyen, G. Curhan, and C. Hsu. Renal insufficiency in the absence of albuminuria and retinopathy among adults with type 2 diabetes mellitus. The Journal of the American Medical Association, 2003.
[11] S. Laxman, V. Tankasali, and R. White. Stream prediction using a generative model based on frequent episodes in event sequences. In KDD, 2008.
[12] N. Leeper, A. Bauer-Mehren, S. Iyer, P. LePendu, C. Olson, and N. Shah. Practice-based evidence: Profiling the safety of cilostazol by text-mining of clinical notes. PLoS ONE, 2013.
[13] M. Liu and J. Qu. Mining high utility itemsets without candidate generation. In CIKM, 2012.
[14] H. Lowe, T. Ferris, P. Hernandez, and S. Weber. STRIDE: an integrated standards-based translational research informatics platform. AMIA, 2009.
[15] Y. Matsubara, Y. Sakurai, C. Faloutsos, T. Iwata, and M. Yoshikawa. Fast mining and forecasting of complex time-stamped events. In KDD, 2012.
[16] J. McAuley and J. Leskovec. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. In WWW, 2013.
[17] D. Patnaik, S. Laxman, B. Chandramouli, and N. Ramakrishnan. Efficient episode mining of dynamic event streams. In ICDM, 2012.
[18] J. Pei, H. Wang, J. Liu, K. Wang, J. Wang, and P. Yu. Discovering frequent closed partial orders from strings. IEEE Transactions on Knowledge and Data Engineering, 2006.
[19] L. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. In Proceedings of the IEEE, 1989.
[20] S. Scott. Bayesian methods for hidden Markov models. Journal of the American Statistical Association, 2002.
[21] D. Shahaf, J. Yang, C. Suen, J. Jacobs, H. Wang, and J. Leskovec. Information cartography: creating zoomable, large-scale maps of information. In KDD, 2013.
[22] P. Smyth. Clustering sequences with hidden Markov models. In NIPS, 1997.
[23] C. Suen, S. Huang, C. Eksombatchai, R. Sosic, and J. Leskovec. NIFTY: a system for large scale information flow tracking and clustering. In WWW, 2013.
[24] N. Tatti and J. Vreeken. The long and the short of it: summarising event sequences with serial episodes. In KDD, 2012.
[25] R. West and J. Leskovec. Human wayfinding in information networks. In WWW, 2012.
[26] C.-W. Wu, Y.-F. Lin, P. Yu, and V. Tseng. Mining high utility episodes in complex event sequences. In KDD, 2013.