Dynamic Topic Models

David M. Blei (BLEI@CS.PRINCETON.EDU), Computer Science Department, Princeton University, Princeton, NJ 08544, USA
John D. Lafferty (LAFFERTY@CS.CMU.EDU), School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA

Appearing in Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, 2006. Copyright 2006 by the author(s)/owner(s).

Abstract

A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections. The approach is to use state space models on the natural parameters of the multinomial distributions that represent the topics. Variational approximations based on Kalman filters and nonparametric wavelet regression are developed to carry out approximate posterior inference over the latent topics. In addition to giving quantitative, predictive models of a sequential corpus, dynamic topic models provide a qualitative window into the contents of a large document collection. The models are demonstrated by analyzing the OCR'ed archives of the journal Science from 1880 through 2000.

1. Introduction

Managing the explosion of electronic document archives requires new tools for automatically organizing, searching, indexing, and browsing large collections. Recent research in machine learning and statistics has developed new techniques for finding patterns of words in document collections using hierarchical probabilistic models (Blei et al., 2003; McCallum et al., 2004; Rosen-Zvi et al., 2004; Griffiths and Steyvers, 2004; Buntine and Jakulin, 2004; Blei and Lafferty, 2006). These models are called topic models because the discovered patterns often reflect the underlying topics which combined to form the documents. Such hierarchical probabilistic models are easily generalized to other kinds of data; for example, topic models have been used to analyze images (Fei-Fei and Perona, 2005; Sivic et al., 2005), biological data (Pritchard et al., 2000), and survey data (Erosheva, 2002).

In an exchangeable topic model, the words of each document are assumed to be independently drawn from a mixture of multinomials. The mixing proportions are randomly drawn for each document; the mixture components, or topics, are shared by all documents. Thus, each document reflects the components with different proportions. These models are a powerful method of dimensionality reduction for large collections of unstructured documents. Moreover, posterior inference at the document level is useful for information retrieval, classification, and topic-directed browsing.

Treating words exchangeably is a simplification that is consistent with the goal of identifying the semantic themes within each document. For many collections of interest, however, the implicit assumption of exchangeable documents is inappropriate. Document collections such as scholarly journals, email, news articles, and search query logs all reflect evolving content. For example, the Science article "The Brain of Professor Laborde" may be on the same scientific path as the article "Reshaping the Cortical Motor Map by Unmasking Latent Intracortical Connections," but the study of neuroscience looked much different in 1903 than it did in 1991. The themes in a document collection evolve over time, and it is of interest to explicitly model the dynamics of the underlying topics.

In this paper, we develop a dynamic topic model which captures the evolution of topics in a sequentially organized corpus of documents. We demonstrate its applicability by analyzing over 100 years of OCR'ed articles from the journal Science, which was founded in 1880 by Thomas Edison and has been published through the present. Under this model, articles are grouped by year, and each year's articles arise from a set of topics that have evolved from the last year's topics.

In the subsequent sections, we extend classical state space models to specify a statistical model of topic evolution. We then develop efficient approximate posterior inference techniques for determining the evolving topics from a sequential collection of documents. Finally, we present qualitative results that demonstrate how dynamic topic models allow the exploration of a large document collection in new ways, and quantitative results that demonstrate greater predictive accuracy when compared with static topic models.
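Concretely, the exchangeable topic model described above can be simulated in a few lines. The sketch below is our own illustration, not code from the paper; the sizes K, V, and the document counts are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: K topics, V vocabulary terms (illustrative only).
K, V, n_docs, doc_len = 3, 50, 5, 40
topics = rng.dirichlet(np.ones(V) * 0.1, size=K)   # shared mixture components

docs = []
for _ in range(n_docs):
    theta = rng.dirichlet(np.ones(K))              # per-document mixing proportions
    z = rng.choice(K, size=doc_len, p=theta)       # topic assignment per word
    w = np.array([rng.choice(V, p=topics[k]) for k in z])  # word per assignment
    docs.append(w)
```

Exchangeability is visible in the sampler: within a document, every word is drawn i.i.d. given theta, so word order carries no information.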
2. Dynamic Topic Models

While traditional time series modeling has focused on continuous data, topic models are designed for categorical data. Our approach is to use state space models on the natural parameter space of the underlying topic multinomials, as well as on the natural parameters for the logistic normal distributions used for modeling the document-specific topic proportions.

First, we review the underlying statistical assumptions of a static topic model, such as latent Dirichlet allocation (LDA) (Blei et al., 2003). Let \beta_{1:K} be K topics, each of which is a distribution over a fixed vocabulary. In a static topic model, each document is assumed drawn from the following generative process:

1. Choose topic proportions \theta from a distribution over the (K-1)-simplex, such as a Dirichlet.
2. For each word:
   (a) Choose a topic assignment Z ~ Mult(\theta).
   (b) Choose a word W ~ Mult(\beta_z).

This process implicitly assumes that the documents are drawn exchangeably from the same set of topics. For many collections, however, the order of the documents reflects an evolving set of topics. In a dynamic topic model, we suppose that the data is divided by time slice, for example by year. We model the documents of each slice with a K-component topic model, where the topics associated with slice t evolve from the topics associated with slice t-1.

For a K-component model with V terms, let \beta_{t,k} denote the V-vector of natural parameters for topic k in slice t. The usual representation of a multinomial distribution is by its mean parameterization. If we denote the mean parameter of a V-dimensional multinomial by \pi, the ith component of the natural parameter is given by the mapping \beta_i = \log(\pi_i / \pi_V). In typical language modeling applications, Dirichlet distributions are used to model uncertainty about the distributions over words. However, the Dirichlet is not amenable to sequential modeling. Instead, we chain the natural parameters of each topic \beta_{t,k} in a state space model that evolves with Gaussian noise; the simplest version of such a model is

    \beta_{t,k} | \beta_{t-1,k} ~ N(\beta_{t-1,k}, \sigma^2 I).    (1)

Our approach is thus to model sequences of compositional random variables by chaining Gaussian distributions in a dynamic model and mapping the emitted values to the simplex. This is an extension of the logistic normal distribution (Aitchison, 1982) to time-series simplex data (West and Harrison, 1997).

[Figure 1. Graphical representation of a dynamic topic model (for three time slices). Each topic's natural parameters \beta_{t,k} evolve over time, together with the mean parameters \alpha_t of the logistic normal distribution for the topic proportions.]

In LDA, the document-specific topic proportions \theta are drawn from a Dirichlet distribution. In the dynamic topic model, we use a logistic normal with mean \alpha to express uncertainty over proportions. The sequential structure between models is again captured with a simple dynamic model

    \alpha_t | \alpha_{t-1} ~ N(\alpha_{t-1}, \delta^2 I).    (2)

For simplicity, we do not model the dynamics of topic correlation, as was done for static models by Blei and Lafferty (2006).

By chaining together topics and topic proportion distributions, we have sequentially tied a collection of topic models. The generative process for slice t of a sequential corpus is thus as follows:

1. Draw topics \beta_t | \beta_{t-1} ~ N(\beta_{t-1}, \sigma^2 I).
2. Draw \alpha_t | \alpha_{t-1} ~ N(\alpha_{t-1}, \delta^2 I).
3. For each document:
   (a) Draw \eta ~ N(\alpha_t, a^2 I).
   (b) For each word:
       i. Draw Z ~ Mult(\pi(\eta)).
       ii. Draw W_{t,d,n} ~ Mult(\pi(\beta_{t,z})).

Note that \pi maps the multinomial natural parameters to the mean parameters,

    \pi(\beta_{k,t})_w = \exp(\beta_{k,t,w}) / \sum_w \exp(\beta_{k,t,w}).

The graphical model for this generative process is shown in Figure 1. When the horizontal arrows are removed, breaking the time dynamics, the graphical model reduces to a set of independent topic models. With time dynamics, the kth topic at slice t has smoothly evolved from the kth topic at slice t-1.
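The slice-level generative process can be simulated directly. The sketch below is our own illustration (one short document per slice; K, V, T and the noise scales sigma, delta, a are arbitrary assumptions), with pi implemented as the softmax mapping just defined.

```python
import numpy as np

rng = np.random.default_rng(1)

def pi(beta):
    """Map natural parameters to mean (simplex) parameters, as in the text."""
    e = np.exp(beta - beta.max())   # subtract the max for numerical stability
    return e / e.sum()

# Illustrative sizes and noise scales (sigma, delta, a are assumptions).
K, V, T, sigma, delta, a = 3, 30, 4, 0.05, 0.05, 0.5
beta = rng.normal(size=(K, V))      # topics at the first slice
alpha = np.zeros(K)                 # topic-proportion mean at the first slice

corpus = []
for t in range(T):
    eta = rng.normal(alpha, a)                        # document-level draw
    theta = pi(eta)                                   # topic proportions
    doc = []
    for _ in range(25):
        z = rng.choice(K, p=theta)                    # topic assignment
        doc.append(rng.choice(V, p=pi(beta[z])))      # word draw
    corpus.append(doc)
    beta = beta + sigma * rng.normal(size=(K, V))     # chain topics, eq. (1)
    alpha = alpha + delta * rng.normal(size=K)        # chain proportions, eq. (2)
```

Removing the last two lines of the loop recovers a set of independent per-slice topic models, mirroring the remark about deleting the horizontal arrows in Figure 1.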
For clarity of presentation, we now focus on a model with K dynamic topics evolving as in (1), and where the topic proportion model is fixed at a Dirichlet. The technical issues associated with modeling the topic proportions in a time series as in (2) are essentially the same as those for chaining the topics together.

3. Approximate Inference

Working with time series over the natural parameters enables the use of Gaussian models for the time dynamics; however, due to the nonconjugacy of the Gaussian and multinomial models, posterior inference is intractable. In this section, we present a variational method for approximate posterior inference. We use variational methods as deterministic alternatives to stochastic simulation, in order to handle the large data sets typical of text analysis. While Gibbs sampling has been effectively used for static topic models (Griffiths and Steyvers, 2004), nonconjugacy makes sampling methods more difficult for this dynamic model.

The idea behind variational methods is to optimize the free parameters of a distribution over the latent variables so that the distribution is close in Kullback-Leibler (KL) divergence to the true posterior; this distribution can then be used as a substitute for the true posterior. In the dynamic topic model, the latent variables are the topics \beta_{t,k}, mixture proportions \theta_{t,d}, and topic indicators z_{t,d,n}. The variational distribution reflects the group structure of the latent variables. There are variational parameters for each topic's sequence of multinomial parameters, and variational parameters for each of the document-level latent variables. The approximate variational posterior is

    \prod_{k=1}^K q(\beta_{k,1}, ..., \beta_{k,T} | \hat\beta_{k,1}, ..., \hat\beta_{k,T})
      \times \prod_{t=1}^T \prod_{d=1}^{D_t} ( q(\theta_{t,d} | \gamma_{t,d}) \prod_{n=1}^{N_{t,d}} q(z_{t,d,n} | \phi_{t,d,n}) ).    (3)

In the commonly used mean-field approximation, each latent variable is considered independently of the others. In the variational distribution of {\beta_{k,1}, ..., \beta_{k,T}}, however, we retain the sequential structure of the topic by positing a dynamic model with Gaussian "variational observations" {\hat\beta_{k,1}, ..., \hat\beta_{k,T}}. These parameters are fit to minimize the KL divergence between the resulting posterior, which is Gaussian, and the true posterior, which is not Gaussian. (A similar technique for Gaussian processes is described in Snelson and Ghahramani, 2006.)

[Figure 2. A graphical representation of the variational approximation for the time series topic model of Figure 1. The variational parameters \hat\beta and \hat\alpha are thought of as the outputs of a Kalman filter, or as observed data in a nonparametric regression setting.]

The variational distribution of the document-level latent variables follows the same form as in Blei et al. (2003). Each proportion vector \theta_{t,d} is endowed with a free Dirichlet parameter \gamma_{t,d}, each topic indicator z_{t,d,n} is endowed with a free multinomial parameter \phi_{t,d,n}, and optimization proceeds by coordinate ascent. The updates for the document-level variational parameters have a closed form; we use the conjugate gradient method to optimize the topic-level variational observations. The resulting variational approximation for the natural topic parameters {\beta_{k,1}, ..., \beta_{k,T}} incorporates the time dynamics; we describe one approximation based on a Kalman filter, and a second based on wavelet regression.

3.1. Variational Kalman Filtering

The view of the variational parameters as outputs is based on the symmetry properties of the Gaussian density, f_{\mu,\Sigma}(x) = f_{x,\Sigma}(\mu), which enables the use of the standard forward-backward calculations for linear state space models. The graphical model and its variational approximation are shown in Figure 2. Here the triangles denote variational parameters; they can be thought of as "hypothetical outputs" of the Kalman filter, to facilitate calculation.

To explain the main idea behind this technique in a simpler setting, consider a model where unigram models \beta_t (in the natural parameterization) evolve over time. In this model there are no topics and thus no mixing parameters. The calculations are simpler versions of those we need for the more general latent variable models, but exhibit the essential features. Our state space model is

    \beta_t | \beta_{t-1} ~ N(\beta_{t-1}, \sigma^2 I)
    w_{t,n} | \beta_t ~ Mult(\pi(\beta_t))

and we form the variational state space model where

    \hat\beta_t | \beta_t ~ N(\beta_t, \hat\nu_t^2 I).

The variational parameters are \hat\beta_t and \hat\nu_t. Using standard Kalman filter calculations (Kalman, 1960), the forward mean and variance of the variational posterior are given by

    m_t = E(\beta_t | \hat\beta_{1:t})
        = (\hat\nu_t^2 / (V_{t-1} + \sigma^2 + \hat\nu_t^2)) m_{t-1} + (1 - \hat\nu_t^2 / (V_{t-1} + \sigma^2 + \hat\nu_t^2)) \hat\beta_t
    V_t = E((\beta_t - m_t)^2 | \hat\beta_{1:t})
        = (\hat\nu_t^2 / (V_{t-1} + \sigma^2 + \hat\nu_t^2)) (V_{t-1} + \sigma^2)

with initial conditions specified by fixed m_0 and V_0. The backward recursion then calculates the marginal mean and variance of \beta_t given \hat\beta_{1:T} as

    \tilde{m}_{t-1} = E(\beta_{t-1} | \hat\beta_{1:T})
        = (\sigma^2 / (V_{t-1} + \sigma^2)) m_{t-1} + (1 - \sigma^2 / (V_{t-1} + \sigma^2)) \tilde{m}_t
    \tilde{V}_{t-1} = E((\beta_{t-1} - \tilde{m}_{t-1})^2 | \hat\beta_{1:T})
        = V_{t-1} + (V_{t-1} / (V_{t-1} + \sigma^2))^2 (\tilde{V}_t - (V_{t-1} + \sigma^2))

with initial conditions \tilde{m}_T = m_T and \tilde{V}_T = V_T. We approximate the posterior p(\beta_{1:T} | w_{1:T}) using the state space posterior q(\beta_{1:T} | \hat\beta_{1:T}). From Jensen's inequality, the log-likelihood is bounded from below as

    \log p(d_{1:T}) >= \int q(\beta_{1:T} | \hat\beta_{1:T}) \log ( p(\beta_{1:T}) p(d_{1:T} | \beta_{1:T}) / q(\beta_{1:T} | \hat\beta_{1:T}) ) d\beta_{1:T}    (4)
                     = E_q \log p(\beta_{1:T}) + \sum_{t=1}^T E_q \log p(d_t | \beta_t) + H(q).    (5)

Details of optimizing this bound are given in an appendix.

3.2. Variational Wavelet Regression

The variational Kalman filter can be replaced with variational wavelet regression; for a readable introduction to standard wavelet methods, see Wasserman (2006). We rescale time so it is between 0 and 1. For 128 years of Science we take n = 2^J and J = 7. To be consistent with our earlier notation, we assume that

    \hat\beta_t = m_t + \hat\nu \epsilon_t

where \epsilon_t ~ N(0, 1). Our variational wavelet regression algorithm estimates {\hat\beta_t}, which we view as observed data, just as in the Kalman filter method, as well as the noise level \hat\nu.

For concreteness, we illustrate the technique using the Haar wavelet basis; Daubechies wavelets are used in our actual examples. The model is then

    \hat\beta_t = \alpha \phi(x_t) + \sum_{j=0}^{J-1} \sum_{k=0}^{2^j - 1} D_{jk} \psi_{jk}(x_t)

where x_t = t/n, \phi(x) = 1 for 0 <= x <= 1,

    \psi(x) = -1 if 0 <= x <= 1/2,  1 if 1/2 < x <= 1,

and \psi_{jk}(x) = 2^{j/2} \psi(2^j x - k). Our variational estimate for the posterior mean becomes

    m_t = \hat\alpha \phi(x_t) + \sum_{j=0}^{J-1} \sum_{k=0}^{2^j - 1} \hat{D}_{jk} \psi_{jk}(x_t),

where \hat\alpha = n^{-1} \sum_{t=1}^n \hat\beta_t, and \hat{D}_{jk} are obtained by thresholding the coefficients

    Z_{jk} = (1/n) \sum_{t=1}^n \hat\beta_t \psi_{jk}(x_t).

To estimate \hat\beta_t we use gradient ascent, as for the Kalman filter approximation, requiring the derivatives \partial m_t / \partial \hat\beta_t. If soft thresholding is used, then we have that

    \partial m_t / \partial \hat\beta_s = (\partial \hat\alpha / \partial \hat\beta_s) \phi(x_t) + \sum_{j=0}^{J-1} \sum_{k=0}^{2^j - 1} (\partial \hat{D}_{jk} / \partial \hat\beta_s) \psi_{jk}(x_t)

with \partial \hat\alpha / \partial \hat\beta_s = n^{-1} and

    \partial \hat{D}_{jk} / \partial \hat\beta_s = n^{-1} \psi_{jk}(x_s) if |Z_{jk}| > \lambda; 0 otherwise.

Note also that |Z_{jk}| > \lambda if and only if |\hat{D}_{jk}| > 0. These derivatives can be computed using off-the-shelf software for the wavelet transform in any of the standard wavelet bases.

Sample results of running this and the Kalman variational algorithm to approximate a unigram model are given in Figure 3. Both variational approximations smooth out the local fluctuations in the unigram counts, while preserving the sharp peaks that may indicate a significant change of content in the journal. While the fit is similar to that obtained using standard wavelet regression to the (normalized) counts, the estimates are obtained by minimizing the KL divergence as in standard variational approximations.

[Figure 3. Comparison of the Kalman filter (top) and wavelet regression (bottom) variational approximations to a unigram model. The variational approximations (red and blue curves) smooth out the local fluctuations in the unigram counts (gray curves) of the words shown, while preserving the sharp peaks that may indicate a significant change of content in the journal. The wavelet regression is able to "superresolve" the double spikes in the occurrence of Einstein in the 1920s. (The spike in the occurrence of Darwin near 1910 may be associated with the centennial of Darwin's birth in 1809.)]

In the dynamic topic model of Section 2, the algorithms are essentially the same as those described above. However, rather than fitting the observations from true observed counts, we fit them from expected counts under the document-level variational distributions in (3).

4. Analysis of Science

We analyzed a subset of 30,000 articles from Science, 250 from each of the 120 years between 1881 and 1999. Our data were collected by JSTOR (www.jstor.org), a not-for-profit organization that maintains an online scholarly archive obtained by running an optical character recognition (OCR) engine over the original printed journals. JSTOR indexes the resulting text and provides online access to the scanned images of the original content through keyword search.

Our corpus is made up of approximately 7.5 million words. We pruned the vocabulary by stemming each term to its root, removing function terms, and removing terms that occurred fewer than 25 times. The total vocabulary size is 15,955. To explore the corpus and its themes, we estimated a 20-component dynamic topic model. Posterior inference took approximately 4 hours on a 1.5GHz PowerPC Macintosh laptop. Two of the resulting topics are illustrated in Figure 4, showing the top several words from those topics in each decade, according to the posterior mean number of occurrences as estimated using the Kalman filter variational approximation. Also shown are example articles which exhibit those topics through the decades. As illustrated, the model captures different scientific themes, and can be used to inspect trends of word usage within them.

[Figure 4. Examples from the posterior analysis of a 20-topic dynamic model estimated from the Science corpus. For two topics, we illustrate: (a) the top ten words from the inferred posterior distribution at ten-year lags; (b) the posterior estimate of the frequency as a function of year of several words from the same two topics; (c) example articles throughout the collection which exhibit these topics. Note that the plots are scaled to give an idea of the shape of the trajectory of the words' posterior probability (i.e., comparisons across words are not meaningful).]

To validate the dynamic topic model quantitatively, we consider the task of predicting the next year of Science given all the articles from the previous years. We compare the predictive power of three 20-topic models: the dynamic topic model estimated from all of the previous years, a static topic model estimated from all of the previous years, and a static topic model estimated from the single previous year. All the models are estimated to the same convergence criterion. The topic model estimated from all the previous data and the dynamic topic model are initialized at the same point.

The dynamic topic model performs well; it always assigns higher likelihood to the next year's articles than the other two models (Figure 5). It is interesting that the predictive power of each of the models declines over the years. We can tentatively attribute this to an increase in the rate of specialization in scientific language.

5. Discussion

We have developed sequential topic models for discrete data by using Gaussian time series on the natural parameters of the multinomial topics and logistic normal topic proportion models. We derived variational inference algorithms that exploit existing techniques for sequential data; we demonstrated a novel use of Kalman filters and wavelet regression as variational approximations. Dynamic topic models can give a more accurate predictive model, and also offer new ways of browsing large, unstructured document collections.

There are many ways that the work described here can be extended. One direction is to use more sophisticated state space models. We have demonstrated the use of a simple Gaussian model, but it would be natural to include a drift term in a more sophisticated autoregressive model to explicitly capture the rise and fall in popularity of a topic, or in the use of specific terms. Another variant would allow for heteroscedastic time series.

Perhaps the most promising extension to the methods presented here is to incorporate a model of how new topics in the collection appear or disappear over time, rather than assuming a fixed number of topics. One possibility is to use a simple Galton-Watson or birth-death process for the topic population. While the analysis of birth-death or branching processes often centers on extinction probabilities, here a goal would be to find documents that may be responsible for spawning new themes in a collection.
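As a concrete companion to the variational Kalman filter of Section 3.1, the forward-backward recursions for a single coordinate of \beta_t can be sketched as follows. This is our own NumPy translation of the recursions, not the authors' code; the pseudo-observation and noise values in the usage example are arbitrary assumptions.

```python
import numpy as np

def kalman_smooth(beta_hat, vhat2, sigma2, m0=0.0, V0=1.0):
    """Forward-backward smoothing for the chain
    beta_t | beta_{t-1} ~ N(beta_{t-1}, sigma2), with variational
    pseudo-observations beta_hat_t ~ N(beta_t, vhat2), one coordinate at a time."""
    T = len(beta_hat)
    m = np.empty(T)          # forward (filtered) means m_t
    V = np.empty(T)          # forward (filtered) variances V_t
    m_prev, V_prev = m0, V0
    for t in range(T):
        P = V_prev + sigma2                  # predictive variance V_{t-1} + sigma^2
        gain = vhat2 / (P + vhat2)           # weight kept on the prior mean
        m[t] = gain * m_prev + (1.0 - gain) * beta_hat[t]
        V[t] = gain * P
        m_prev, V_prev = m[t], V[t]
    m_s, V_s = m.copy(), V.copy()            # smoothed marginals, initialized at t = T
    for t in range(T - 2, -1, -1):
        J = V[t] / (V[t] + sigma2)           # backward weight sigma^2/(V_t+sigma^2) on m_t
        m_s[t] = (1.0 - J) * m[t] + J * m_s[t + 1]
        V_s[t] = V[t] + J**2 * (V_s[t + 1] - (V[t] + sigma2))
    return m_s, V_s
```

With constant pseudo-observations, the smoothed means interpolate between the prior mean m0 and the observed level, and the smoothed variances stay strictly positive, matching the smoothing behavior shown in Figure 3.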
[Figure 5. This figure illustrates the performance of using dynamic topic models and static topic models for prediction. For each year between 1900 and 2000 (at 5 year increments), we estimated three models on the articles through that year. We then computed the variational bound on the negative log likelihood of the next year's articles under the resulting model (lower numbers are better). DTM is the dynamic topic model; LDA-prev is a static topic model estimated on just the previous year's articles; LDA-all is a static topic model estimated on all the previous articles.]

Acknowledgments

This research was supported in part by NSF grants IIS-0312814 and IIS-0427206, the DARPA CALO project, and a grant from Google.

References

Aitchison, J. (1982). The statistical analysis of compositional data. Journal of the Royal Statistical Society, Series B, 44(2):139-177.

Blei, D., Ng, A., and Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022.

Blei, D. M. and Lafferty, J. D. (2006). Correlated topic models. In Weiss, Y., Schölkopf, B., and Platt, J., editors, Advances in Neural Information Processing Systems 18. MIT Press, Cambridge, MA.

Buntine, W. and Jakulin, A. (2004). Applying discrete PCA in data analysis. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pages 59-66. AUAI Press.

Erosheva, E. (2002). Grade of membership and latent structure models with application to disability survey data. PhD thesis, Carnegie Mellon University, Department of Statistics.

Fei-Fei, L. and Perona, P. (2005). A Bayesian hierarchical model for learning natural scene categories. IEEE Computer Vision and Pattern Recognition.

Griffiths, T. and Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101:5228-5235.

Kalman, R. (1960). A new approach to linear filtering and prediction problems. Transactions of the ASME: Journal of Basic Engineering, 82:35-45.

McCallum, A., Corrada-Emmanuel, A., and Wang, X. (2004). The author-recipient-topic model for topic and role discovery in social networks: Experiments with Enron and academic email. Technical report, University of Massachusetts, Amherst.

Pritchard, J., Stephens, M., and Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics, 155:945-959.

Rosen-Zvi, M., Griffiths, T., Steyvers, M., and Smyth, P. (2004). The author-topic model for authors and documents. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pages 487-494. AUAI Press.

Sivic, J., Russell, B., Efros, A., Zisserman, A., and Freeman, W. (2005). Discovering objects and their location in images. In International Conference on Computer Vision (ICCV 2005).

Snelson, E. and Ghahramani, Z. (2006). Sparse Gaussian processes using pseudo-inputs. In Weiss, Y., Schölkopf, B., and Platt, J., editors, Advances in Neural Information Processing Systems 18. MIT Press, Cambridge, MA.

Wasserman, L. (2006). All of Nonparametric Statistics. Springer.

West, M. and Harrison, J. (1997). Bayesian Forecasting and Dynamic Models. Springer.

A. Derivation of Variational Algorithm

In this appendix we give some details of the variational algorithm outlined in Section 3.1, which calculates a distribution q(\beta_{1:T} | \hat\beta_{1:T}) to maximize the lower bound on \log p(d_{1:T}). The first term of the right-hand side of (5) is

    E_q \log p(\beta_{1:T}) = -(VT/2)(\log \sigma^2 + \log 2\pi) - (1/(2\sigma^2)) \sum_{t=1}^T E_q (\beta_t - \beta_{t-1})^T (\beta_t - \beta_{t-1})
                            = -(VT/2)(\log \sigma^2 + \log 2\pi) - (1/(2\sigma^2)) \sum_{t=1}^T ||\tilde{m}_t - \tilde{m}_{t-1}||^2
                              - (1/\sigma^2) \sum_{t=1}^T Tr(\tilde{V}_t) - (1/(2\sigma^2)) (Tr(\tilde{V}_0) - Tr(\tilde{V}_T)),

using the Gaussian quadratic form identity

    E_{m,V} (x - \mu)^T \Sigma^{-1} (x - \mu) = (m - \mu)^T \Sigma^{-1} (m - \mu) + Tr(\Sigma^{-1} V).

The second term of (5) is

    \sum_{t=1}^T E_q \log p(d_t | \beta_t) = \sum_{t=1}^T ( \sum_w n_{tw} E_q \beta_{tw} - n_t E_q \log \sum_w \exp(\beta_{tw}) )
        >= \sum_{t=1}^T ( \sum_w n_{tw} \tilde{m}_{tw} - n_t \hat\zeta_t^{-1} \sum_w \exp(\tilde{m}_{tw} + \tilde{V}_{tw}/2) ) + \sum_{t=1}^T ( n_t - n_t \log \hat\zeta_t ),

where n_t = \sum_w n_{tw}, introducing additional variational parameters \hat\zeta_{1:T} via the bound \log \sum_w e^{x_w} <= \zeta^{-1} \sum_w e^{x_w} + \log \zeta - 1. The third term of (5) is the entropy

    H(q) = \sum_{t=1}^T (1/2) \log |\tilde{V}_t| + (TV/2) \log 2\pi
         = (1/2) \sum_{t=1}^T \sum_w \log \tilde{V}_{tw} + (TV/2) \log 2\pi.

To maximize the lower bound as a function of the variational parameters we use a conjugate gradient algorithm. First, we maximize with respect to \hat\zeta; the derivative is

    \partial \ell / \partial \hat\zeta_t = (n_t / \hat\zeta_t^2) \sum_w \exp(\tilde{m}_{tw} + \tilde{V}_{tw}/2) - n_t / \hat\zeta_t.

Setting this to zero and solving for \hat\zeta_t gives

    \hat\zeta_t = \sum_w \exp(\tilde{m}_{tw} + \tilde{V}_{tw}/2).

Next, we maximize with respect to \hat\beta_{sw}:

    \partial \ell(\hat\beta, \hat\zeta) / \partial \hat\beta_{sw} = -(1/\sigma^2) \sum_{t=1}^T (\tilde{m}_{tw} - \tilde{m}_{t-1,w}) ( \partial \tilde{m}_{tw}/\partial \hat\beta_{sw} - \partial \tilde{m}_{t-1,w}/\partial \hat\beta_{sw} )
        + \sum_{t=1}^T ( n_{tw} - n_t \hat\zeta_t^{-1} \exp(\tilde{m}_{tw} + \tilde{V}_{tw}/2) ) \partial \tilde{m}_{tw}/\partial \hat\beta_{sw}.

The forward-backward equations for \tilde{m}_t can be used to derive a recurrence for \partial \tilde{m}_t / \partial \hat\beta_s. The forward recurrence is

    \partial m_t / \partial \hat\beta_s = (\hat\nu_t^2 / (V_{t-1} + \sigma^2 + \hat\nu_t^2)) \partial m_{t-1} / \partial \hat\beta_s + (1 - \hat\nu_t^2 / (V_{t-1} + \sigma^2 + \hat\nu_t^2)) \delta_{s,t},

with the initial condition \partial m_0 / \partial \hat\beta_s = 0. The backward recurrence is then

    \partial \tilde{m}_{t-1} / \partial \hat\beta_s = (\sigma^2 / (V_{t-1} + \sigma^2)) \partial m_{t-1} / \partial \hat\beta_s + (1 - \sigma^2 / (V_{t-1} + \sigma^2)) \partial \tilde{m}_t / \partial \hat\beta_s,

with the initial condition \partial \tilde{m}_T / \partial \hat\beta_s = \partial m_T / \partial \hat\beta_s.
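The closed-form \hat\zeta update can be checked numerically. The sketch below is our own illustration, with arbitrary values for the variational moments and word count; it confirms that the closed-form value zeroes the derivative of the bound's \hat\zeta term.

```python
import numpy as np

def dl_dzeta(zeta, m, V, n_t):
    """Derivative of the bound's zeta term:
    n_t/zeta^2 * sum_w exp(m_w + V_w/2) - n_t/zeta."""
    return n_t / zeta**2 * np.exp(m + V / 2).sum() - n_t / zeta

# Arbitrary illustrative values for one time slice.
m = np.array([-1.0, 0.3, 0.7])   # variational means for three vocabulary terms
V = np.array([0.2, 0.1, 0.05])   # variational variances
n_t = 100.0                      # total word count in slice t

zeta_star = np.exp(m + V / 2).sum()   # closed-form update from the appendix
```

The derivative is positive below zeta_star and negative above it, so the closed-form value is the maximizer of the bound in \hat\zeta_t.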