
Dynamic Topic Models

David M. Blei (BLEI@CS.PRINCETON.EDU), Computer Science Department, Princeton University, Princeton, NJ 08544, USA
John D. Lafferty (LAFFERTY@CS.CMU.EDU), School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA

Appearing in Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, 2006. Copyright 2006 by the author(s)/owner(s).

Abstract

A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections. The approach is to use state space models on the natural parameters of the multinomial distributions that represent the topics. Variational approximations based on Kalman filters and nonparametric wavelet regression are developed to carry out approximate posterior inference over the latent topics. In addition to giving quantitative, predictive models of a sequential corpus, dynamic topic models provide a qualitative window into the contents of a large document collection. The models are demonstrated by analyzing the OCR'ed archives of the journal Science from 1880 through 2000.

1. Introduction

Managing the explosion of electronic document archives requires new tools for automatically organizing, searching, indexing, and browsing large collections. Recent research in machine learning and statistics has developed new techniques for finding patterns of words in document collections using hierarchical probabilistic models (Blei et al., 2003; McCallum et al., 2004; Rosen-Zvi et al., 2004; Griffiths and Steyvers, 2004; Buntine and Jakulin, 2004; Blei and Lafferty, 2006). These models are called "topic models" because the discovered patterns often reflect the underlying topics which combined to form the documents. Such hierarchical probabilistic models are easily generalized to other kinds of data; for example, topic models have been used to analyze images (Fei-Fei and Perona, 2005; Sivic et al., 2005), biological data (Pritchard et al., 2000), and survey data (Erosheva, 2002).

In an exchangeable topic model, the words of each document are assumed to be independently drawn from a mixture of multinomials. The mixing proportions are randomly drawn for each document; the mixture components, or topics, are shared by all documents. Thus, each document reflects the components with different proportions. These models are a powerful method of dimensionality reduction for large collections of unstructured documents. Moreover, posterior inference at the document level is useful for information retrieval, classification, and topic-directed browsing.

Treating words exchangeably is a simplification that is consistent with the goal of identifying the semantic themes within each document. For many collections of interest, however, the implicit assumption of exchangeable documents is inappropriate. Document collections such as scholarly journals, email, news articles, and search query logs all reflect evolving content. For example, the Science article "The Brain of Professor Laborde" may be on the same scientific path as the article "Reshaping the Cortical Motor Map by Unmasking Latent Intracortical Connections," but the study of neuroscience looked much different in 1903 than it did in 1991. The themes in a document collection evolve over time, and it is of interest to explicitly model the dynamics of the underlying topics.

In this paper, we develop a dynamic topic model which captures the evolution of topics in a sequentially organized corpus of documents. We demonstrate its applicability by analyzing over 100 years of OCR'ed articles from the journal Science, which was founded in 1880 by Thomas Edison and has been published through the present. Under this model, articles are grouped by year, and each year's articles arise from a set of topics that have evolved from the last year's topics.

In the subsequent sections, we extend classical state space models to specify a statistical model of topic evolution. We then develop efficient approximate posterior inference techniques for determining the evolving topics from a sequential collection of documents. Finally, we present qualitative results that demonstrate how dynamic topic models allow the exploration of a large document collection in new ways, and quantitative results that demonstrate greater predictive accuracy when compared with static topic models.
2. Dynamic Topic Models

While traditional time series modeling has focused on continuous data, topic models are designed for categorical data. Our approach is to use state space models on the natural parameter space of the underlying topic multinomials, as well as on the natural parameters for the logistic normal distributions used for modeling the document-specific topic proportions.

First, we review the underlying statistical assumptions of a static topic model, such as latent Dirichlet allocation (LDA) (Blei et al., 2003). Let β_{1:K} be K topics, each of which is a distribution over a fixed vocabulary. In a static topic model, each document is assumed drawn from the following generative process:

1. Choose topic proportions θ from a distribution over the (K−1)-simplex, such as a Dirichlet.
2. For each word:
   (a) Choose a topic assignment Z ~ Mult(θ).
   (b) Choose a word W ~ Mult(β_z).

This process implicitly assumes that the documents are drawn exchangeably from the same set of topics. For many collections, however, the order of the documents reflects an evolving set of topics. In a dynamic topic model, we suppose that the data is divided by time slice, for example by year. We model the documents of each slice with a K-component topic model, where the topics associated with slice t evolve from the topics associated with slice t−1.

For a K-component model with V terms, let β_{t,k} denote the V-vector of natural parameters for topic k in slice t. The usual representation of a multinomial distribution is by its mean parameterization. If we denote the mean parameter of a V-dimensional multinomial by π, the ith component of the natural parameter is given by the mapping β_i = log(π_i / π_V). In typical language modeling applications, Dirichlet distributions are used to model uncertainty about the distributions over words. However, the Dirichlet is not amenable to sequential modeling. Instead, we chain the natural parameters of each topic β_{t,k} in a state space model that evolves with Gaussian noise; the simplest version of such a model is

    β_{t,k} | β_{t−1,k} ~ N(β_{t−1,k}, σ² I).    (1)

Our approach is thus to model sequences of compositional random variables by chaining Gaussian distributions in a dynamic model and mapping the emitted values to the simplex. This is an extension of the logistic normal distribution (Aitchison, 1982) to time-series simplex data (West and Harrison, 1997).

In LDA, the document-specific topic proportions θ are drawn from a Dirichlet distribution. In the dynamic topic model, we use a logistic normal with mean α to express uncertainty over proportions. The sequential structure between models is again captured with a simple dynamic model

    α_t | α_{t−1} ~ N(α_{t−1}, δ² I).    (2)

For simplicity, we do not model the dynamics of topic correlation, as was done for static models by Blei and Lafferty (2006).

By chaining together topics and topic proportion distributions, we have sequentially tied a collection of topic models. The generative process for slice t of a sequential corpus is thus as follows:

1. Draw topics β_t | β_{t−1} ~ N(β_{t−1}, σ² I).
2. Draw α_t | α_{t−1} ~ N(α_{t−1}, δ² I).
3. For each document:
   (a) Draw η ~ N(α_t, a² I).
   (b) For each word:
      i. Draw Z ~ Mult(π(η)).
      ii. Draw W_{t,d,n} ~ Mult(π(β_{t,z})).

Note that π maps the multinomial natural parameters to the mean parameters,

    π(β_{k,t})_w = exp(β_{k,t,w}) / Σ_w exp(β_{k,t,w}).

The graphical model for this generative process is shown in Figure 1. When the horizontal arrows are removed, breaking the time dynamics, the graphical model reduces to a set of independent topic models. With time dynamics, the kth topic at slice t has smoothly evolved from the kth topic at slice t−1.

Figure 1. Graphical representation of a dynamic topic model (for three time slices). Each topic's natural parameters β_{t,k} evolve over time, together with the mean parameters α_t of the logistic normal distribution for the topic proportions.

For clarity of presentation, we now focus on a model with K dynamic topics evolving as in (1), and where the topic proportion model is fixed at a Dirichlet. The technical issues associated with modeling the topic proportions in a time series as in (2) are essentially the same as those for chaining the topics together.
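As a concrete illustration of this generative process, the following is a minimal simulation sketch, not the authors' code: the corpus sizes, the variances (σ², δ², a²), and the vocabulary are hypothetical choices, and π(·) is implemented as the softmax mapping defined above.

import numpy as np

def softmax(x):
    """The mapping pi(.) from natural parameters to the simplex."""
    e = np.exp(x - x.max())
    return e / e.sum()

def simulate_dtm(T=10, K=5, V=100, D=20, N=50,
                 sigma2=0.01, delta2=0.01, a2=1.0, seed=0):
    """Toy simulation of the dynamic topic model's generative process.
    All sizes and variances are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    beta = rng.normal(0.0, 1.0, size=(K, V))   # initial topic natural parameters
    alpha = np.zeros(K)                        # initial topic-proportion mean
    corpus = []
    for t in range(T):
        # Steps 1-2: topics and proportion means drift with Gaussian noise (Eqs. 1-2).
        beta = beta + rng.normal(0.0, np.sqrt(sigma2), size=(K, V))
        alpha = alpha + rng.normal(0.0, np.sqrt(delta2), size=K)
        docs = []
        for _ in range(D):
            # Step 3a: document-specific proportions via the logistic normal.
            eta = rng.normal(alpha, np.sqrt(a2))
            theta = softmax(eta)
            words = []
            for _ in range(N):
                # Step 3b: choose a topic, then a word from that topic.
                z = rng.choice(K, p=theta)
                w = rng.choice(V, p=softmax(beta[z]))
                words.append(w)
            docs.append(words)
        corpus.append(docs)
    return corpus

corpus = simulate_dtm()
print(len(corpus), "slices,", len(corpus[0]), "documents per slice")

The sketch uses the logistic normal for the document proportions, as in the generative process above; the simplified model analyzed in the following sections replaces this with a Dirichlet.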
3. Approximate Inference

Working with time series over the natural parameters enables the use of Gaussian models for the time dynamics; however, due to the nonconjugacy of the Gaussian and multinomial models, posterior inference is intractable. In this section, we present a variational method for approximate posterior inference. We use variational methods as deterministic alternatives to stochastic simulation, in order to handle the large data sets typical of text analysis. While Gibbs sampling has been effectively used for static topic models (Griffiths and Steyvers, 2004), nonconjugacy makes sampling methods more difficult for this dynamic model.

The idea behind variational methods is to optimize the free parameters of a distribution over the latent variables so that the distribution is close in Kullback-Leibler (KL) divergence to the true posterior; this distribution can then be used as a substitute for the true posterior. In the dynamic topic model, the latent variables are the topics β_{t,k}, mixture proportions θ_{t,d}, and topic indicators z_{t,d,n}. The variational distribution reflects the group structure of the latent variables. There are variational parameters for each topic's sequence of multinomial parameters, and variational parameters for each of the document-level latent variables. The approximate variational posterior is

    ∏_{k=1}^K q(β_{k,1}, …, β_{k,T} | β̂_{k,1}, …, β̂_{k,T}) × ∏_{t=1}^T ( ∏_{d=1}^{D_t} q(θ_{t,d} | γ_{t,d}) ∏_{n=1}^{N_{t,d}} q(z_{t,d,n} | φ_{t,d,n}) ).    (3)

In the commonly used mean-field approximation, each latent variable is considered independently of the others. In the variational distribution of {β_{k,1}, …, β_{k,T}}, however, we retain the sequential structure of the topic by positing a dynamic model with Gaussian "variational observations" {β̂_{k,1}, …, β̂_{k,T}}. These parameters are fit to minimize the KL divergence between the resulting posterior, which is Gaussian, and the true posterior, which is not Gaussian. (A similar technique for Gaussian processes is described in Snelson and Ghahramani, 2006.)

The variational distribution of the document-level latent variables follows the same form as in Blei et al. (2003). Each proportion vector θ_{t,d} is endowed with a free Dirichlet parameter γ_{t,d}, each topic indicator z_{t,d,n} is endowed with a free multinomial parameter φ_{t,d,n}, and optimization proceeds by coordinate ascent. The updates for the document-level variational parameters have a closed form; we use the conjugate gradient method to optimize the topic-level variational observations. The resulting variational approximation for the natural topic parameters {β_{k,1}, …, β_{k,T}} incorporates the time dynamics; we describe one approximation based on a Kalman filter, and a second based on wavelet regression.

Figure 2. A graphical representation of the variational approximation for the time series topic model of Figure 1. The variational parameters β̂ and α̂ are thought of as the outputs of a Kalman filter, or as observed data in a nonparametric regression setting.

3.1. Variational Kalman Filtering

The view of the variational parameters as outputs is based on the symmetry properties of the Gaussian density, f_{μ,Σ}(x) = f_{x,Σ}(μ), which enables the use of the standard forward-backward calculations for linear state space models. The graphical model and its variational approximation are shown in Figure 2. Here the triangles denote variational parameters; they can be thought of as "hypothetical outputs" of the Kalman filter, to facilitate calculation.

To explain the main idea behind this technique in a simpler setting, consider the model where unigram models β_t (in the natural parameterization) evolve over time. In this model there are no topics and thus no mixing parameters. The calculations are simpler versions of those we need for the more general latent variable models, but exhibit the essential features. Our state space model is

    β_t | β_{t−1} ~ N(β_{t−1}, σ² I)
    w_{t,n} | β_t ~ Mult(π(β_t)),

and we form the variational state space model where

    β̂_t | β_t ~ N(β_t, ν̂_t² I).

The variational parameters are β̂_t and ν̂_t. Using standard Kalman filter calculations (Kalman, 1960), the forward mean and variance of the variational posterior are given by

    m_t ≡ E(β_t | β̂_{1:t}) = (ν̂_t² / (V_{t−1} + σ² + ν̂_t²)) m_{t−1} + (1 − ν̂_t² / (V_{t−1} + σ² + ν̂_t²)) β̂_t
    V_t ≡ E((β_t − m_t)² | β̂_{1:t}) = (ν̂_t² / (V_{t−1} + σ² + ν̂_t²)) (V_{t−1} + σ²),

with initial conditions specified by fixed m_0 and V_0. The backward recursion then calculates the marginal mean and variance of β_t given β̂_{1:T} as

    m̃_{t−1} ≡ E(β_{t−1} | β̂_{1:T}) = (σ² / (V_{t−1} + σ²)) m_{t−1} + (1 − σ² / (V_{t−1} + σ²)) m̃_t
    Ṽ_{t−1} ≡ E((β_{t−1} − m̃_{t−1})² | β̂_{1:T}) = V_{t−1} + (V_{t−1} / (V_{t−1} + σ²))² (Ṽ_t − (V_{t−1} + σ²)),

with initial conditions m̃_T = m_T and Ṽ_T = V_T. We approximate the posterior p(β_{1:T} | w_{1:T}) using the state space posterior q(β_{1:T} | β̂_{1:T}). From Jensen's inequality, the log likelihood is bounded from below as

    log p(d_{1:T}) ≥ ∫ q(β_{1:T} | β̂_{1:T}) log [ p(β_{1:T}) p(d_{1:T} | β_{1:T}) / q(β_{1:T} | β̂_{1:T}) ] dβ_{1:T}    (4)
                  = E_q log p(β_{1:T}) + Σ_{t=1}^T E_q log p(d_t | β_t) + H(q).    (5)

Details of optimizing this bound are given in an appendix.
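To make the forward-backward recursions concrete, the following sketch smooths a single word's sequence of variational observations using the scalar version of the updates above. It is a minimal illustration, assuming hypothetical values for σ², ν̂_t², m_0, and V_0; the full algorithm applies these recursions per word and per topic inside a gradient-based optimization of β̂.

import numpy as np

def kalman_smooth(beta_hat, nu2, sigma2, m0=0.0, V0=1.0):
    """Forward-backward recursions for the variational state space model.
    beta_hat: variational observations, one scalar per time slice.
    nu2: variational observation variances per slice.
    sigma2: state-evolution variance (a modeling choice)."""
    T = len(beta_hat)
    m = np.zeros(T + 1)
    V = np.zeros(T + 1)
    m[0], V[0] = m0, V0
    # Forward filter: m_t, V_t condition on beta_hat_{1:t}.
    for t in range(1, T + 1):
        gain = nu2[t - 1] / (V[t - 1] + sigma2 + nu2[t - 1])
        m[t] = gain * m[t - 1] + (1.0 - gain) * beta_hat[t - 1]
        V[t] = gain * (V[t - 1] + sigma2)
    # Backward smoother: m_tilde_t, V_tilde_t condition on beta_hat_{1:T}.
    m_tilde = np.zeros(T + 1)
    V_tilde = np.zeros(T + 1)
    m_tilde[T], V_tilde[T] = m[T], V[T]
    for t in range(T, 0, -1):
        shrink = sigma2 / (V[t - 1] + sigma2)
        m_tilde[t - 1] = shrink * m[t - 1] + (1.0 - shrink) * m_tilde[t]
        ratio = V[t - 1] / (V[t - 1] + sigma2)
        V_tilde[t - 1] = V[t - 1] + ratio**2 * (V_tilde[t] - (V[t - 1] + sigma2))
    return m_tilde[1:], V_tilde[1:]

# Hypothetical smoothing of noisy variational observations for one word.
beta_hat = np.array([0.1, 0.3, 0.2, 0.8, 0.7, 0.9])
m_tilde, V_tilde = kalman_smooth(beta_hat, nu2=np.full(6, 0.5), sigma2=0.05)
print(np.round(m_tilde, 3))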
3.2. Variational Wavelet Regression

The variational Kalman filter can be replaced with variational wavelet regression; for a readable introduction to standard wavelet methods, see Wasserman (2006). We rescale time so it is between 0 and 1. For 128 years of Science we take n = 2^J and J = 7. To be consistent with our earlier notation, we assume that

    β̂_t = m_t + ν̂ ε_t,

where ε_t ~ N(0, 1). Our variational wavelet regression algorithm estimates {β̂_t}, which we view as observed data, just as in the Kalman filter method, as well as the noise level ν̂.

For concreteness, we illustrate the technique using the Haar wavelet basis; Daubechies wavelets are used in our actual examples. The model is then

    β̂_t = α φ(x_t) + Σ_{j=0}^{J−1} Σ_{k=0}^{2^j − 1} D_{jk} ψ_{jk}(x_t),

where x_t = t/n, φ(x) = 1 for 0 ≤ x ≤ 1,

    ψ(x) = −1 if 0 ≤ x ≤ 1/2,  1 if 1/2 < x ≤ 1,

and ψ_{jk}(x) = 2^{j/2} ψ(2^j x − k). Our variational estimate for the posterior mean becomes

    m_t = α̂ φ(x_t) + Σ_{j=0}^{J−1} Σ_{k=0}^{2^j − 1} D̂_{jk} ψ_{jk}(x_t),

where α̂ = n^{−1} Σ_{t=1}^n β̂_t, and the D̂_{jk} are obtained by thresholding the coefficients

    Z_{jk} = (1/n) Σ_{t=1}^n β̂_t ψ_{jk}(x_t).

To estimate β̂_t we use gradient ascent, as for the Kalman filter approximation, requiring the derivatives ∂m_t/∂β̂_t. If soft thresholding is used, then we have that

    ∂m_t/∂β̂_s = (∂α̂/∂β̂_s) φ(x_t) + Σ_{j=0}^{J−1} Σ_{k=0}^{2^j − 1} (∂D̂_{jk}/∂β̂_s) ψ_{jk}(x_t),

with ∂α̂/∂β̂_s = n^{−1} and

    ∂D̂_{jk}/∂β̂_s = (1/n) ψ_{jk}(x_s) if |Z_{jk}| > λ, and 0 otherwise.

Note also that |Z_{jk}| > λ if and only if |D̂_{jk}| > 0. These derivatives can be computed using off-the-shelf software for the wavelet transform in any of the standard wavelet bases.
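A minimal sketch of the wavelet smoothing step is given below, assuming the PyWavelets library, the Haar basis, and an arbitrary soft-threshold value λ; it only shows how soft-thresholded wavelet coefficients yield a smoothed estimate of a length-2^J series. The paper's method uses Daubechies wavelets and chooses β̂ by gradient ascent on the variational bound, which is omitted here.

import numpy as np
import pywt  # PyWavelets; an assumed dependency for this illustration

def wavelet_smooth(beta_hat, wavelet="haar", level=7, threshold=0.1):
    """Smooth a length-2^J series by soft-thresholding its wavelet coefficients.
    beta_hat: the "variational observations" for one word, viewed as data.
    threshold: an illustrative value for the soft-threshold lambda."""
    coeffs = pywt.wavedec(beta_hat, wavelet, level=level)
    # Keep the coarse approximation; soft-threshold the detail coefficients D_jk.
    thresholded = [coeffs[0]] + [pywt.threshold(c, threshold, mode="soft")
                                 for c in coeffs[1:]]
    return pywt.waverec(thresholded, wavelet)

# Toy series: a smooth trend plus a sharp spike, over n = 128 "years".
n = 128
t = np.arange(n) / n
beta_hat = (np.sin(2 * np.pi * t)
            + 2.0 * (np.abs(t - 0.4) < 0.01)
            + 0.2 * np.random.default_rng(0).normal(size=n))
m = wavelet_smooth(beta_hat)
print(m.shape)  # (128,)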
Sample results of running this and the Kalman variational algorithm to approximate a unigram model are given in Figure 3. Both variational approximations smooth out the local fluctuations in the unigram counts, while preserving the sharp peaks that may indicate a significant change of content in the journal. While the fit is similar to that obtained using standard wavelet regression on the (normalized) counts, the estimates are obtained by minimizing the KL divergence as in standard variational approximations.

Figure 3. Comparison of the Kalman filter (top) and wavelet regression (bottom) variational approximations to a unigram model. The variational approximations (red and blue curves) smooth out the local fluctuations in the unigram counts (gray curves) of the words shown, while preserving the sharp peaks that may indicate a significant change of content in the journal. The wavelet regression is able to "superresolve" the double spikes in the occurrence of Einstein in the 1920s. (The spike in the occurrence of Darwin near 1910 may be associated with the centennial of Darwin's birth in 1809.)

In the dynamic topic model of Section 2, the algorithms are essentially the same as those described above. However, rather than fitting the observations from true observed counts, we fit them from expected counts under the document-level variational distributions in (3).

4. Analysis of Science

We analyzed a subset of 30,000 articles from Science, 250 from each of the 120 years between 1881 and 1999. Our data were collected by JSTOR (www.jstor.org), a not-for-profit organization that maintains an online scholarly archive obtained by running an optical character recognition (OCR) engine over the original printed journals. JSTOR indexes the resulting text and provides online access to the scanned images of the original content through keyword search. Our corpus is made up of approximately 7.5 million words.

We pruned the vocabulary by stemming each term to its root, removing function terms, and removing terms that occurred fewer than 25 times. The total vocabulary size is 15,955. To explore the corpus and its themes, we estimated a 20-component dynamic topic model. Posterior inference took approximately 4 hours on a 1.5 GHz PowerPC Macintosh laptop. Two of the resulting topics are illustrated in Figure 4, showing the top several words from those topics in each decade, according to the posterior mean number of occurrences as estimated using the Kalman filter variational approximation. Also shown are example articles which exhibit those topics through the decades. As illustrated, the model captures different scientific themes, and can be used to inspect trends of word usage within them.

To validate the dynamic topic model quantitatively, we consider the task of predicting the next year of Science given all the articles from the previous years. We compare the predictive power of three 20-topic models: the dynamic topic model estimated from all of the previous years, a static topic model estimated from all of the previous years, and a static topic model estimated from the single previous year. All the models are estimated to the same convergence criterion. The topic model estimated from all the previous data and the dynamic topic model are initialized at the same point.

The dynamic topic model performs well; it always assigns higher likelihood to the next year's articles than the other two models (Figure 5). It is interesting that the predictive power of each of the models declines over the years. We can tentatively attribute this to an increase in the rate of specialization in scientific language.
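For illustration, a vocabulary-pruning step in the spirit of the preprocessing described above might look like the following sketch; the tokenization, the choice of the Porter stemmer, the stop-word list, and the input format are all assumptions, not the pipeline actually used for the Science corpus.

from collections import Counter
from nltk.stem import PorterStemmer  # an assumed choice of stemmer

def prune_vocabulary(docs, min_count=25,
                     function_words=frozenset({"the", "of", "and", "a", "in", "to"})):
    """Stem terms, drop function words, and drop terms occurring fewer than
    min_count times across the corpus. docs: list of token lists."""
    stemmer = PorterStemmer()
    stemmed = [[stemmer.stem(w.lower()) for w in doc if w.lower() not in function_words]
               for doc in docs]
    counts = Counter(w for doc in stemmed for w in doc)
    vocab = {w for w, c in counts.items() if c >= min_count}
    return [[w for w in doc if w in vocab] for doc in stemmed], sorted(vocab)

# Usage (hypothetical input): pruned_docs, vocab = prune_vocabulary(tokenized_articles)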
Figure 4. Examples from the posterior analysis of a 20-topic dynamic model estimated from the Science corpus. For two topics, we illustrate: (a) the top ten words from the inferred posterior distribution at ten-year lags; (b) the posterior estimate of the frequency as a function of year of several words from the same two topics; (c) example articles throughout the collection which exhibit these topics. Note that the plots are scaled to give an idea of the shape of the trajectory of the words' posterior probability (i.e., comparisons across words are not meaningful).

5. Discussion

We have developed sequential topic models for discrete data by using Gaussian time series on the natural parameters of the multinomial topics and logistic normal topic proportion models. We derived variational inference algorithms that exploit existing techniques for sequential data; we demonstrated a novel use of Kalman filters and wavelet regression as variational approximations. Dynamic topic models can give a more accurate predictive model, and also offer new ways of browsing large, unstructured document collections.

There are many ways that the work described here can be extended. One direction is to use more sophisticated state space models. We have demonstrated the use of a simple Gaussian model, but it would be natural to include a drift term in a more sophisticated autoregressive model to explicitly capture the rise and fall in popularity of a topic, or in the use of specific terms. Another variant would allow for heteroscedastic time series.

Perhaps the most promising extension to the methods presented here is to incorporate a model of how new topics in the collection appear or disappear over time, rather than assuming a fixed number of topics. One possibility is to use a simple Galton-Watson or birth-death process for the topic population. While the analysis of birth-death or branching processes often centers on extinction probabilities, here a goal would be to find documents that may be responsible for spawning new themes in a collection.
Figure 5. This figure illustrates the performance of using dynamic topic models and static topic models for prediction. For each year between 1900 and 2000 (at 5-year increments), we estimated three models on the articles through that year. We then computed the variational bound on the negative log likelihood of next year's articles under the resulting model (lower numbers are better). DTM is the dynamic topic model; LDA-prev is a static topic model estimated on just the previous year's articles; LDA-all is a static topic model estimated on all the previous articles.

Acknowledgments

This research was supported in part by NSF grants IIS-0312814 and IIS-0427206, the DARPA CALO project, and a grant from Google.

References

Aitchison, J. (1982). The statistical analysis of compositional data. Journal of the Royal Statistical Society, Series B, 44(2):139-177.

Blei, D., Ng, A., and Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022.

Blei, D. M. and Lafferty, J. D. (2006). Correlated topic models. In Weiss, Y., Schölkopf, B., and Platt, J., editors, Advances in Neural Information Processing Systems 18. MIT Press, Cambridge, MA.

Buntine, W. and Jakulin, A. (2004). Applying discrete PCA in data analysis. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pages 59-66. AUAI Press.

Erosheva, E. (2002). Grade of membership and latent structure models with application to disability survey data. PhD thesis, Carnegie Mellon University, Department of Statistics.

Fei-Fei, L. and Perona, P. (2005). A Bayesian hierarchical model for learning natural scene categories. IEEE Computer Vision and Pattern Recognition.

Griffiths, T. and Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101:5228-5235.

Kalman, R. (1960). A new approach to linear filtering and prediction problems. Transaction of the AMSE: Journal of Basic Engineering, 82:35-45.

McCallum, A., Corrada-Emmanuel, A., and Wang, X. (2004). The author-recipient-topic model for topic and role discovery in social networks: Experiments with Enron and academic email. Technical report, University of Massachusetts, Amherst.

Pritchard, J., Stephens, M., and Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics, 155:945-959.

Rosen-Zvi, M., Griffiths, T., Steyvers, M., and Smith, P. (2004). The author-topic model for authors and documents. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pages 487-494. AUAI Press.

Sivic, J., Russell, B., Efros, A., Zisserman, A., and Freeman, W. (2005). Discovering objects and their location in images. In International Conference on Computer Vision (ICCV 2005).

Snelson, E. and Ghahramani, Z. (2006). Sparse Gaussian processes using pseudo-inputs. In Weiss, Y., Schölkopf, B., and Platt, J., editors, Advances in Neural Information Processing Systems 18. MIT Press, Cambridge, MA.

Wasserman, L. (2006). All of Nonparametric Statistics. Springer.

West, M. and Harrison, J. (1997). Bayesian Forecasting and Dynamic Models. Springer.

A. Derivation of Variational Algorithm

In this appendix we give some details of the variational algorithm outlined in Section 3.1, which calculates a distribution q(β_{1:T} | β̂_{1:T}) to maximize the lower bound on log p(d_{1:T}).
The first term of the right-hand side of (5) is

    Σ_{t=1}^T E_q log p(β_t | β_{t−1})
        = −(VT/2)(log σ² + log 2π) − (1/(2σ²)) Σ_{t=1}^T E_q (β_t − β_{t−1})ᵀ(β_t − β_{t−1})
        = −(VT/2)(log σ² + log 2π) − (1/(2σ²)) Σ_{t=1}^T ||m̃_t − m̃_{t−1}||² − (1/σ²) Σ_{t=1}^T Tr(Ṽ_t) − (1/(2σ²)) (Tr(Ṽ_0) − Tr(Ṽ_T)),

using the Gaussian quadratic form identity

    E_{m,V} (x − μ)ᵀ Σ⁻¹ (x − μ) = (m − μ)ᵀ Σ⁻¹ (m − μ) + Tr(Σ⁻¹ V).

The second term of (5) is

    Σ_{t=1}^T E_q log p(d_t | β_t)
        = Σ_{t=1}^T ( Σ_w n_{t,w} E_q β_{t,w} − n_t E_q log Σ_w exp(β_{t,w}) )
        ≥ Σ_{t=1}^T ( Σ_w n_{t,w} m̃_{t,w} − n_t ζ̂_t⁻¹ Σ_w exp(m̃_{t,w} + Ṽ_{t,w}/2) + n_t − n_t log ζ̂_t ),

where n_t = Σ_w n_{t,w}, introducing additional variational parameters ζ̂_{1:T}. The third term of (5) is the entropy

    H(q) = Σ_{t=1}^T ( (1/2) log |Ṽ_t| + (V/2) log 2π )
         = (1/2) Σ_{t=1}^T Σ_w log Ṽ_{t,w} + (TV/2) log 2π.

To maximize the lower bound as a function of the variational parameters we use a conjugate gradient algorithm. First, we maximize with respect to ζ̂; the derivative is

    ∂ℓ/∂ζ̂_t = (n_t / ζ̂_t²) Σ_w exp(m̃_{t,w} + Ṽ_{t,w}/2) − n_t / ζ̂_t.

Setting to zero and solving for ζ̂_t gives

    ζ̂_t = Σ_w exp(m̃_{t,w} + Ṽ_{t,w}/2).

Next, we maximize with respect to β̂_{s,w}:

    ∂ℓ(β̂, ν̂)/∂β̂_{s,w} = −(1/σ²) Σ_{t=1}^T (m̃_{t,w} − m̃_{t−1,w}) ( ∂m̃_{t,w}/∂β̂_{s,w} − ∂m̃_{t−1,w}/∂β̂_{s,w} )
                          + Σ_{t=1}^T ( n_{t,w} − n_t ζ̂_t⁻¹ exp(m̃_{t,w} + Ṽ_{t,w}/2) ) ∂m̃_{t,w}/∂β̂_{s,w}.

The forward-backward equations for m̃_t can be used to derive a recurrence for ∂m̃_t/∂β̂_s. The forward recurrence is

    ∂m_t/∂β̂_s = (ν̂_t² / (V_{t−1} + σ² + ν̂_t²)) ∂m_{t−1}/∂β̂_s + (1 − ν̂_t² / (V_{t−1} + σ² + ν̂_t²)) δ_{s,t},

with the initial condition ∂m_0/∂β̂_s = 0. The backward recurrence is then

    ∂m̃_{t−1}/∂β̂_s = (σ² / (V_{t−1} + σ²)) ∂m_{t−1}/∂β̂_s + (1 − σ² / (V_{t−1} + σ²)) ∂m̃_t/∂β̂_s,

with the initial condition ∂m̃_T/∂β̂_s = ∂m_T/∂β̂_s.
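As a rough illustration of how these pieces fit together, the sketch below evaluates the closed-form ζ̂_t update and the likelihood term of the bound together with its gradient with respect to m̃ (the factor that multiplies ∂m̃_{t,w}/∂β̂_{s,w} above). Array shapes and values are hypothetical, and the conjugate gradient loop over β̂, which requires the ∂m̃/∂β̂ recurrences, is omitted.

import numpy as np

def zeta_update(m_tilde, V_tilde):
    """Closed-form update: zeta_t = sum_w exp(m_tilde_{t,w} + V_tilde_{t,w}/2)."""
    return np.exp(m_tilde + V_tilde / 2.0).sum(axis=1)

def likelihood_bound_and_grad(n, m_tilde, V_tilde):
    """Likelihood term of the bound and its gradient with respect to m_tilde,
    holding zeta fixed at its current value (coordinate-wise optimization).
    n: T x V matrix of word counts per slice; m_tilde, V_tilde: T x V arrays."""
    n_t = n.sum(axis=1)                        # total word count per slice
    zeta = zeta_update(m_tilde, V_tilde)       # T-vector
    expm = np.exp(m_tilde + V_tilde / 2.0)     # T x V
    bound = ((n * m_tilde).sum()
             - (n_t / zeta * expm.sum(axis=1)).sum()
             + n_t.sum() - (n_t * np.log(zeta)).sum())
    grad_m = n - (n_t / zeta)[:, None] * expm  # d(bound)/d(m_tilde), zeta fixed
    return bound, grad_m

# Hypothetical smoothed estimates for T = 3 slices and V = 4 words.
rng = np.random.default_rng(0)
n = rng.integers(0, 10, size=(3, 4)).astype(float)
m_tilde = rng.normal(size=(3, 4))
V_tilde = np.full((3, 4), 0.1)
print(likelihood_bound_and_grad(n, m_tilde, V_tilde)[0])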