Efficient Stepwise Selection in Decomposable Models

Amol Deshpande
Computer Science Department
University of California
Berkeley, CA 94720
amol@cs.berkeley.edu

Minos Garofalakis
Bell Laboratories
600 Mountain Avenue
Murray Hill, NJ 07974
minos@research.bell-labs.com

Michael I. Jordan
Computer Science & Statistics
University of California
Berkeley, CA 94720
jordan@cs.berkeley.edu

(Part of the work was done while the author was visiting Bell Laboratories.)

Abstract

In this paper, we present an efficient algorithm for performing stepwise selection in the class of decomposable models. We focus on the forward selection procedure, but we also discuss how backward selection and the combination of the two can be performed efficiently. The main contributions of this paper are (1) a simple characterization for the edges that can be added to a decomposable model while retaining its decomposability and (2) an efficient algorithm for enumerating all such edges for a given decomposable model in $O(n^2)$ time, where $n$ is the number of variables in the model. We also analyze the complexity of the overall stepwise selection procedure (which includes the complexity of enumerating eligible edges as well as the complexity of deciding how to "progress"). We use the KL divergence of the model from the saturated model as our metric, but the results we present here extend to many other metrics as well.

1 Introduction

Undirected graphical models have become increasingly popular in areas such as information retrieval, statistical natural language processing, and vision, where they are often referred to as maximum entropy models, and are viewed as having various representational and statistical advantages. New tools for model selection and parameter estimation are being developed by researchers in these areas [PPL97, Hin99, ZWM97]. General undirected models, however, have some serious disadvantages; in particular, they require an invocation of Iterative Proportional Fitting (or related iterative algorithms) to find maximum likelihood estimates, even in the case of fully-observed graphs. As the inner loop of more general parameter estimation or model selection procedures (e.g., the M step of an EM algorithm), these iterative algorithms can impose serious bottlenecks.

Decomposable models are a restricted family of undirected graphical models that have a number of appealing features: (1) maximum likelihood estimates can be calculated analytically from marginal probabilities, obviating the need for Iterative Proportional Fitting, (2) closed form expressions for test statistics can be found, and (3) there are several useful links to directed models (every decomposable model has a representation as either an undirected or a directed model), inference algorithms (decomposable models are equivalent to triangulated graphs), and acyclic database schemes [BFMY83]. Decomposable models would therefore seem to provide a usefully constrained representation in which model selection and parameter estimation methods can be deployed in large-scale problems. Moreover, mixtures of decomposable models provide a natural upgrade path if the representational strictures of decomposable models are considered too severe.

Although decomposable models are a subclass of undirected graphical models, the problem of finding the optimal decomposable model for a given data sample is known to be intractable and heuristic search techniques are generally used [Edw95]. Most procedures are based on some combination of (i) forward selection, in which we start with a small model and add edges as long as an appropriate score function increases [Hec98], and (ii) backward selection, where
starting with a larger model, edges are deleted from the model. Since the intervening models encountered in the search must also be decomposable, care must be taken such that deletion or addition of edges does not result in a non-decomposable model. Backward selection procedures for decomposable models are well known in the literature [Wer76, Lau96], but efficient forward selection procedures have not yet been developed. One of the goals of the current paper is to fill this gap.

This paper is a theoretical paper that makes two main contributions. First, we provide a simple characterization of the edges that can be added to a decomposable model (or, equivalently, the chordal graph corresponding to the model) while resulting in another decomposable model. Second, based on this characterization we present an efficient algorithm for enumerating all such edges for the current model in $O(n^2)$ time, where $n$ is the number of attributes. We provide a careful analysis of the running time complexity of the overall forward selection procedure, including the time taken for choosing which of the eligible edges to add to the current model. We use the minimization of KL divergence as our metric, but the results we present can be extended to any other locally computable metric (e.g., [GG99]). Though our main focus is the new forward selection procedure, we also show that the algorithms are easily extended to backward selection or to a combination of forward and backward selection. The techniques and data structures we propose also naturally extend to the problem of finding strongly decomposable models in mixed graphs.

The remainder of the paper is organized as follows. In Section 2, we derive a simple characterization for the edges that can be added to a chordal graph while maintaining its chordality. In Section 3, we describe the data structure that we use for finding such edges efficiently and discuss how it is maintained in the presence of additions to the underlying model graph. In Section 4, we analyze the overall complexity of the stepwise selection procedures for the KL divergence metric. In Section 5, we briefly discuss how the data structures can be extended for doing backward selection, as might be required for a procedure that alternates between forward and backward selection. In Section 6, we discuss how these algorithms can be extended for the case of mixed graphs and strong decomposability. We conclude with Section 7.

[Figure 1: Structure of the subgraph induced by … and …]

2 Characterizing Edges Eligible for Stepwise Selection

There is a classical characterization of the edges that can be deleted from a chordal graph such that the resulting graph remains chordal:

Theorem 2.1 [Wer76, Lau96] Given a chordal graph, an edge can be deleted from the graph while maintaining its chordality iff the edge belongs to exactly one of the maximal cliques of the graph.

To complement this result, we propose the following characterization of the edges that can be added to a chordal graph without violating its chordality (the proof is given in Appendix A):

Theorem 2.2 Given a chordal graph $G = (V, E)$, an edge $(a, b) \notin E$ can be added to the graph while maintaining its chordality iff it satisfies the following properties:

1. There exists a subset of nodes, $S \subset V$, such that $a$ and $b$ are connected to all vertices in $S$.
2. The set $S$ is the minimal separator for $a$ and $b$ (note that, since $G$ is chordal, this implies that $S$ is a clique); i.e., removing the nodes in $S$ and all their incident edges from $G$ separates $a$ and $b$, and no proper subset of $S$ has this property.

3 Enumerating Eligible Edges for Forward Selection

In this section, we describe how to enumerate all edges that can be added to the current chordal graph while maintaining chordality. (For simplicity, we use the terms decomposable model and chordal graph interchangeably.) For this purpose, we
maintain two auxiliary data structures: (i) a clique graph [GHP95] corresponding to the current chordal graph, and (ii) a boolean matrix, $M$, indexed by the attributes of the data, which maintains information about whether a pair of nodes is eligible for addition. Clearly, using $M$, we can enumerate the edges eligible for forward selection in $O(n^2)$ time, where $n$ is the number of attributes. The clique graph is required to update $M$ in $O(n^2)$ time when an edge is added to the underlying graph, as we will see later in the section.

3.1 Clique Graph

3.1.1 Definition and Properties

Definition: A clique graph of a chordal graph $G = (V, E)$, denoted by $CG$, has the maximal cliques of the chordal graph as its vertices, and has the property that given any two maximal cliques, $C_1$ and $C_2$, there is an edge between these two cliques iff $C_1 \cap C_2$ separates the node sets $C_1 \setminus C_2$ and $C_2 \setminus C_1$.

Note that this usage of the term "clique graph" is non-standard; we are following the terminology of [GHP95]. We will note some properties of clique graphs.

Lemma 3.1 Two nodes $a, b$ with $(a, b) \notin E$ satisfy the forward selection characterization (Theorem 2.2) iff there exists an edge $(C_1, C_2)$ in $CG$ such that $a \in C_1 \setminus C_2$ and $b \in C_2 \setminus C_1$.

Lemma 3.2 [GHP95] A maximum spanning tree of a clique graph of a chordal graph, where the weight of an edge is the size of the intersection of the two cliques it joins, is a junction tree of the chordal graph.

Lemma 3.3 The number of nodes in a clique graph of a chordal graph $G = (V, E)$ is at most $|V|$.

3.2 Updating the Clique Graph

After all possible edges are enumerated and one edge is chosen based on a model selection criterion, we need to update the clique graph to reflect the addition of this new edge to the underlying chordal graph. In this section, we show how this can be done in $O(n^2)$ time.
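As a point of reference for the characterization of Section 2 and the eligibility matrix of Section 3, the following Python sketch fills such a matrix by brute force. It rests on an observation that is our reading of the two conditions rather than a statement from the text: a set $S$ satisfying both conditions must equal the set of common neighbors of $a$ and $b$, so it suffices to test whether the common neighbors separate $a$ from $b$. The sketch does not use the clique graph and is much slower than the $O(n^2)$ enumeration discussed above. All function names (`separates`, `eligible_matrix`, `is_chordal`, `with_edge`) and the adjacency-dictionary representation are hypothetical choices made for illustration.

```python
from itertools import combinations


def separates(adj, sep, a, b):
    """Return True if deleting the vertex set `sep` disconnects a from b."""
    seen, stack = {a}, [a]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w == b:
                return False
            if w not in sep and w not in seen:
                seen.add(w)
                stack.append(w)
    return True


def eligible_matrix(adj):
    """M[a][b] is True iff the non-edge (a, b) passes the separator test.

    Assumption: the candidate separator is the common neighborhood of a and b.
    """
    nodes = sorted(adj)
    M = {u: {v: False for v in nodes} for u in nodes}
    for a, b in combinations(nodes, 2):
        if b in adj[a]:
            continue  # already an edge, not a candidate for addition
        if separates(adj, adj[a] & adj[b], a, b):
            M[a][b] = M[b][a] = True
    return M


def is_chordal(adj):
    """Naive chordality check: repeatedly delete a simplicial vertex."""
    g = {u: set(nb) for u, nb in adj.items()}
    while g:
        for u, nb in g.items():
            # u is simplicial if its neighbors are pairwise adjacent
            if all(y in g[x] for x, y in combinations(nb, 2)):
                break
        else:
            return False  # no simplicial vertex left, so not chordal
        for x in g[u]:
            g[x].discard(u)
        del g[u]
    return True


def with_edge(adj, a, b):
    """Return a copy of the graph with the edge (a, b) added."""
    g = {u: set(nb) for u, nb in adj.items()}
    g[a].add(b)
    g[b].add(a)
    return g


# Example: the path 1 - 2 - 3 - 4 (a tree, hence chordal).
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
M = eligible_matrix(path)
print([(a, b) for a, b in combinations(sorted(path), 2) if M[a][b]])
# prints [(1, 3), (2, 4)]; adding (1, 4) would create a chordless 4-cycle
```

The `is_chordal` helper is only there to cross-check the separator test on small graphs by actually adding each candidate edge; it plays no role in the algorithm of this paper.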
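Note also that the clique graph of a chordal graph is equivalent to an Almond Junction Tree [AK93, JJ94] of the graph; both of these structures can be seen as compact representations of all possible junction trees of a chordal graph.

3.2.1 Update Algorithm

Let $G = (V, E)$ be the original chordal graph and let $CG$ be the clique graph. Let the new edge that is added be $(a, b)$. Also, let $(C_a, C_b)$ be the corresponding edge in the clique graph (the edge from which this pair of nodes was obtained) and let $S_{ab} = C_a \cap C_b$. Finally, let $G'$ and $CG'$ be the new model and clique graphs.

Addition of the edge $(a, b)$ creates a new maximal clique $C_{ab} = S_{ab} \cup \{a, b\}$, as shown in Figure 2. It is possible that $C_a \subset C_{ab}$ (or $C_b \subset C_{ab}$), in which case $C_a$ (or $C_b$) will not be a maximal clique in the new chordal graph. We assume for now that this does not happen. In that case, adding the edge $(a, b)$ to $G$ results in a partial clique graph structure as shown in Figure 2. Note that the new clique will be connected to $C_a$ and $C_b$.

[Figure 2: Partial Structure of the Clique Graph after adding the edge $(a, b)$]

After this node has been created and added to the clique graph, the update algorithm has to delete those edges from the clique graph which do not satisfy the clique graph property (Section 3.1.1). It is easy to see that no new edges between already existing maximal cliques in the clique graph will have to be added. Given this, the update algorithm is as follows (see Appendix B for the correctness proof):

1. Let $G \setminus S_{ab}$ denote the graph obtained from $G$ by removing the nodes in $S_{ab}$ and their incident edges. Find all nodes that are connected to $a$ and to $b$ in this graph, and maintain this information in two arrays indexed by the vertices of $G$.

2. Deciding whether to keep an edge $(C_1, C_2)$: Let $S_{12} = C_1 \cap C_2$. If $S_{12} \neq S_{ab}$, then keep this edge. Otherwise, consider the graph $G \setminus S_{ab}$. If $C_1$ is connected to $a$ in this graph and $C_2$ is connected to $b$ (or vice versa), do not keep this edge in $CG'$. Otherwise, keep it. This check can be performed in $O(1)$ time using the arrays constructed in Step 1.

3. Adding edges involving $C_{ab}$:
   1. For every maximal clique $C$ such that … (after execution of the above step), add an edge $(C, C_{ab})$ to $CG'$ if ….
   2. For every maximal clique $C$ such that …, explicitly check if this edge should be added to the clique graph. This requires checking whether … is separated from …, which can be done by looking up the arrays computed in Step 1.
   3. Repeat the above two steps for $C_b$ and $b$ instead of $C_a$ and $a$.

4. Remove $C_a$ (or $C_b$) if it is contained in $C_{ab}$.

    $G = (V, E)$, $CG$: current model & clique graphs
    $G' = (V, E')$, $CG'$: new model & clique graphs
    begin
    1.  using $M$, choose which of the edges to add to $G$
    2.  let $(a, b)$ be the new edge to be added, let $(C_a, C_b)$ be the
        corresponding edge in $CG$ and let $S_{ab} = C_a \cap C_b$
    …   find all nodes connected to $a$ and to $b$ in $G \setminus S_{ab}$ and maintain this
        information in two arrays indexed by the vertices of $G = (V, E)$
    6.  delete $(C_a, C_b)$ from the clique graph, add a new node $C_{ab}$ and
        add edges $(C_a, C_{ab})$ and $(C_{ab}, C_b)$
    …   for all edges $(C_1, C_2)$ with separator equal to $S_{ab}$
    …       if $C_1$ is connected to $a$ and $C_2$ is connected to $b$ or vice versa
    10.         delete the edge $(C_1, C_2)$ from the clique graph
    …   for all $C$ s.t. …
    14.     add the edge $(C, C_{ab})$ to the clique graph
    …   for all $C$ s.t. …
    16.     check if … separates … from …
    17.     if yes, add the edge $(C, C_{ab})$ to the clique graph
    18. repeat Steps 11-17 for $C_b$ and $b$ instead of $C_a$ and $a$
    19. remove $C_a$ (similarly, $C_b$) from the clique graph …
    20. update $M$ as described in Section 3.3

Figure 3: Complete Forward Selection Algorithm

3.2.2 Complexity Analysis

It is easy to see that Steps 1 and 3 can be performed in $O(n^2)$ time, but comparison of two sets to check if they are equal is $O(n)$ in general. As such, the complexity of Step 2 might seem to be $O(n^3)$. (In many cases, model search is performed among models with a restricted maximum clique size. In that case, set comparisons take constant time and the data structure we describe next is not required.)

To perform this step efficiently, we maintain a binary index on the minimal separators of the current model graph. A leaf of this index contains a list of pointers to all the edges that have associated with them the minimal separator corresponding to the leaf. Now, given $S_{ab}$, we can find all the edges with separator equal to $S_{ab}$ in $O(n)$ time (since the height of the index is $O(n)$). As we mentioned in Step 2, we only need to check such edges since the rest of the edges remain untouched. The deletions, if required, can be performed in $O(1)$ time per deletion. Thus, given this data structure, Step 2 can be performed in time $O(n^2)$ (actually, in order … time, where … is the number of edges in … separator …). Insertions in this index can be done in $O(n)$ time, by adding a pointer to the new edge at the front of the list of pointers. Since there can be at most $n$ insertions to this index in Step 3, the complexity of Step 3 remains $O(n^2)$.

3.3 Updating $M$

$M$ is updated as follows:

1. Set the entry $M(a, b)$ to false, since this edge is not eligible for addition anymore.
2. Since some edges are deleted from the clique graph, the corresponding pairs of nodes that might have been eligible for addition before may not be eligible anymore. Noting that all such deleted clique graph edges must have the separator $S_{ab}$, we set the entry corresponding to pairs $(u, v)$ such that $u$ is connected to $a$ and $v$ is connected to $b$, to false. This can be done by using the arrays computed during the clique graph update algorithm in $O(n^2)$ time.
3. Some edges of the form $(a, \cdot)$ or $(b, \cdot)$ may be now eligible for addition. To set the corresponding entries to true, for every edge that is added to the clique graph, $(C_{ab}, C)$, set to true the entries corresponding to …. Similarly for … (please refer to Theorem 4.3 for the proof that these include all the edges newly eligible for addition).

Thus, $M$ can also be updated in $O(n^2)$ time.

4 Complexity of Overall Forward Selection Procedure

The overall forward selection procedure has two parts:

1. Enumerating all edges that can be added to the current model.
2. Deciding which of the eligible edges to add to the model for "forward progress".

Thus far, we have presented an $O(n^2)$ solution to the first problem, giving us an overall complexity of $O(kn^2)$ if a total of $k$ edges are added starting with the null model. In this section, we analyze the complexity of the second step with the aim of minimizing KL divergence from the saturated model, which is equivalent to maximum likelihood estimation.

Theorem 4.1 [Mal91] Minimizing the KL divergence of a given model from the saturated model is equivalent to minimizing the entropy of the model, which is defined as:

$$H(\mathcal{M}) = \sum_{C \in \mathcal{C}} H(C) - \sum_{S \in \mathcal{S}} H(S),$$

where $\mathcal{C}$ is the set of maximal cliques of the model, and $\mathcal{S}$ is the set of separators for a junction tree of the model. The entropy of a subset of attributes, $(A_1, \ldots, A_k)$, is defined to be:

$$H(A_1, \ldots, A_k) = -\sum_{x \in D(A_1) \times \cdots \times D(A_k)} \frac{n(x)}{N} \log \frac{n(x)}{N},$$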
where $D(A_i)$ denotes the domain of attribute $A_i$, $n(x)$ denotes the number of data items which contain $x$, and $N$ is the total number of data points.

Thus, the only information we need from a decomposable model, to compute its divergence from the saturated model, is a list of its maximal cliques and a list of separators in some junction tree for the model. Many metrics for graphical models have a similar property, and this property can be used to avoid recomputation of a junction tree for the new model obtained after an incremental change to the current model:

Theorem 4.2 If two decomposable models $\mathcal{M}$ and $\mathcal{M}'$ differ only in one edge $(a, b)$ (i.e., $(a, b) \notin \mathcal{M}$ and $(a, b) \in \mathcal{M}'$), then the maximal cliques and the minimal separators in junction trees corresponding to the two models, $(\mathcal{C}, \mathcal{S})$ and $(\mathcal{C}', \mathcal{S}')$, where $\mathcal{C}, \mathcal{C}'$ are the lists of maximal cliques and $\mathcal{S}, \mathcal{S}'$ are the lists of separators, differ as follows (writing $S$ for the minimal separator of $a$ and $b$ in $\mathcal{M}$ and $C_{ab} = S \cup \{a, b\}$):

1. If $S \cup \{a\} \notin \mathcal{C}$ and $S \cup \{b\} \notin \mathcal{C}$, then $\mathcal{C}' = \mathcal{C} + C_{ab}$ and $\mathcal{S}' = \mathcal{S} + (S \cup \{a\}) + (S \cup \{b\}) - S$.
2. If $S \cup \{a\} \in \mathcal{C}$ and $S \cup \{b\} \notin \mathcal{C}$, then $\mathcal{C}' = \mathcal{C} + C_{ab} - (S \cup \{a\})$ and $\mathcal{S}' = \mathcal{S} + (S \cup \{b\}) - S$.
3. If $S \cup \{a\} \notin \mathcal{C}$ and $S \cup \{b\} \in \mathcal{C}$, then $\mathcal{C}' = \mathcal{C} + C_{ab} - (S \cup \{b\})$ and $\mathcal{S}' = \mathcal{S} + (S \cup \{a\}) - S$.
4. If $S \cup \{a\} \in \mathcal{C}$ and $S \cup \{b\} \in \mathcal{C}$, then $\mathcal{C}' = \mathcal{C} + C_{ab} - (S \cup \{a\}) - (S \cup \{b\})$ and $\mathcal{S}' = \mathcal{S} - S$.

This immediately gives us the following corollary:

Corollary 4.1 If two decomposable models $\mathcal{M}$ and $\mathcal{M}'$ differ only in one edge $(a, b)$ (i.e., $(a, b) \notin \mathcal{M}$ and $(a, b) \in \mathcal{M}'$), then

$$H(\mathcal{M}') - H(\mathcal{M}) = H(S \cup \{a, b\}) - H(S \cup \{a\}) - H(S \cup \{b\}) + H(S),$$

where $S$ is the minimal separator of $a$ and $b$.

Thus the change in the entropy of the model after adding (or deleting) an edge is only dependent on the minimal separator of the two vertices in the model graph and as such, this information can be pre-computed for all pairs of nodes that are eligible for forward selection. During the forward selection procedure, these "changes" can be associated with the corresponding edge in the clique graph. This also means that after adding a new edge to the model graph, we only have to compute new entropies corresponding to the new edges that are added to the clique graph. In the next theorem, we bound the total number of entropies that have to be computed after addition of an edge to the model graph:

Theorem 4.3 The number of new entropies that need to be computed after adding the edge $(a, b)$ to the underlying model graph (after performing one forward selection step) is at most $2(n_a + n_b)$, where $n_a$ and $n_b$ are the number of neighbors of $a$ and $b$ respectively.

Proof: Any new edge that is added to the clique graph must have $C_{ab}$ as one of its endpoints. Consider one such edge, $(C_{ab}, C)$, with associated separator $S$. This edge must satisfy one of the following properties (cf. Theorem B.3):

1. …: In this case, the pairs of nodes … are (possibly newly) eligible for forward selection. Each such edge requires computation of …, …, … and …. Out of these, … and … have already been computed and we only have to compute … and ….
2. …: The edge … belongs to $CG$ and also to $CG'$. This implies that pairs of nodes … were eligible for addition in $G$ as well and as such, the entropies needed for these pairs have already been computed. Thus, the only new entropies that need to be computed are for pairs of nodes …, and again we only need to compute … and ….
3. …: The analysis for these two cases is similar.

Thus, the only new entropies that need to be computed are those corresponding to the pairs of nodes of the form $(a, \cdot)$ or $(b, \cdot)$, and we need to compute at most two entropies for every such pair. Since there are at most $n_a + n_b$ such pairs, the total number of new entropies that need to be computed is at most $2(n_a + n_b)$.

Note that this theorem assumes that all the entropies required for forward selection in the current model are already computed, and hence it does not apply to the very first step. For example, if we start with the null model (empty model graph), then we need to compute … entropies to decide which edge to add in the first step.

5 Backward Selection

In this section, we briefly outline how to extend our data structures for doing backward selection, as might be required for a procedure which alternates between forward selection and backward selection. Details are deferred to the full version of the paper.

For efficient enumeration of edges eligible for deletion and update of the clique graph data structures, we need to make two changes:

1. A matrix of size $n \times n$, indexed by the vertices of the model, is maintained for enumeration of edges eligible for deletion. The entry in the matrix corresponding to a pair of nodes $(u, v)$ tells us whether the edge is present in the model, and if yes, whether it is eligible for deletion.
2. The binary index on the separators is augmented to keep the intersection sets for every possible pair of maximal cliques. This information is required to "re-insert" those edges in the clique graph that might have been deleted in the reverse step of adding this edge to the graph.

The algorithms for clique graph update described earlier have to be modified to maintain these data structures as well, but these updates can also be performed in $O(n^2)$ time during both backward selection and forward selection. Also, it can be proved that:

Theorem 5.1 The number of new entropies that need to be computed after deleting an edge from the underlying model graph (after performing one backward selection step) is at most … and those are all of the form …, for ….

6 Mixed Graphs

The results in this paper extend readily to the case of mixed graphs and strong decomposability, via the following theorem:

Theorem 6.1 [Lau96] Given a strongly decomposable mixed graph $G = (V, E)$, with $V = V_c \cup V_d$, where $V_c$ is the subset of vertices corresponding to the continuous variables and $V_d$ is the subset of vertices corresponding to the discrete variables, then the graph …, with …, is chordal.

7 Conclusions

Decomposable models possess several important characteristics that make them an appealing class of statistical models, as has been observed in applied contexts ranging from word sense disambiguation [BW94] to multi-dimensional histograms [DGR01]. Efficient algorithms, however, are essential if this class of models is to be exploited in large-scale problems. In this paper, we have presented an efficient new algorithm for performing stepwise selection in decomposable models. The enumeration of edges eligible for forward or backward selection can be a serious practical bottleneck in decomposable models, and the new algorithm is a significant improvement over the naive procedures that are currently used (cf. [Edw95]). Together with methods that allow rapid computation of sufficient statistics, such as the cube computation techniques developed in the database literature [A+96] and techniques that exploit data sparseness [MJ00], we feel that these algorithms allow decomposable models (and their extension to mixtures of decomposable models) to become an increasingly viable alternative in large-scale exploratory data analysis problems.
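As a concrete illustration of the entropy score analyzed in Section 4, the following Python sketch computes the empirical entropy of a subset of attributes from a table of records, the entropy of a decomposable model from its clique and separator lists, and the change in entropy caused by adding one edge. The change is computed under the assumption that the separator $S$ of the two endpoints is replaced by the clique $S \cup \{a, b\}$ together with the separators $S \cup \{a\}$ and $S \cup \{b\}$. The function names, the representation of data as tuples indexed by attribute position, and the use of the natural logarithm are all assumptions made for this sketch.

```python
from collections import Counter
from math import log


def entropy(rows, attrs):
    """Empirical entropy of the attributes at positions `attrs`.

    Computes -sum_x (n(x)/N) log(n(x)/N) over the observed value
    combinations x; unobserved combinations contribute zero.
    """
    counts = Counter(tuple(row[i] for i in attrs) for row in rows)
    total = len(rows)
    return -sum(c / total * log(c / total) for c in counts.values())


def model_entropy(rows, cliques, separators):
    """Sum of clique entropies minus sum of separator entropies."""
    return (sum(entropy(rows, sorted(c)) for c in cliques)
            - sum(entropy(rows, sorted(s)) for s in separators))


def entropy_change(rows, a, b, sep):
    """Change in model entropy when edge (a, b) with separator `sep` is added.

    Assumption: only the four local entropies below are affected.
    """
    s = set(sep)
    return (entropy(rows, sorted(s | {a, b})) + entropy(rows, sorted(s))
            - entropy(rows, sorted(s | {a})) - entropy(rows, sorted(s | {b})))


# Toy data: attribute 0 always equals attribute 1; attribute 2 is unrelated.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
null_model = model_entropy(rows, [{0}, {1}, {2}], [set(), set()])
with_01 = model_entropy(rows, [{0, 1}, {2}], [set()])
print(with_01 - null_model, entropy_change(rows, 0, 1, set()))
```

In this toy data set the two printed values agree, and adding the edge between the unrelated attributes 0 and 2 leaves the entropy unchanged, so a forward selection step would prefer the edge $(0, 1)$.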
Acknowledgements

This work was supported by CONTROL 442427-21389, ONR MURI N00014-00-1-637 and NSF grant IIS-9988642.

References

[A+96] Sameet Agarwal et al. On the computation of multidimensional aggregates. In …, 1996.
[AK93] R. Almond and A. Kong. Optimality issues in constructing a Markov tree from graphical models. Computational and Graphical Statistics, 1993.
[BFMY83] C. Beeri, R. Fagin, D. Maier, and M. Yannakakis. On the desirability of acyclic database schemes. Journal of the ACM, 1983.
[BW94] R. Bruce and J. Wiebe. Word-sense disambiguation using decomposable models. In ACL, Las Cruces, NM, 1994.
[DGR01] A. Deshpande, M. Garofalakis, and R. Rastogi. Independence is Good: Dependency-Based Histogram Synopses for High-Dimensional Data. In SIGMOD, May 2001.
[Edw95] D. Edwards. Introduction to Graphical Modeling. Springer-Verlag, New York, NY, 1995.
[FL89] M. Frydenberg and S. L. Lauritzen. Decomposition of maximum likelihood in mixed interaction models. Biometrika, 76:539–555, 1989.
[GG99] P. Giudici and P. J. Green. Decomposable graphical Gaussian model determination. Biometrika, 1999.
[GHP95] P. Galinier, M. Habib, and C. Paul. Chordal graphs and their clique graphs. In …, 1995.
[Hec98] D. Heckerman. A tutorial on learning with Bayesian networks. In M. I. Jordan, editor, Learning in Graphical Models, pages 301–354. MIT Press, Cambridge, MA, 1998.
[Hin99] G. E. Hinton. Products of experts. In Proceedings of the Ninth International Conference on Artificial Neural Networks, 1999.
[JJ94] Finn V. Jensen and Frank Jensen. Optimal junction trees. In …, 1994.
[Lau96] S. L. Lauritzen. Graphical Models. Clarendon Press, Oxford, 1996.
[Mal91] Francesco M. Malvestuto. Approximating Discrete Probability Distributions with Decomposable Models. IEEE Transactions on Systems, Man, and Cybernetics, 1991.
[MJ00] M. Meila and M. I. Jordan. Learning with mixtures of trees. Journal of Machine Learning Research, 2000.
[PPL97] S. Della Pietra, V. Della Pietra, and J. Lafferty. Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19:380–393, 1997.
[Wer76] Nanny Wermuth. Model search among multiplicative models. Biometrics, 1976.
[ZWM97] S. C. Zhu, Y. N. Wu, and D. Mumford. Minimax entropy principle and its application to texture modeling. Neural Computation, 1997.

A Proof of Theorem 2.2

The proof of correctness of Theorem 2.2 involves two parts: (i) proving that the graph created after adding such an edge is chordal, (ii) proving that for any chordal graph $G' = (V, E')$ with $E \subset E'$, $G'$ can be obtained starting with $G$ by repeated application of this theorem. (Recently, [GG99] independently derived a characterization for the edges that can be added for forward selection in terms of a junction tree of the model. Their characterization can be shown to be equivalent to our characterization.)

Before we proceed with this proof, we will need some definitions:

Definition: An elimination order on a graph is a permutation of its vertices.

Definition: Given an elimination order $v_1, \ldots, v_n$, an elimination scheme for this order is as follows: Starting with $v_1$, remove $v_1$ from the graph and add all possible edges between the neighbors of $v_1$. Continue this with $v_2$ and so on until $v_n$. It can be proved that the graph obtained by adding all these additional edges to $G$ is a triangulated (chordal) graph.

Definition: A perfect elimination order for a chordal graph is an elimination order which does not add any edges to the graph.

Lemma A.1 A graph is chordal iff it has a perfect elimination order.

Lemma A.2 The following search algorithm, called Lexicographic Breadth-First Search (LexBFS), finds a perfect elimination order when applied to a chordal graph.

The algorithm constructs the reverse elimination order, starting from the last vertex in the order, which is chosen arbitrarily. At any step, the algorithm chooses a vertex among the remaining vertices that has the highest ordered neighbor. In case of ties, the next highest ordered neighbor is checked and so on (hence the name lexicographic). Any remaining ties are broken arbitrarily.

Given these results, we can proceed with our proofs:

Theorem A.1 The graph constructed by adding an edge between two nodes $a$ and $b$ that satisfy the properties described in Theorem 2.2 for some chordal graph $G$ is chordal.

Proof: We show that there exists an elimination order that can be generated by the LexBFS algorithm and does not require addition of any edges (and hence, the new graph is chordal). Let the original graph be $G$, the new graph be $G'$, and $S$ the separator between $a$ and $b$. Note that $S$ must be a clique. Also, let the neighbors of
… be …, where $n_1, \ldots, n_k$ are the neighbors of … not connected to …. Now consider the following partial elimination order for …: …. We claim that this order can be achieved by the LexBFS algorithm, starting with …, if the ties are broken correctly and hence, it is part of a perfect elimination order. The node just preceding … can be any neighbor of … and hence we can choose it to be …. Now note that … is connected to both … and … and as such, we can break the next tie in its favor irrespective of the rest of the nodes, and so on for …. Now, after this is done, the rest of the neighbors of … will follow in some order (this order is not relevant to us and without loss of generality, we can assume it to be …). This partial elimination order can be replaced by … without compromising the "perfect-ness" of the elimination order, since it does not add any new edges unless the earlier elimination order is not perfect. Finally, this elimination order is perfect for $G'$ as well and hence, $G'$ is chordal.

Theorem A.2 Given two chordal graphs, $G = (V, E)$ and $G' = (V, E')$, with $E \subset E'$, let …, it is possible to reach $G'$ from $G$ by repeated application of Theorem 2.2.

Proof: This theorem follows from the proof that the backward selection procedure is exhaustive [FL89].

B Proof of Correctness of the Clique Graph Update Algorithm

For simplicity, we again assume that $C_a$ and $C_b$ are not subsets of $C_{ab}$.

B.1 Correctness of Step 2:

Theorem B.1 Given an edge $(C_1, C_2)$ with separator $S_{12}$, if $S_{12} \neq S_{ab}$, then this edge exists in $CG'$.

Proof: Consider the graph $G \setminus S_{12}$ (Figure 4). Since $S_{12}$ is a separator for node sets $C_1 \setminus S_{12}$ and $C_2 \setminus S_{12}$, there will be at least two connected components in this graph. Let $V_1$ consist of all the nodes reachable from some node in $C_1 \setminus S_{12}$, let $V_2$ consist of the nodes reachable from some node in $C_2 \setminus S_{12}$, and let $V_3$ consist of the rest of the nodes. So … and …. Clearly, unless … and … and …, the addition of edge $(a, b)$ does not affect the edge $(C_1, C_2)$.

[Figure 4: Separation of …]

Now, assume … and …. In that case, $S_{12}$ separates $a$ and $b$ and hence … (this follows since $S_{ab}$ is the minimal separator for the nodes $a$ and $b$). But, since … is reachable from … and … is reachable from …, … separates … and … only if … also separates … and …. Hence, …. Therefore, $S_{12} = S_{ab}$, which contradicts our assumption.
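The reachability bookkeeping behind Steps 1 and 2 of the update algorithm (Section 3.2.1) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the data structure of the paper: cliques are represented as frozensets, the two "arrays" are plain sets of reachable vertices, and a clique is taken to be "connected to $a$" when one of its vertices outside the separator is reachable from $a$ once the separator is removed. The set operations here are not constant time, unlike the array lookups described in the text. The names `reachable` and `keep_edge` are hypothetical.

```python
from collections import deque


def reachable(adj, start, removed):
    """Vertices reachable from `start` after deleting the set `removed`."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in removed and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def keep_edge(c1, c2, s_ab, reach_a, reach_b):
    """Decide whether the clique graph edge (c1, c2) survives adding (a, b).

    An edge whose separator differs from s_ab is always kept. Otherwise it
    is dropped exactly when one clique lies on a's side and the other on
    b's side of the separator.
    """
    if c1 & c2 != s_ab:
        return True
    c1_a = bool((c1 - s_ab) & reach_a)
    c1_b = bool((c1 - s_ab) & reach_b)
    c2_a = bool((c2 - s_ab) & reach_a)
    c2_b = bool((c2 - s_ab) & reach_b)
    return not ((c1_a and c2_b) or (c1_b and c2_a))


# Example graph with edges 1-2, 2-3, 2-4, 4-5; the new edge is (1, 3).
adj = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2, 5}, 5: {4}}
s_ab = frozenset({2})
reach_a = reachable(adj, 1, s_ab)
reach_b = reachable(adj, 3, s_ab)
for c1, c2 in [({1, 2}, {2, 3}), ({1, 2}, {2, 4}), ({2, 4}, {4, 5})]:
    print(sorted(c1), sorted(c2),
          keep_edge(frozenset(c1), frozenset(c2), s_ab, reach_a, reach_b))
```

In the example, only the clique graph edge between $\{1, 2\}$ and $\{2, 3\}$ is dropped: its separator equals $\{2\}$ and its endpoints lie on opposite sides of that separator, so after the edge $(1, 3)$ is added the separator no longer separates them.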
Theorem B.2 Given an edge $(C_1, C_2)$ with separator $S_{12}$, if $S_{12} = S_{ab}$, then this edge is not in $CG'$ only if $C_1$ is connected to $a$ and $C_2$ is connected to $b$, or vice versa.

Proof: Let $V_1$, $V_2$ and $V_3$ be defined as in the preceding proof. This theorem follows from the observation that unless … and … (or vice versa), the addition of the edge $(a, b)$ does not have any effect on separability of … and ….

B.2 Correctness of Step 3:

We use the following property of clique graphs in proving correctness of this step:

Lemma B.1 [GHP95] Given a clique graph $CG$, let … be such that …, then: …

The following theorem completes the proof of correctness for this step:

Theorem B.3 If … and …, then one of the following is true: … and … and …