Uploaded On 2015-05-25




Efficient Stepwise Selection in Decomposable Models

Amol Deshpande, Computer Science Department, University of California, Berkeley, CA 94720, amol@cs.berkeley.edu
Minos Garofalakis, Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, minos@research.bell-labs.com
Michael I. Jordan, Computer Science & Statistics, University of California, Berkeley, CA 94720, jordan@cs.berkeley.edu

Footnote: Part of the work was done while the author was visiting Bell Laboratories.

Abstract

In this paper, we present an efficient algorithm for performing stepwise selection in the class of decomposable models. We focus on the forward selection procedure, but we also discuss how backward selection and the combination of the two can be performed efficiently. The main contributions of this paper are (1) a simple characterization for the edges that can be added to a decomposable model while retaining its decomposability and (2) an efficient algorithm for enumerating all such edges for a given decomposable model in $O(n^2)$ time, where $n$ is the number of variables in the model. We also analyze the complexity of the overall stepwise selection procedure (which includes the complexity of enumerating eligible edges as well as the complexity of deciding how to "progress"). We use the KL divergence of the model from the saturated model as our metric, but the results we present here extend to many other metrics as well.

1 Introduction

Undirected graphical models have become increasingly popular in areas such as information retrieval, statistical natural language processing, and vision, where they are often referred to as maximum entropy models, and are viewed as having various representational and statistical advantages. New tools for model selection and parameter estimation are being developed by researchers in these areas [PPL97, Hin99, ZWM97]. General undirected models, however, have some serious disadvantages; in particular, they require an invocation of Iterative Proportional Fitting (or related iterative algorithms) to find maximum likelihood estimates, even in the case of fully-observed graphs. As the inner loop of more general parameter estimation or model selection procedures (e.g., the M step of an EM algorithm), these iterative algorithms can impose serious bottlenecks.

Decomposable models are a restricted family of undirected graphical models that have a number of appealing features: (1) maximum likelihood estimates can be calculated analytically from marginal probabilities, obviating the need for Iterative Proportional Fitting, (2) closed form expressions for test statistics can be found, and (3) there are several useful links to directed models (every decomposable model has a representation as either an undirected or a directed model), inference algorithms (decomposable models are equivalent to triangulated graphs), and acyclic database schemes [BFMY83]. Decomposable models would therefore seem to provide a usefully constrained representation in which model selection and parameter estimation methods can be deployed in large-scale problems. Moreover, mixtures of decomposable models provide a natural upgrade path if the representational strictures of decomposable models are considered too severe.

Although decomposable models are a subclass of undirected graphical models, the problem of finding the optimal decomposable model for a given data sample is known to be intractable and heuristic search techniques are generally used [Edw95]. Most procedures are based on some combination of (i) forward selection, in which we start with a small model and add edges as long as an appropriate score function increases [Hec98], and (ii) backward selection, where, starting with a larger model, edges are deleted from the model. Since the intervening models encountered in the search must also be decomposable, care must be taken such that deletion or addition of edges does not result in a non-decomposable model. Backward selection procedures for decomposable models are well known in the literature [Wer76, Lau96], but efficient forward selection procedures have not yet been developed. One of the goals of the current paper is to fill this gap.

This is a theoretical paper that makes two main contributions. First, we provide a simple characterization of the edges that can be added to a decomposable model (or, equivalently, the chordal graph corresponding to the model) while resulting in another decomposable model. Second, based on this characterization we present an efficient algorithm for enumerating all such edges for the current model in $O(n^2)$ time, where $n$ is the number of attributes. We provide a careful analysis of the running time complexity of the overall forward selection procedure, including the time taken for choosing which of the eligible edges to add to the current model. We use the minimization of KL divergence as our metric, but the results we present can be extended to any other locally computable metric (e.g., [GG99]). Though our main focus is the new forward selection procedure, we also show that the algorithms are easily extended to backward selection or to a combination of forward and backward selection. The techniques and data structures we propose also naturally extend to the problem of finding strongly decomposable models in mixed graphs.

The remainder of the paper is organized as follows. In Section 2, we derive a simple characterization for the edges that can be added to a chordal graph while maintaining its chordality. In Section 3, we describe the data structure that we use for finding such edges efficiently and discuss how it is maintained in the presence of additions to the underlying model graph. In Section 4, we analyze the overall complexity of the stepwise selection procedures for the KL divergence metric. In Section 5, we briefly discuss how the data structures can be extended for doing backward selection, as might be required for a procedure that alternates between forward and backward selection. In Section 6, we discuss how these algorithms can be extended for the case of mixed graphs and strong decomposability. We conclude with Section 7.
[Figure 1: Structure of the subgraph induced by $a$, $b$ and $S$.]

2 Characterizing Edges Eligible for Stepwise Selection

There is a classical characterization of the edges that can be deleted from a chordal graph such that the resulting graph remains chordal:

Theorem 2.1 [Wer76, Lau96] Given a chordal graph, an edge can be deleted from the graph while maintaining its chordality iff the edge belongs to exactly one of the maximal cliques of the graph.

To complement this result, we propose the following characterization of the edges that can be added to a chordal graph without violating its chordality (the proof is given in Appendix A):

Theorem 2.2 Given a chordal graph $G = (V, E)$, an edge $(a, b) \notin E$ can be added to the graph while maintaining its chordality iff it satisfies the following properties:
1. There exists a subset of nodes $S \subseteq V$ such that $a$ and $b$ are connected to all vertices in $S$.
2. The set $S$ is the minimal separator for $a$ and $b$ (note that, since $G$ is chordal, this implies that $S$ is a clique); i.e., removing the nodes in $S$ and all their incident edges from $G$ separates $a$ and $b$, and no proper subset of $S$ has this property.

3 Enumerating Eligible Edges for Forward Selection

In this section, we describe how to enumerate all edges that can be added to the current chordal graph while maintaining chordality. For this purpose, we maintain two auxiliary data structures: (i) a clique graph [GHP95] corresponding to the current chordal graph, and (ii) a boolean matrix, indexed by the attributes of the data, which maintains information about whether a pair of nodes is eligible for addition. Clearly, given this matrix, we can enumerate the edges eligible for forward selection in $O(n^2)$ time, where $n$ is the number of attributes. The clique graph is required to update the matrix in $O(n^2)$ time when an edge is added to the underlying graph, as we will see later in the section.

Footnote: For simplicity, we use the terms decomposable model and chordal graph interchangeably.

3.1 Clique Graph

3.1.1 Definition and Properties

Definition: A clique graph of a chordal graph $G = (V, E)$, denoted by $CG$, has the maximal cliques of the chordal graph as its vertices, and has the property that given any two maximal cliques, $C_1$ and $C_2$, there is an edge between these two cliques iff $C_1 \cap C_2$ separates the node sets $C_1 \setminus C_2$ and $C_2 \setminus C_1$.

Note that this usage of the term "clique graph" is non-standard; we are following the terminology of [GHP95]. We will note some properties of clique graphs.

Lemma 3.1 Two nodes $a, b$ with $(a, b) \notin E$ satisfy the forward selection characterization (Theorem 2.2) iff there exists an edge $(C_1, C_2)$ in $CG$ such that $a \in C_1$ and $b \in C_2$.

Lemma 3.2 [GHP95] A maximum spanning tree of a clique graph of a chordal graph, where the weight of an edge is the size of the intersection of the two cliques it joins, is a junction tree of the chordal graph.

Lemma 3.3 The number of nodes in a clique graph of a chordal graph $G = (V, E)$ is at most $|V|$.

3.2 Updating the Clique Graph

After all possible edges are enumerated and one edge is chosen based on a model selection criterion, we need to update the clique graph to reflect the addition of this new edge to the underlying chordal graph. In this section, we show how this can be done in $O(n^2)$ time.
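The characterization of Theorem 2.2 can be sanity-checked on a small example. The Python sketch below is not the clique-graph algorithm of this section; it is a brute-force check under stated assumptions. It uses the observation that if a set $S$ satisfying Theorem 2.2 exists, it must be exactly the set of common neighbors of $a$ and $b$, so the test reduces to asking whether the common neighbors separate $a$ from $b$. The function names and the toy graph are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_chordal(adj):
    """Chordality test by repeatedly removing a simplicial vertex.

    A vertex is simplicial if its neighbors form a clique. A graph is
    chordal iff all vertices can be removed this way.
    """
    g = {v: set(ns) for v, ns in adj.items()}
    while g:
        simplicial = next(
            (v for v, ns in g.items()
             if all(y in g[x] for x, y in combinations(ns, 2))),
            None,
        )
        if simplicial is None:
            return False
        for u in g.pop(simplicial):
            g[u].discard(simplicial)
    return True


def separates(adj, sep, a, b):
    """Return True if removing the nodes in `sep` disconnects a from b."""
    seen, stack = {a}, [a]
    while stack:
        for w in adj[stack.pop()]:
            if w not in sep and w not in seen:
                seen.add(w)
                stack.append(w)
    return b not in seen


def eligible_by_theorem(adj):
    """Non-edges (a, b) whose common neighbors separate a from b."""
    return {
        (a, b)
        for a, b in combinations(sorted(adj), 2)
        if b not in adj[a] and separates(adj, adj[a] & adj[b], a, b)
    }


def eligible_by_brute_force(adj):
    """Non-edges whose addition leaves the graph chordal."""
    result = set()
    for a, b in combinations(sorted(adj), 2):
        if b in adj[a]:
            continue
        trial = {v: set(ns) for v, ns in adj.items()}
        trial[a].add(b)
        trial[b].add(a)
        if is_chordal(trial):
            result.add((a, b))
    return result


# Toy chordal graph (an assumption for illustration): a triangle 0-1-2,
# a path 2-3-4 hanging off it, and an isolated vertex 5.
GRAPH = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}, 5: set()}
```

On this graph the two enumerations agree; for instance, $(0, 3)$ is eligible with separator $\{2\}$, while $(0, 4)$ is not, since adding it would close the chordless cycle 0-2-3-4.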
Footnote: Note also that the clique graph of a chordal graph is equivalent to an Almond Junction Tree [AK93, JJ94] of the graph; both of these structures can be seen as compact representations of all possible junction trees of a chordal graph.

3.2.1 Update Algorithm

Let $G = (V, E)$ be the original chordal graph and let $CG$ be the clique graph. Let the new edge that is added be $(a, b)$. Also, let $(C_a, C_b)$ be the corresponding edge in the clique graph (the edge from which this pair of nodes was obtained) and let $S_{ab} = C_a \cap C_b$. Finally, let $G'$ and $CG'$ be the new model and clique graphs.

Addition of the edge $(a, b)$ creates a new maximal clique $C_{ab} = S_{ab} \cup \{a, b\}$, as shown in Figure 2. It is possible that $C_a \subset C_{ab}$ (or $C_b \subset C_{ab}$), in which case $C_a$ ($C_b$) will not be a maximal clique in the new chordal graph. We assume for now that this does not happen. In that case, adding the edge $(a, b)$ to $G$ results in a partial clique graph structure as shown in Figure 2. Note that the new clique $C_{ab}$ will be connected to $C_a$ and $C_b$.

[Figure 2: Partial structure of the clique graph after adding the edge $(a, b)$.]

After this node has been created and added to the clique graph, the update algorithm has to delete those edges from the clique graph which do not satisfy the clique graph property (Section 3.1.1). It is easy to see that no new edges between already existing maximal cliques in the clique graph will have to be added. Given this, the update algorithm is as follows (see Appendix B for the correctness proof):

1. Let $G \setminus S_{ab}$ denote the graph obtained from $G$ by removing the nodes in $S_{ab}$. Find all nodes that are connected to $a$ and to $b$ in this graph, and maintain this information in two arrays indexed by the vertices of $V$.
2. Deciding whether to keep an edge: Let $(C_1, C_2)$ be an edge of $CG$ with separator $S_{12} = C_1 \cap C_2$. If $S_{12} \neq S_{ab}$, then keep this edge. Otherwise, consider the graph $G \setminus S_{ab}$. If $C_1 \setminus S_{ab}$ is connected to $a$ in this graph and $C_2 \setminus S_{ab}$ is connected to $b$ (or vice versa), do not keep this edge in $CG'$. Otherwise, keep it. This check can be performed in $O(1)$ time using the arrays constructed in Step 1.
3. Adding edges involving $C_{ab}$:
   1. For every maximal clique $C$ such that $(C, C_a)$ is an edge of $CG'$ (after execution of the above step), add an edge $(C, C_{ab})$ to $CG'$ if […].
   2. For every maximal clique $C$ such that […], explicitly check if this edge should be added to the clique graph. This requires checking whether […] is separated from […], which can be done by looking up the arrays computed in Step 1.
   3. Repeat the above two steps for $b$ and $C_b$ instead of $a$ and $C_a$.
   4. Remove $C_a$ ($C_b$) if it is contained in $C_{ab}$.

Figure 3: Complete Forward Selection Algorithm (steps whose content did not survive are marked […]; surviving step numbers are kept)

Input: $G = (V, E)$, $CG$: current model & clique graphs
Output: $G' = (V, E')$, $CG'$: new model & clique graphs
begin
1. using the boolean matrix, choose which of the edges to add to $G$
2. let $(a, b)$ be the new edge to be added, let $(C_a, C_b)$ be the corresponding edge in $CG$ and let $S_{ab} = C_a \cap C_b$
   let […]. Find all nodes connected to $a$ and to $b$ and maintain this information in two arrays indexed by the vertices of […]
   […] $(V, E$ […]
6. delete $(C_a, C_b)$ from $CG'$, add a new node $C_{ab}$ and add edges $(C_a, C_{ab})$ and $(C_{ab}, C_b)$
   for all edges $(C_1, C_2)$ […] equal to $S_{ab}$
      if […] is connected to $a$ and […] is connected to $b$, or vice versa
10.      delete the edge $(C_1, C_2)$ from $CG'$
   for all $C$ s.t. […]
      […]
14.      add the edge $(C, C_{ab})$ to $CG'$
   for all $C$ s.t. […]
16.      check if […] separates […] from […]
17.      if yes, add the edge $(C, C_{ab})$ to $CG'$
18. repeat Steps 11-17 for $b$ and $C_b$ instead of $a$ and $C_a$
19. remove $C_a$ (similarly, $C_b$) from $CG'$ […]
20. update the boolean matrix as described in Section 3.3

3.2.2 Complexity Analysis

It is easy to see that Steps 1 and 3 can be performed in $O(n^2)$ time, but comparison of two sets to check if they are equal is $O(n)$ in general. As such, the complexity of Step 2 might seem to be $O(n^3)$. To perform this step efficiently, we maintain a binary index on the minimal separators of the current model graph. A leaf of this index contains a list of pointers to all the edges that have associated with them the minimal separator corresponding to the leaf. Now, given $S_{ab}$, we can find all the edges with separator equal to $S_{ab}$ in $O(n)$ time (since the height of the index is $n$). As we mentioned in Step 2, we only need to check such edges since the rest of the edges remain untouched. The deletions, if required, can be performed in $O(1)$ time per deletion. Thus, given this data structure, Step 2 can be performed in time $O(n^2)$ (actually, in order […] time, where […] is the number of edges in $CG$ with separator $S_{ab}$). Insertions in this index can be done in $O(n)$ time, by adding a pointer to the new edge at the front of the list of pointers. Since there can be at most $O(n)$ insertions to this index in Step 3, the complexity of Step 3 remains $O(n^2)$.

Footnote: In many cases, model search is performed among models with a restricted maximum clique size. In that case, set comparisons take constant time and the data structure we describe next is not required.

3.3 Updating the Boolean Matrix

The boolean matrix is updated as follows:

1. Set the entry $(a, b)$ to false, since this edge is not eligible for addition anymore.
2. Since some edges are deleted from the clique graph, the corresponding pairs of nodes that might have been eligible for addition before may not be eligible anymore. Noting that all such deleted clique graph edges must have the separator $S_{ab}$, we set to false the entries corresponding to pairs $(x, y)$ such that $x$ is connected to $a$ and $y$ is connected to $b$. This can be done by using the arrays computed during the clique graph update algorithm in $O(n^2)$ time.
3. Some edges of the form $(a, x)$ or $(b, x)$ may now be eligible for addition. To set the corresponding entries to true, for every edge $(C, C_{ab})$ that is added to the clique graph, set to true the entries corresponding to […]. Similarly for […] (please refer to Theorem 4.3 for the proof that these include all the edges newly eligible for addition).

Thus, the matrix can also be updated in $O(n^2)$ time.

4 Complexity of Overall Forward Selection Procedure

The overall forward selection procedure has two parts:
1. Enumerating all edges that can be added to the current model.
2. Deciding which of the eligible edges to add to the model for "forward progress".

Thus far, we have presented an $O(n^2)$ solution to the first problem, giving us an overall complexity of $O(kn^2)$ if a total of $k$ edges are added starting with the null model. In this section, we analyze the complexity of the second step with the aim of minimizing KL divergence from the saturated model, which is equivalent to maximum likelihood estimation.

Theorem 4.1 [Mal91] Minimizing the KL divergence of a given model $M$ from the saturated model is equivalent to minimizing the entropy of the model, which is defined as:

$$H(M) = \sum_{C \in \mathcal{C}} H(C) - \sum_{S \in \mathcal{S}} H(S)$$

where $\mathcal{C}$ is the set of maximal cliques of the model, and $\mathcal{S}$ is the set of separators for a junction tree of the model. The entropy of a subset of attributes, $A = (A_1, \ldots, A_k)$, is defined to be:

$$H(A) = -\sum_{x \in dom(A_1) \times \cdots \times dom(A_k)} \frac{n(x)}{N} \log \frac{n(x)}{N}$$

where $dom(A_i)$ denotes the domain of attribute $A_i$, $n(x)$ denotes the number of data items which contain $x$, and $N$ is the total number of data points.

Thus, the only information we need from a decomposable model, to compute its divergence from the saturated model, is a list of its maximal cliques and a list of separators in some junction tree for the model. Many metrics for graphical models have a similar property, and this property can be used to avoid recomputation of a junction tree for the new model obtained after an incremental change to the current model:

Theorem 4.2 If two decomposable models $M$ and $M'$ differ only in one edge $(a, b)$ (i.e., $(a, b) \notin M$ and $(a, b) \in M'$), then the maximal cliques and the minimal separators in junction trees corresponding to the two models, $(\mathcal{C}, \mathcal{S})$ and $(\mathcal{C}', \mathcal{S}')$, where $\mathcal{C}, \mathcal{C}'$ are the lists of maximal cliques and $\mathcal{S}, \mathcal{S}'$ are the lists of separators, differ as follows (with $S_{ab}$ the minimal separator of $a$ and $b$ in $M$, $C_{ab} = S_{ab} \cup \{a, b\}$, and $C_a$, $C_b$ as in Section 3.2.1):
1. If $C_a \not\subset C_{ab}$ and $C_b \not\subset C_{ab}$, then $\mathcal{C}' = \mathcal{C} + C_{ab}$ and $\mathcal{S}' = \mathcal{S} + (S_{ab} \cup \{a\}) + (S_{ab} \cup \{b\}) - S_{ab}$.
2. If $C_a \subset C_{ab}$ and $C_b \not\subset C_{ab}$, then $\mathcal{C}' = \mathcal{C} + C_{ab} - C_a$ and $\mathcal{S}' = \mathcal{S} + (S_{ab} \cup \{b\}) - S_{ab}$.
3. If $C_a \not\subset C_{ab}$ and $C_b \subset C_{ab}$, then $\mathcal{C}' = \mathcal{C} + C_{ab} - C_b$ and $\mathcal{S}' = \mathcal{S} + (S_{ab} \cup \{a\}) - S_{ab}$.
4. If $C_a \subset C_{ab}$ and $C_b \subset C_{ab}$, then $\mathcal{C}' = \mathcal{C} + C_{ab} - C_a - C_b$ and $\mathcal{S}' = \mathcal{S} - S_{ab}$.

This immediately gives us the following corollary:

Corollary 4.1 If two decomposable models $M$ and $M'$ differ only in one edge $(a, b)$ (i.e., $(a, b) \notin M$ and $(a, b) \in M'$), then

$$H(M') - H(M) = H(S_{ab} \cup \{a, b\}) + H(S_{ab}) - H(S_{ab} \cup \{a\}) - H(S_{ab} \cup \{b\})$$

where $S_{ab}$ is the minimal separator of $a$ and $b$ in $M$.

Thus the change in the entropy of the model after adding (or deleting) an edge depends only on the minimal separator of the two vertices in the model graph and, as such, this information can be pre-computed for all pairs of nodes that are eligible for forward selection. During the forward selection procedure, these "changes" can be associated with the corresponding edge in the clique graph. This also means that after adding a new edge to the model graph, we only have to compute new entropies corresponding to the new edges that are added to the clique graph. In the next theorem, we bound the total number of entropies that have to be computed after addition of an edge to the model graph:

Theorem 4.3 The number of new entropies that need to be computed after adding the edge $(a, b)$ to the underlying model graph (after performing one forward selection step) is at most $2(n_a + n_b)$, where $n_a$ and $n_b$ are the number of neighbors of $a$ and $b$ respectively.

Proof: Any new edge that is added to the clique graph must have $C_{ab}$ as one of its endpoints. Consider one such edge, $(C_{ab}, C)$, with associated separator $S$. This edge must satisfy one of the following properties (cf. Theorem B.3):
- […]: In this case, the pairs of nodes […] are (possibly newly) eligible for forward selection. Each such edge requires computation of […]. Out of these, […] have already been computed and we only have to compute […] and […].
- […]: The edge […] belongs to $CG$ and also to $CG'$. This implies that pairs of nodes […] were eligible for addition in $G$ as well and, as such, the entropies needed for these pairs have already been computed. Thus, the only new entropies that need to be computed are for pairs of nodes […], and again we only need to compute […] and […].
- […]: The analysis for these two cases is similar.

Thus, the only new entropies that need to be computed are those corresponding to the pairs of nodes of the form $(a, x)$ or $(b, x)$, and we need to compute at most two entropies for every such pair. Since there are at most […] such pairs, the total number of new entropies that need to be computed is at most $2(n_a + n_b)$.

Note that this theorem assumes that all the entropies required for forward selection in the current model are already computed, and hence it does not apply to the very first step. For example, if we start with the null model (empty model graph), then we need to compute $O(n^2)$ entropies to decide which edge to add in the first step.

5 Backward Selection

In this section, we briefly outline how to extend our data structures for doing backward selection, as might be required for a procedure which alternates between forward selection and backward selection. Details are deferred to the full version of the paper.

For efficient enumeration of edges eligible for deletion and update of the clique graph data structures, we need to make two changes:
- A matrix of size $n \times n$, indexed by the vertices of the model, is maintained for enumeration of edges eligible for deletion. The entry in the matrix corresponding to a pair of nodes $(u, v)$ tells us whether the edge is present in the model, and if yes, whether it is eligible for deletion.
- The binary index on the separators is augmented to keep the intersection sets for every possible pair of maximal cliques. This information is required to "re-insert" those edges in the clique graph that might have been deleted in the reverse step of adding this edge to the graph.

The algorithms for clique graph update described earlier have to be modified to maintain these data structures as well, but these updates can also be performed in $O(n^2)$ time during both backward selection and forward selection. Also, it can be proved that:

Theorem 5.1 The number of new entropies that need to be computed after deleting an edge from the underlying model graph (after performing one backward selection step) is at most $O(1)$, and those are all of the form […], for […].

6 Mixed Graphs

The results in this paper extend readily to the case of mixed graphs and strong decomposability, via the following theorem:

Theorem 6.1 [Lau96] Given a strongly decomposable mixed graph $G = (V, E)$, with $V = V_d \cup V_c$, where $V_c$ is the subset of vertices corresponding to the continuous variables and $V_d$ is the subset of vertices corresponding to the discrete variables, the graph $G^* = (V \cup \{*\}, E^*)$, with $E^* = E \cup \{(*, v) : v \in V_d\}$, is chordal.

7 Conclusions

Decomposable models possess several important characteristics that make them an appealing class of statistical models, as has been observed in applied contexts ranging from word sense disambiguation [BW94] to multi-dimensional histograms [DGR01]. Efficient algorithms, however, are essential if this class of models is to be exploited in large-scale problems. In this paper, we have presented an efficient new algorithm for performing stepwise selection in decomposable models. The enumeration of edges eligible for forward or backward selection can be a serious practical bottleneck in decomposable models, and the new algorithm is a significant improvement over the naive procedures that are currently used (cf. [Edw95]). Together with methods that allow rapid computation of sufficient statistics, such as the cube computation techniques developed in the database literature [A+96] and techniques that exploit data sparseness [MJ00], we feel that these algorithms allow decomposable models (and their extension to mixtures of decomposable models) to become an increasingly viable alternative in large-scale exploratory data analysis problems.

Acknowledgements

This work was supported by CONTROL 442427-21389, ONR MURI N00014-00-1-637 and NSF grant IIS-9988642.

References

[A+96] Sameet Agarwal et al. On the computation of multidimensional aggregates. In […], 1996.
[AK93] R. Almond and A. Kong. Optimality issues in constructing a Markov tree from graphical models. Computational and Graphical Statistics, 1993.
[BFMY83] C. Beeri, R. Fagin, D. Maier, and M. Yannakakis. On the desirability of acyclic database schemes. Journal of the ACM, 1983.
[BW94] R. Bruce and J. Wiebe. Word-sense disambiguation using decomposable models. In ACL, Las Cruces, NM, 1994.
[DGR01] A. Deshpande, M. Garofalakis, and R. Rastogi. Independence is Good: Dependency-Based Histogram Synopses for High-Dimensional Data. In SIGMOD, May 2001.
[Edw95] D. Edwards. Introduction to Graphical Modeling. Springer-Verlag, New York, NY, 1995.
[FL89] M. Frydenberg and S. L. Lauritzen. Decomposition of maximum likelihood in mixed interaction models. Biometrika, 76:539-555, 1989.
[GG99] P. Giudici and P. J. Green. Decomposable graphical Gaussian model determination. Biometrika, 1999.
[GHP95] P. Galinier, M. Habib, and C. Paul. Chordal graphs and their clique graphs. In […], 1995.
[Hec98] D. Heckerman. A tutorial on learning with Bayesian networks. In M. I. Jordan, editor, Learning in Graphical Models, pages 301-354. MIT Press, Cambridge, MA, 1998.
[Hin99] G. E. Hinton. Products of experts. In Proceedings of the Ninth International Conference on Artificial Neural Networks, 1999.
[JJ94] Finn V. Jensen and Frank Jensen. Optimal junction trees. In […], 1994.
[Lau96] S. L. Lauritzen. Graphical Models. Clarendon Press, Oxford, 1996.
[Mal91] Francesco M. Malvestuto. Approximating Discrete Probability Distributions with Decomposable Models. IEEE Transactions on Systems, Man, and Cybernetics, 1991.
[MJ00] M. Meila and M. I. Jordan. Learning with mixtures of trees. Journal of Machine Learning Research, 2000.
[PPL97] S. Della Pietra, V. Della Pietra, and J. Lafferty. Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19:380-393, 1997.
[Wer76] Nanny Wermuth. Model search among multiplicative models. Biometrics, 1976.
[ZWM97] S. C. Zhu, Y. N. Wu, and D. Mumford. Minimax entropy principle and its application to texture modeling. Neural Computation, 1997.

A Proof of Theorem 2.2

The proof of correctness of Theorem 2.2 involves two parts: (i) proving that the graph created after adding such an edge is chordal, and (ii) proving that, for any chordal graph $G = (V, E)$, any chordal graph $G' = (V, E')$ with $E \subset E'$ can be obtained starting with $G$ by repeated application of this theorem. Before we proceed with this proof, we will need some definitions:

Definition: An elimination order on a graph is a permutation of its vertices.

Definition: Given an elimination order $v_1, \ldots, v_n$, the elimination scheme for this order is as follows: Starting with $v_1$, remove $v_1$ from the graph and add all possible edges between the neighbors of $v_1$. Continue this with $v_2$ and so on until $v_n$. It can be proved that the graph obtained by adding all these additional edges to $G$ is a triangulated (chordal) graph.

Definition: A perfect elimination order for a chordal graph is an elimination order which does not add any edges to the graph.

Lemma A.1 A graph is chordal iff it has a perfect elimination order.

Lemma A.2 The following search algorithm, called Lexicographic Breadth-First Search (LexBFS), finds a perfect elimination order when applied to a chordal graph.

The algorithm constructs the reverse elimination order, starting from the last vertex in the order, which is chosen arbitrarily. At any step, the algorithm chooses a vertex among the remaining vertices that has the highest ordered neighbor. In case of ties, the next highest ordered neighbor is checked and so on (hence the name lexicographic). Any remaining ties are broken arbitrarily.
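The search described in Lemma A.2 can be sketched in a few lines of Python. This is a simple label-based version written for clarity rather than for the linear running time that careful implementations achieve; the function names and the two example graphs are assumptions made for illustration. Each unvisited vertex keeps the list of positions of its already-visited neighbors, and the vertex with the lexicographically largest list is visited next.

```python
from itertools import combinations


def lex_bfs(adj):
    """Return the vertices in LexBFS visit order.

    The first visited vertex is the LAST one in the elimination order,
    so the elimination order is the reverse of the returned list.
    """
    labels = {v: [] for v in adj}  # positions of visited neighbors, descending
    visit_order = []
    for position in range(len(adj), 0, -1):
        # Python compares lists lexicographically, which is exactly the
        # tie-breaking rule of the search; remaining ties go to the first key.
        v = max(labels, key=lambda u: labels[u])
        visit_order.append(v)
        del labels[v]
        for w in adj[v]:
            if w in labels:
                labels[w].append(position)
    return visit_order


def elimination_order(adj):
    """Elimination order produced by LexBFS (reverse of the visit order)."""
    return list(reversed(lex_bfs(adj)))


def is_perfect_elimination_order(adj, order):
    """True if eliminating vertices in `order` never needs a fill-in edge."""
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [w for w in adj[v] if pos[w] > pos[v]]
        if any(y not in adj[x] for x, y in combinations(later, 2)):
            return False
    return True


# Hypothetical examples: a chordal graph and a chordless 4-cycle.
CHORDAL = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}, 5: set()}
FOUR_CYCLE = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

Combined with Lemma A.1, this gives a chordality test: run the search and check whether the resulting order is perfect. For the 4-cycle the check fails for every order, because the first eliminated vertex always has two non-adjacent neighbors.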
Given these results, we can proceed with our proofs:

Theorem A.1 The graph constructed by adding an edge between two nodes $a$ and $b$ that satisfy the properties described in Theorem 2.2 for some chordal graph $G$ is chordal.

Proof: We show that there exists an elimination order that can be generated by the Lexicographic Breadth-First Search algorithm and does not require addition of any edges (and hence, the new graph is chordal). Let the original graph be $G$, the new graph be $G'$, and $S$ the separator between $a$ and $b$. Note that $S$ must be a clique. Also, let the neighbors of […] be […], where […] are the neighbors of […] not connected to […].

Now consider the following partial elimination order for $G$: […]. We claim that this order can be achieved by the LexBFS algorithm, starting with […], if the ties are broken correctly, and hence it is part of a perfect elimination order. The node just preceding […] can be any neighbor of […] and hence we can choose it to be […]. Now note that […] is connected to both […] and […] and, as such, we can break the next tie in its favor irrespective of the rest of the nodes, and so on for […]. Now, after this is done, the rest of the neighbors of […] will follow in some order (this order is not relevant to us and, without loss of generality, we can assume it to be […]).

This partial elimination order can be replaced by […] without compromising the "perfectness" of the elimination order, since it does not add any new edges unless the earlier elimination order is not perfect. Finally, this elimination order is perfect for $G'$ as well and hence $G'$ is chordal.

Footnote: Recently, [GG99] independently derived a characterization for the edges that can be added for forward selection in terms of a junction tree of the model. Their characterization can be shown to be equivalent to our characterization.
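Theorem A.2 Given two chordal graphs, $G = (V, E)$ and $G' = (V, E')$, with $E \subset E'$, let […]; it is possible to reach $G'$ from $G$ by repeated application of Theorem 2.2.

Proof: This theorem follows from the proof that the backward selection procedure is exhaustive [FL89].

B Proof of Correctness of the Clique Graph Update Algorithm

For simplicity, we again assume that $C_a$ and $C_b$ are not subsets of $C_{ab}$.

B.1 Correctness of Step 2:

Theorem B.1 Given an edge $(C_1, C_2)$ of $CG$ with separator $S_{12}$, if $S_{12} \neq S_{ab}$, then this edge exists in $CG'$.

Proof: Consider the graph $G \setminus S_{12}$ (Figure 4). Since $S_{12}$ is a separator for the node sets $C_1 \setminus S_{12}$ and $C_2 \setminus S_{12}$, there will be at least two connected components in this graph. Let $V_1$ consist of all the nodes reachable from some node in $C_1 \setminus S_{12}$, let $V_2$ consist of the nodes reachable from some node in $C_2 \setminus S_{12}$, and let $V_3$ consist of the rest of the nodes. So […] and […]. Clearly, unless $a \in V_1$ and $b \in V_2$ (or vice versa), the addition of the edge $(a, b)$ does not affect the edge $(C_1, C_2)$.

[Figure 4: Separation of […].]

Now, assume $a \in V_1$ and $b \in V_2$. In that case, $S_{12}$ separates $a$ and $b$ and hence $S_{ab} \subseteq S_{12}$ (this follows since $S_{ab}$ is the minimal separator for the nodes $a$ and $b$). But, since $C_1 \setminus S_{12}$ is reachable from $a$ and $C_2 \setminus S_{12}$ is reachable from $b$, $S_{ab}$ separates $a$ and $b$ only if $S_{ab}$ also separates $C_1 \setminus S_{12}$ and $C_2 \setminus S_{12}$. Hence, $S_{12} \subseteq S_{ab}$. Therefore $S_{12} = S_{ab}$, which contradicts our assumption.

Theorem B.2 Given an edge $(C_1, C_2)$ of $CG$ with separator $S_{12}$, if $S_{12} = S_{ab}$, then this edge is not in $CG'$ only if $C_1 \setminus S_{ab}$ is connected to $a$ and $C_2 \setminus S_{ab}$ is connected to $b$, or vice versa.

Proof: Let $V_1$, $V_2$ and $V_3$ be defined as in the preceding proof. This theorem follows from the observation that unless $a \in V_1$ and $b \in V_2$ (or vice versa), the addition of the edge $(a, b)$ does not have any effect on separability of $C_1$ and $C_2$.

B.2 Correctness of Step 3:

We use the following property of clique graphs in proving correctness of this step:

Lemma B.1 [GHP95] Given a clique graph […], let […] be such that […], then: […]

The following theorem completes the proof of correctness for this step:

Theorem B.3 […] and […], then one of the following is true: […] and […] and […]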