Decreasingly naive Bayes: Aggregating n-dependence estimators

Geoffrey I. Webb, Janice R. Boughton, Fei Zheng, Kai Ming Ting, Houssam Salem
Faculty of Information Technology, Monash University, VIC 3800, Australia
{Geoff.Webb, Janice.Boughton, Kaiming.Ting}@monash.edu

KNOWSYS Lab Technical Report 2010-01, Monash University, 2010

Abstract

Averaged n-Dependence Estimators (AnDE) is an approach to probabilistic classification learning that learns without search. It utilizes a single parameter that transforms the approach between a low-variance high-bias learner (naive Bayes) and a high-variance low-bias learner with Bayes optimal asymptotic error. It extends the underlying strategy of Averaged One-Dependence Estimators (AODE), which relaxes the naive Bayes independence assumption while retaining many of naive Bayes' desirable computational and theoretical properties. AnDE further relaxes the independence assumption by generalizing AODE to higher levels of dependence. Extensive experimental evaluation shows that the bias-variance trade-off for Averaged 2-Dependence Estimators results in strong predictive accuracy over a wide range of data sets. It has training time linear with respect to the number of examples, supports incremental learning, handles missing values directly, and is robust in the face of noise. Beyond the practical utility of its lower-order variants, AnDE is of interest in that it demonstrates that it is possible to create low-bias high-variance generative learners and suggests strategies for developing even more powerful classifiers.

1 Introduction

This paper presents an alternative to the classical classification learning paradigm of learning as search. This alternative paradigm supports learning without search. Given the large number of existing classification learning algorithms, the reader would be forgiven for wondering why there is a need for a new algorithm, let alone for another learning paradigm and resulting family of algorithms. This alternative paradigm is of potential interest to the community for both theoretical and practical reasons.

The paradigm is of theoretical interest because it shows that there is a fundamental alternative to the dominant approach to classification learning. The dominant approach performs search through a hypothesis space to identify the hypothesis that optimizes some objective function with respect to the training data. This alternative performs no search.

The paradigm is of practical interest because it gives rise to a family of algorithms with a unique combination of features that is well suited to many applications. We discuss these features in more detail below. Notable amongst them are training complexity linear with respect to the number of training examples; direct capacity for incremental learning; and accuracy that is competitive with the state-of-the-art.

The family contains a range of algorithms that range from low variance coupled with high bias through to high variance coupled with low bias. Successive members of the family will be best suited to differing quantities of data, starting with low variance for small data, with successively lower bias but higher variance suiting ever increasing data quantities [1]. The asymptotic error of the lowest bias variant is Bayes optimal.

One member of this family of algorithms, naive Bayes (NB), is already well known. A second member, Averaged One-Dependence Estimators (AODE) [2], has enjoyed considerable popularity since its introduction in 2005 [3-25]. The work presented in this paper arises from the realization that NB and AODE are but two instances of a family of algorithms, which we call AnDE.
The AnDE algorithms, including NB and AODE, operate by a fundamentally different principle to the majority of classification learning algorithms. Rather than performing search to select a hypothesis about the relationship between the attributes and the class, the AnDE algorithms utilize a predefined function to extrapolate from observed low-level relationships to the required multivariate relationship.

In Section 2 we explain how it is possible to learn without search, and define the AnDE family of algorithms. The AnDE family of algorithms builds upon the method pioneered by AODE [2]. In Section 3 we discuss how the AnDE algorithms relate to Feating [26], a generic approach to ensembling that also builds upon techniques pioneered by AODE. In Section 4 we present an extensive evaluation of the AnDE family of algorithms, comparing their performance to relevant Bayesian techniques, to Feating and to the state-of-the-art Random Forest classifier. Section 5 presents conclusions and directions for future research.

2 Learning without search: The AnDE family of algorithms

We wish to estimate from a training sample T of classified objects the probability P(y | x) that an example x = \langle x_1, ..., x_a \rangle belongs to class y, where x_i is the value of the ith attribute and y \in {c_1, ..., c_k}, the k classes. We use v to denote the average number of values per attribute. From the definition of conditional probability we have

P(y | x) = P(y, x) / P(x).    (1)

As P(x) = \sum_{i=1}^{k} P(c_i, x), we can always estimate Eq. 1 from estimates of P(y, x) for each class using

P(y, x) / P(x) = P(y, x) / \sum_{i=1}^{k} P(c_i, x).    (2)

In consequence, in the remainder of this paper we consider only the problem of estimating P(y, x), thereby setting the work in a generative framework.
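The move from Eq. 1 to Eq. 2 is what allows every member of the family to work purely with joint-probability estimates. As a minimal illustration (this is not code from the paper, and the function names are ours), the following Python sketch normalizes per-class estimates of P(y, x) into posterior estimates and picks the most probable class:

    def posteriors(joint):
        """Normalize per-class estimates of P(y, x) into estimates of P(y | x), as in Eq. 2."""
        total = sum(joint.values())
        return {y: p / total for y, p in joint.items()}

    def classify(joint):
        """The class maximizing the joint estimate also maximizes the posterior estimate."""
        return max(joint, key=joint.get)

    print(posteriors({"c1": 0.006, "c2": 0.002}))  # {'c1': 0.75, 'c2': 0.25}
    print(classify({"c1": 0.006, "c2": 0.002}))    # c1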
We define the order of a probability or probability estimate as the number of attributes in the distribution to which the probability or estimate relates. Hence, the order of P(c_i, x) is a + 1. If the training data does not contain sufficient examples of x to directly derive accurate estimates of each P(c_i, x), we must extrapolate these estimates from observations of lower-order statistics of the data.

All other things being equal, an estimate of a lower-order probability from a given finite training set will be more accurate than an estimate of a higher-order probability, and estimates of higher-order probabilities will vary more from training sample to training sample. Hence, models derived from lower-order probability estimates are likely to have lower variance than models derived from higher-order probability estimates. On the other hand, models derived from higher-order probabilities are likely to have lower bias, as less restrictive assumptions are made about the form of the probability distribution.

This is illustrated in Figure 1, which shows a simple attribute space with three ternary attributes and a binary class. To classify a new object with attribute values Age=Old, Pulse=Slow and Temperature=High, one wishes to infer the class distribution in the cell highlighted in Figure 1(a), which is a fourth-order probability distribution. If there are insufficient examples to directly estimate that distribution, it might be extrapolated from any of a number of lower-order probability distributions. The prior class distribution P(y) is a first-order probability distribution that can be estimated from the entire attribute space (Figure 1(b)). The second-order probabilities P(y ∧ Age=Old), P(y ∧ Pulse=Slow) and P(y ∧ Temperature=High) can be estimated from the regions depicted in Figure 1(c-e). The regions associated with the third-order probabilities P(y ∧ Age=Old ∧ Pulse=Slow), P(y ∧ Age=Old ∧ Temperature=High) and P(y ∧ Pulse=Slow ∧ Temperature=High) are illustrated in Figure 1(f-h).

Figure 1: Statistics of varying order for an attribute space with three ternary attributes and a binary class. Panels (a)-(h) depict the cell and regions corresponding to the probabilities listed above.

By application of the product rule we have the following.

P(y, x) = P(y) P(x | y)    (3)

If the number of classes, k, is small, it should be possible to obtain a sufficiently accurate estimate of P(y) from the sample frequencies. However, we still have the problem that x may not occur sufficiently frequently in the training data, and hence accurate estimates of P(x | y) cannot be obtained directly from the sample frequencies.

The solution used by NB is to extrapolate to P(x | y) from each second-order probability P(x_i | y) by assuming the attributes are independent given the class. From this assumption it follows that

P(x | y) = \prod_{i=1}^{a} P(x_i | y).    (4)

Hence we classify using

NB(y, x) = P(y) \prod_{i=1}^{a} P(x_i | y).    (5)

With reference to Figure 1, NB estimates the distribution in (a) by extrapolation from the distributions in (b) (giving P(y)), (c) (giving P(Age=Old | y)), (d) (giving P(Pulse=Slow | y)) and (e) (giving P(Temperature=High | y)).

The independence assumption is a very strong assumption about the underlying probability distribution. As a result, NB has very high bias. However, due to the low order of the base statistics from which the model is estimated, it has low variance.
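To make Eqs. 4 and 5 concrete, here is a minimal Python sketch of NB as a search-free learner built purely from frequency counts. It is illustrative only, not the authors' Weka implementation, and it uses raw relative frequencies rather than the m-estimates described in Section 4:

    from collections import Counter

    def train_nb(examples):
        """Collect the counts NB needs: class counts and (attribute, value, class) counts."""
        class_counts, pair_counts = Counter(), Counter()
        for x, y in examples:                      # x is a tuple of attribute values, y the class label
            class_counts[y] += 1
            for i, xi in enumerate(x):
                pair_counts[(i, xi, y)] += 1
        return class_counts, pair_counts

    def nb_score(x, y, class_counts, pair_counts):
        """Estimate P(y, x) = P(y) * prod_i P(x_i | y) from relative frequencies (Eqs. 3-5)."""
        n = sum(class_counts.values())
        score = class_counts[y] / n
        for i, xi in enumerate(x):
            score *= pair_counts[(i, xi, y)] / class_counts[y]
        return score

    # Usage: pick the class maximizing the estimated joint probability.
    examples = [(("old", "slow"), "sick"), (("old", "fast"), "well"), (("young", "slow"), "well")]
    cc, pc = train_nb(examples)
    print(max(cc, key=lambda y: nb_score(("old", "slow"), y, cc, pc)))  # sick

Training is a single counting pass over the data, which is why learning is linear in the number of training examples and naturally incremental.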
2.1 AODE

Averaged One-Dependence Estimators (AODE) [2] extends to third-order probabilities NB's search-free strategy of extrapolation from lower-order probabilities. It does so by averaging the estimates of all of a class of third-order estimators.

A Super-Parent One-Dependence Estimator (SPODE) is a 1-DBC (third-order probability estimator) that relaxes the assumption of conditional independence by making all other attributes independent given the class and one privileged attribute, the super-parent, x_\alpha. This is a weaker conditional independence assumption than NB's, as it is necessarily true if NB's is true and may also be true when NB's is not. It uses

P(y, x) = P(y, x_\alpha) P(x | y, x_\alpha)    (6)

together with an independence assumption that entitles

P(x | y, x_\alpha) = \prod_{i=1}^{a} P(x_i | y, x_\alpha).    (7)

As this is a weaker assumption than that which entitles (4), the bias of the model should be lower than that of NB. However, it is derived from higher-order probability estimates and hence its variance should be higher.

AODE exploits the lower bias of SPODEs while addressing their higher variance by averaging over all estimates of P(y, x) produced by using different super-parents. AODE seeks to use

P(y, x) = \sum_{\alpha=1}^{a} P(y, x_\alpha) P(x | y, x_\alpha) / a.    (8)

However, in practice it is desirable to only use estimates of probabilities for which relevant examples occur in the data. Hence, AODE actually uses

AODE(y, x) = \sum_{\alpha=1}^{a} \delta(x_\alpha) P(y, x_\alpha) P(x | y, x_\alpha) / \sum_{\alpha=1}^{a} \delta(x_\alpha)  if \sum_{\alpha=1}^{a} \delta(x_\alpha) > 0, and NB(y, x) otherwise,    (9)

where \delta(x_\alpha) is 1 if attribute value x_\alpha is present in the data, and otherwise 0. That is, it averages over all super-parents whose value occurs in the data, and defaults to NB if there are no such super-parents.

As AODE uses all of a predefined family of estimators, each of which extrapolates the desired high-order probability from lower-order probabilities, it does not perform search. In terms of the example attribute space, AODE extrapolates to Figure 1(a) from the lower-order statistics illustrated in Figure 1(c-h), with (f) conditioned on (c) and (d), (g) conditioned on (c) and (e), and (h) conditioned on (d) and (e).

AODE has demonstrated strong prediction accuracy (both zero-one loss and mean squared error) with relatively modest computational requirements for low-dimensional data [2]. In consequence, it has enjoyed substantial uptake [2, 3, 5-12, 14, 16, 18, 19, 21].

2.2 AnDE

In this paper we generalize to higher-order probabilities the strategy of search-free extrapolation from lower-order probabilities. For notational convenience we define

x_{i,j,...,q} = \langle x_i, x_j, ..., x_q \rangle.    (10)

For example, x_{2,3,5} = \langle x_2, x_3, x_5 \rangle. AnDE aims to use

AnDE(y, x) = \sum_{s \in S^n} P(y, x_s) P(x | y, x_s) / \binom{a}{n},    (11)

where S^n indicates all subsets of size n of the set {1, ..., a}.

However, in practice we also need to avoid using pairs of super-parents whose values do not occur in the data, and hence use

AnDE(y, x) = \sum_{s \in S^n} \delta(x_s) P(y, x_s) P(x | y, x_s) / \sum_{s \in S^n} \delta(x_s)  if \sum_{s \in S^n} \delta(x_s) > 0, and A(n-1)DE(y, x) otherwise.    (12)

Attributes are assumed independent given the super-parents and the class. Hence, P(x | y, x_s) is estimated by

P(x | y, x_s) = \prod_{i=1}^{a} P(x_i | y, x_s).    (13)

Note that P(x_i | y, x_s) = 1 when x_i \in x_s, and that smoothed estimates should not be used in this case. A0DE = NB and A1DE = AODE.

In terms of the simple attribute space depicted in Figure 1, A2DE extrapolates to (a) from (a) conditioned on each of (f), (g) and (h), and A3DE makes inferences directly from the class distribution in (a).

S^a = {{1, ..., a}} and hence when n = a, x_s = x. Therefore, the ultimate expression of AnDE, AaDE, seeks to classify using

AaDE(y, x) = P(y, x) P(x | y, x) / \binom{a}{a},    (14)

where P(y, x) is estimated directly from D, cascading to ever lower dependence estimators should the combination of attribute values not be present in D. As P(x | y, x) and \binom{a}{a} both equal 1, it classifies using only its direct estimate of P(y, x).

Observation 1. The asymptotic classification performance of AaDE equals that of the Bayes optimal classifier.

Proof 1. AaDE classifies using

argmax_y ( \hat{P}(y, x) / \sum_{z \in Y} \hat{P}(z, x) ),

where each \hat{P}(.) is directly estimated from the observed data and hence approaches P(.) as the quantity of data approaches infinity. Hence, in the limit, AaDE approaches

argmax_y ( P(y, x) / \sum_{z \in Y} P(z, x) ),

which is the Bayes optimal classifier.

However, assuming there is sufficient data to compute the necessary statistics, and we wish to store the necessary statistics rather than computing them as required for classification, the space complexity of AaDE is O(k \prod_{i=1}^{a} v_i). This is because joint frequencies must be stored for every combination of attribute value and class value. Except in cases of low-dimensional data, even the computational requirements of A3DE defeat our Weka implementation, and hence in this paper we present only results for A2DE with some illustrative examples of A3DE.

AnDE forms an (n+2)-dimensional probability table containing the observed frequency for each combination of n+1 attribute values and the class values. The space complexity of the table is O(k \binom{a}{n+1} v^{n+1}) and the time complexity of compiling it is O(t \binom{a}{n+1}), where t is the number of training examples, as we need to update each entry for every combination of the n+1 attribute values for every instance. The time complexity of classifying a single example is O(k a \binom{a}{n}), as we need to consider each attribute for every qualified combination of n parent attributes within each class.

We demonstrate that as n increases, averaged n-dependence estimators achieve lower bias at the cost of higher variance. In consequence, the ideal order of dependence will depend on the degree to which the underlying probability distribution fits the assumptions of the n-dependence estimator, the quantity of data available to estimate the base probabilities, and the computational demands of averaging over higher-order estimators.
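The definition in Eqs. 11-13 translates almost directly into code. The sketch below is an illustrative Python rendering, not the authors' implementation: it assumes raw relative frequencies (no m-estimation) and simply skips unseen super-parent combinations rather than cascading to A(n-1)DE as Eq. 12 specifies:

    from collections import Counter
    from itertools import combinations

    def train_ande(examples, n):
        """Count the class jointly with each size-n attribute-value set, and with each further attribute value."""
        class_counts, parent_counts, child_counts = Counter(), Counter(), Counter()
        for x, y in examples:
            class_counts[y] += 1
            for s in combinations(range(len(x)), n):
                xs = tuple((i, x[i]) for i in s)
                parent_counts[(y, xs)] += 1            # supports estimates of P(y, x_s)
                for i, xi in enumerate(x):
                    child_counts[(y, xs, i, xi)] += 1  # supports estimates of P(x_i | y, x_s)
        return class_counts, parent_counts, child_counts

    def ande_score(x, y, n, class_counts, parent_counts, child_counts):
        """Average, over size-n super-parent sets, P(y, x_s) * prod_i P(x_i | y, x_s) (Eqs. 11-13)."""
        total = sum(class_counts.values())
        estimates = []
        for s in combinations(range(len(x)), n):
            xs = tuple((i, x[i]) for i in s)
            if parent_counts[(y, xs)] == 0:
                continue                               # ignore super-parents not seen in the data
            est = parent_counts[(y, xs)] / total
            for i, xi in enumerate(x):
                if i not in s:                         # P(x_i | y, x_s) = 1 when x_i is a super-parent
                    est *= child_counts[(y, xs, i, xi)] / parent_counts[(y, xs)]
            estimates.append(est)
        return sum(estimates) / len(estimates) if estimates else 0.0

With n = 0 this reduces to the NB sketch above, and with n = 1 it corresponds to AODE (Eq. 9) without its fallback to NB.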
2.3 Weighted averaging

AODE and its generalization AnDE perform an unweighted average of the component n-dependence estimators. It has been demonstrated that weighted averaging can improve upon the accuracy of AODE's estimates [27, 28, 29]. The empirical evidence suggests that the Bayesian model averaging of Maximum a Posteriori Linear Mixture of Generative Distributions (MAPLMG) is the most effective of current approaches [27, 29]. It seems likely that similar approaches will be equally effective with n-dependence estimators.

It is notable that the introduction of Bayesian model averaging to the AnDE framework introduces both search and discriminative learning, as a search is performed for the set of weights that optimize the posterior probabilities relative to the training data. Doing so can be expected to reduce bias at the cost of introduction of variance. One of the interesting questions that this paper investigates is the relative payoff for the investment of additional computation in either performing Bayesian model averaging on AnDE or increasing n and using A(n+1)DE. Both approaches can be expected to reduce bias at the cost of an increase in both variance and computation. Which provides the more effective trade-off?

2.4 Tree Augmented Naive Bayes

An n-dependence Bayesian classifier (n-DBC) [30] is a Bayesian network in which each attribute depends upon the class and at most n other non-class attributes. An n-DBC uses (n+2)-order probabilities. Within this framework, NB is a 0-DBC, AODE is a 1-DBC and the full Bayesian classifier is an (a-1)-DBC.

An alternative to the AnDE approach to relaxing NB's independence assumption is to use search to select a single model that adds selected interdependencies between attributes. Tree Augmented Naive Bayes (TAN) [31] is a popular approach of this type. It uses conditional mutual information to select a best single parent for each attribute, in addition to the class. Thus, it is a 1-DBC.

It is interesting to consider how search for a single Bayesian classifier model compares with averaging over a class of Bayesian classifier models of the same level of dependence or of a higher level of dependence. This paper also investigates this issue.

3 Relationship to Feating

Feating [26] is a generic ensemble learning technique that also builds upon the ensembling strategy of AODE. Like AnDE, Feating operates by building a local model for each combination of n attribute values. To classify a new instance, Feating applies all applicable local models and aggregates the results by performing a majority vote of the resulting classifications.

AnDE is similar to Feating NB. However, Feating aggregates the predictions of its base learners by taking the mode of the class predictions. For probabilistic classifiers, these correspond to the maximum posterior probability. In contrast, AnDE uses the ensemble to estimate the joint probability P(x, y) for each class, and then calculates its estimate of the posterior probability from this ensembled estimate of the joint probability. A generic ensembling technique, such as Feating, cannot work by calculating an ensemble estimate of the joint probabilities because many classifiers do not produce appropriate probability estimates.

Despite the close relationship to Feating, AnDE is worthy of study in its own right for three reasons.

First, irrespective of which aggregation method is used, coupling the search-less ensembling strategy embodied by Feating with the search-less base learner NB creates a learner that can deliver low bias without search. Hence, AnDE provides an example of an alternative to the traditional search-based learning paradigm which is able to deliver low-bias classifiers.

Second, as already noted, AnDE utilizes a different aggregation method to Feating. It is interesting to examine the consequences of these differences. Cerquides and de Mantaras [27] found that weighted ensembles of joint probability estimates achieved lower error than weighted ensembles of posterior probability estimates, so there is some evidence that the outcomes may be substantially different.

Third, as there is overlap in the information required by each of its local models, AnDE can use a single compiled matrix of joint frequencies, saving considerable space relative to storing all of the local models independently. The space complexity of an AnDE model is O(k \binom{a}{n+1} v^{n+1}), whereas the space complexity of Feating NB to level n is O(k \binom{a}{n} (a-n) v^{n+1}), which is (n+1) times the space complexity of AnDE. Most base models formed by Feating will not have this property, and hence AnDE is a notable special case.
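To give a rough sense of scale for these complexity expressions, the following sketch evaluates k \binom{a}{n+1} v^{n+1}, the number of entries in the AnDE frequency table, for a hypothetical data set with 2 classes, 20 attributes and 3 values per attribute; the figures are purely illustrative and do not appear in the paper:

    from math import comb

    def ande_table_size(k, a, v, n):
        """Entries in the AnDE joint-frequency table: k * C(a, n+1) * v^(n+1)."""
        return k * comb(a, n + 1) * v ** (n + 1)

    for n in range(4):
        print(n, ande_table_size(k=2, a=20, v=3, n=n))
    # 0 -> 120, 1 -> 3420, 2 -> 61560, 3 -> 784890: the rapid growth in n is why only
    # low-order members of the family are practical for higher-dimensional data.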
4 Evaluation

In this section, we evaluate the efficacy of AnDE. Due to the relatively high time complexity of higher-order estimators, the highest level of AnDE with which we perform detailed assessment is A2DE. The primary metrics we use are bias, variance, zero-one loss and RMSE. To assess computational overheads we use total training and classification times divided by the number of examples.

We first study the performance of NB, AODE and A2DE to reveal how performance varies as n increases within the AnDE framework. TAN and MAPLMG are studied to show how the search-free generative AnDE strategy compares with, respectively, discriminative search for a single Bayesian network classifier of the same order of dependence, and discriminative search for a weighted classifier of the next lower order of dependence. We also compare A2DE with variants that calculate the mean posterior probability, rather than the mean joint probability of the submodels (PA2DE), and that perform Feating of NB by taking the mode of the class predictions of the submodels (FA2DE). Finally, to explore how the classification performance of A2DE compares to state-of-the-art classifiers, we also study Random Forest [32] with ten trees (RF10) and Random Forest with 100 trees (RF100).

We compare these algorithms implemented in the Weka workbench [33] on the 62 data sets described in Table 1 that have been used previously in related research [2, 34, 35, 36, 37, 38].

Table 1: Data sets used for experiments. The 62 data sets are listed with, for each, the number of cases, attributes and classes (columns: No., Domain, Case, Att, Class).

Each algorithm is tested on each data set using the repeated cross-validation bias-variance estimation method [39]. In order to maximize the variation in the training data from trial to trial, we use two-fold cross-validation. To minimize the variance in our measurements we report average values over 50 cross-validation trials. We also form learning curves for NB, AODE, A2DE and A3DE on the Adult data set to further investigate how increasing n within the AnDE framework affects performance as the quantity of data increases.

The current implementations of AODE and A2DE are limited to categorical data. A number of approaches have been developed for extending AODE to numeric data [40]. These could be generalized to the AnDE framework, but how best to do so is a matter for future research. Hence, we assess only the relative capacities of these algorithms with respect to categorical data. To this end, all numeric attributes are discretized. When MDL discretization [41], a common discretization method for NB, is used to discretize quantitative attributes within each cross-validation fold, many attributes have only one value. In these experiments, we discretize quantitative attributes using three-bin equal-frequency discretization prior to classification.

The base probabilities are estimated using m-estimation [42] (m = 1), as it often appears to lead to more accurate probabilities than Laplace estimation for NB and AODE. An exception is that we always use 1.0 for P(x_i | y, x_s) when x_i \in x_s.
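For readers unfamiliar with m-estimation, a common form of the m-estimate of a conditional probability is sketched below, with the prior taken as uniform over the v_i values of the attribute. The paper fixes m = 1 but does not spell out the exact formula used, so treat this particular form as an assumption:

    def m_estimate(count_xy, count_y, num_values, m=1.0):
        """m-estimate of P(x_i | y): (n(x_i, y) + m * p) / (n(y) + m), with uniform prior p = 1 / num_values."""
        return (count_xy + m / num_values) / (count_y + m)

    print(m_estimate(3, 10, 3))  # 0.3030..., versus 0.3 for the raw relative frequency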
The above experiments were conducted on a single-CPU, single-core virtual Linux machine running on a Dell PowerEdge 1950 with dual quad-core Intel Xeon E5410 processors running at 2333 MHz with 32 GB of RAM. Due to technical issues including memory leaks in the Weka implementation of Random Forest, it was not possible to complete all 50 runs of 2-fold cross-validation for RF10 on Covertype and RF100 on Covertype and Census-Income (KDD). These experiments were instead completed on a Linux cluster of Xeon 2.8 GHz CPUs, an environment that does not allow reliable time measurements to be taken. For RF10 and RF100 on Covertype, compute times were estimated by averaging over those runs that could be completed on the virtual machine. No runs could be completed on the virtual machine for RF100 on Census-Income (KDD), and so no time results are reported.

Average values for each combination of metric, algorithm and data set are provided in the Appendix. Summary results are provided in the text.

4.1 Varying n within AnDE

We first consider the relative performance of the three variants of AnDE. For each performance measure, the number of data sets for which A2DE has lower, equal or higher outcomes relative to AODE and NB are summarized into win/draw/loss records, and likewise for AODE relative to NB. These results are presented in Table 2.

Table 2: Win/Draw/Loss: AnDE, n = 0, 1 and 2, on all 62 data sets

                 A2DE vs AODE      A2DE vs NB        AODE vs NB
                 W/D/L      p      W/D/L      p      W/D/L      p
  Bias           47/0/15    0.001  49/2/11    0.001  48/0/14    0.001
  Variance       19/1/42    0.001  15/0/47    0.001  20/1/41    0.005
  Zero-one loss  33/2/27    0.259  42/1/19    0.002  44/1/17    0.001
  RMSE           35/1/26    0.153  15/0/47    0.001  49/1/12    0.001

As expected, we see that increasing n from 0 (NB) to 1 (AODE) to 2 (A2DE) consistently decreases bias at the cost of an increase in variance. As we believe that different bias and variance profiles suit different data quantities [1], we believe that the zero-one loss and RMSE results tell us as much about the composition of the data collection as they do about the algorithms. Specifically, we contend that whether one algorithm or another will win on a given data set is determined by how well the two algorithms' learning biases match the underlying distribution, by their variance, and by the quantity of data. A low-variance algorithm will usually have an advantage for small data while a low-bias algorithm will usually be advantaged by large data. For our data sets, both AODE and A2DE reduce both zero-one loss and RMSE significantly often relative to NB. While A2DE obtains lower zero-one loss and RMSE than AODE more often than the reverse, this difference is not significant.
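The p values reported alongside these win/draw/loss records are consistent with a one-tailed binomial sign test in which draws are excluded; for instance, 33 wins to 27 losses yields p of roughly 0.259, matching the zero-one loss entry for A2DE vs AODE. The test is not stated explicitly in this section, so the sketch below is a plausible reconstruction rather than the authors' code:

    from math import comb

    def sign_test_p(wins, losses):
        """One-tailed binomial probability of at least `wins` successes in wins + losses fair trials."""
        n = wins + losses
        return sum(comb(n, i) for i in range(wins, n + 1)) / 2 ** n

    print(round(sign_test_p(33, 27), 3))  # 0.259
    print(round(sign_test_p(47, 15), 3))  # 0.0, far below the 0.001 reported for the bias comparison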
To investigate in greater detail our expectation that algorithms with lower variance will be advantaged for small data and those with lower bias for larger data, we form learning curves for Adult, replicating the method of Webb et al. [2]. 1000 objects are selected at random as a test set and training sets are sampled from the remaining objects. The training set size starts from 23 and then doubles up to 47104, this being a progression that ends with as close to all the available data as possible once the 1000 test cases are removed. This process is repeated 50 times and each algorithm is evaluated on the resulting training-test set pairs. The learning curves of zero-one loss and RMSE for NB, AODE, A2DE and A3DE are presented in Figure 2.

Figure 2: Zero-one loss and RMSE of NB and AnDE on the Adult data set, as a function of training set size.

The plots for zero-one loss clearly show the predicted trade-off for increasing n. At the smallest data size, where low variance is more important than low bias, zero-one loss is minimized by n = 0 (NB) and increases as n increases. At the largest data size, where low bias is most important, this order is reversed. A similar trend is shown with respect to RMSE, although the algorithms have not yet achieved their asymptotic rates at the largest data sizes available.
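A compact sketch of that learning-curve protocol follows; `evaluate` stands in for training a classifier on the sampled training set and measuring its error on the held-out test set, and is not a function defined in the paper:

    import random

    def learning_curve(dataset, evaluate, trials=50, test_size=1000, start=23):
        """Hold out a random test set, then evaluate on training sets whose size doubles from `start`."""
        results = {}
        for _ in range(trials):
            data = list(dataset)
            random.shuffle(data)
            test, pool = data[:test_size], data[test_size:]
            size = start
            while size <= len(pool):
                results.setdefault(size, []).append(evaluate(pool[:size], test))
                size *= 2
        return {size: sum(scores) / len(scores) for size, scores in results.items()}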
It is interesting to see how the relative bias/variance trade-offs of increasing n play off when NB's attribute independence assumption holds. The LED data set has a specific configuration of attribute values for each class, making the attributes conditionally independent given the class. Each attribute has 10% noise added. AODE and A2DE are able to overfit this noise, leading to increased error. NB's 0-1 loss is 0.2627, AODE's is 0.2639 and A2DE's is 0.2667. These outcomes are with training set sizes of 500. Using the UCI data generator, we generated 10 LED data sets comprising 2,000 and 10 comprising 4,000 instances and repeated the cross-validation experiments thereon. For the data sets of 2,000, the training set size is 1,000 and the mean and standard deviation of the respective 0-1 loss is NB: 0.2603 ± 0.0099, AODE: 0.2601 ± 0.0101 and A2DE: 0.2603 ± 0.0102. For the data sets of 4,000, the training set size is 2,000 and the mean and standard deviation of the respective 0-1 loss is NB: 0.2597 ± 0.0049, AODE: 0.2598 ± 0.0051 and A2DE: 0.2603 ± 0.0053. It seems clear that increasing training set sizes rapidly reduce the magnitude of the error advantage that NB enjoys in this context where its conditional attribute assumption is satisfied.

As final confirmation that higher n is best suited to larger data, on the ten largest data sets, those with more than 8,000 examples, A2DE consistently achieves lower 0-1 loss and RMSE than AODE (p = 0.001).

As expected, both training and classification compute times increase as n increases. Figure 3 shows the grand averages for the per-example training and classification times for each algorithm.

Figure 3: Average per-example training and classification times for NB, AODE and A2DE.

4.2 Comparison with TAN

We wish to explore the relative benefits of discriminative search for a single best Bayesian classifier model against AnDE's search-free approach of averaging over a class of Bayesian classifier models. To this end we compare AODE and A2DE with TAN. Table 3 presents win/draw/loss results comparing AODE and A2DE to TAN.

Table 3: Win/Draw/Loss: AnDE, n = 1 and 2, vs TAN on all 62 data sets

                 A2DE vs TAN       AODE vs TAN
                 W/D/L      p      W/D/L      p
  Bias           34/0/28    0.263  20/1/41    0.005
  Variance       48/0/14    0.001  52/1/9     0.001
  Zero-one loss  48/0/14    0.001  43/1/18    0.001
  RMSE           43/1/18    0.001  40/1/21    0.010

TAN has an advantage in bias but a disadvantage in variance relative to AODE. When using search to select a single 1-DBC model is compared to averaging over a class of 2-DBCs, the bias advantage is lost but the variance disadvantage remains. The relative bias-variance trade-offs of AODE and TAN result in a general error advantage to AODE. When TAN loses its bias advantage by moving from AODE to A2DE, the error advantage of the AnDE classifier becomes even more consistent.

Figure 4 shows the relative training and classification times for AODE, A2DE and TAN. It is clear that A2DE has considerably greater computational requirements both for training and classification. However, this disadvantage disappears when we consider only the ten lowest-dimensional data sets, also illustrated in this figure.

Figure 4: Average per-example training and classification times for AODE, A2DE and TAN. Two values are shown for each algorithm, the average across all data sets and the average across the ten lowest-dimensional (4-7 attributes) data sets.

4.3 Comparison with MAPLMG

As discussed above, we wish to investigate the relative payoffs obtained by investing additional computation to that required by AODE by respectively using discriminative learning of weights or, alternatively, increasing the order of the probabilities from which the posterior probability is extrapolated. To this end, Table 4 presents win/draw/loss results comparing A2DE and AODE to MAPLMG.

Table 4: Win/Draw/Loss: AnDE, n = 1 and 2, vs MAPLMG on all 62 data sets

                 A2DE vs MAPLMG    AODE vs MAPLMG
                 W/D/L      p      W/D/L      p
  Bias           40/0/22    0.015  17/4/41    0.001
  Variance       19/1/42    0.002  36/5/21    0.031
  Zero-one loss  30/1/31    0.500  22/4/36    0.043
  RMSE           34/1/28    0.263  19/0/39    0.006

As established by previous research [27], MAPLMG's approach of using discriminative learning of weights for the AODE linear combination significantly reduces bias relative to AODE at the cost of an increase in variance. However, relative to this discriminative approach to extrapolating from third-order probabilities, A2DE's searchless approach to extrapolating from fourth-order probabilities further reduces bias at the cost of an increase in variance. While the resulting difference in error is not significant across the full suite of 62 data sets, when the ten largest data sets are considered, the lower-bias algorithm, A2DE, consistently achieves lower 0-1 loss and RMSE than MAPLMG (p = 0.001).

MAPLMG's Bayesian model averaging comes at considerable cost in training time. Figure 5 shows the average per-example training and test times for AODE, A2DE and MAPLMG. Note that MAPLMG is implemented as an external function to Weka, and hence is likely to be inherently more efficient. The training and test times include a substantial fixed overhead, and hence the per-instance training times should decrease if the complexity is linear with respect to the training set size. However, MAPLMG's super-linear training complexity minimizes this effect, demonstrating that it will not be feasible to apply it to very large data.

Figure 5: Average per-example training and classification times for AODE, A2DE and MAPLMG. In addition to the times for all data sets, training times are shown for the ten largest data sets.
4.4 Comparison with Feating

To understand how the AnDE approach performs relative to Feating NB, we compare A2DE with a variant that calculates the mean of the posterior probabilities, rather than the mean of the joint probabilities (PA2DE), and one that calculates the mode of the classes predicted by the submodels (FA2DE). As described in Section 3 and the start of Section 4, these embody the two main differences between AnDE and Feating NB. Table 5 shows the win/draw/loss results comparing A2DE to these variants.

Table 5: Win/Draw/Loss: A2DE vs PA2DE and FA2DE

                 A2DE vs PA2DE     A2DE vs FA2DE
                 W/D/L      p      W/D/L      p
  Bias           41/2/19    0.001  46/1/15    0.001
  Variance       21/3/38    0.018  22/1/39    0.020
  Zero-one loss  33/1/28    0.304  36/2/24    0.078
  RMSE           30/0/32    0.449  49/0/13    0.001

It is clear that both variants have higher bias but lower variance than A2DE. It is straightforward to understand why Feating would have lower variance. The mode is a much more stable estimator of central tendency than the mean, which can be greatly influenced by a single outlier. It is less obvious why lower variance should result from averaging over the estimates of the posterior rather than of the joint probability. Nonetheless, the result is consistent with Cerquides and de Mantaras' [27] finding that a linear combination of joint probability estimates resulted in higher accuracy than a linear combination of posterior probability estimates. This remains an interesting unexplained phenomenon worthy of further investigation.

Over the full range of data sets these differences in bias and variance profiles do not result in significant differences on either measure of error, except with respect to RMSE for Feating. This reflects the manner in which Feating selects a single class rather than producing a distribution of class probabilities. Due to its lower bias, A2DE achieves lower 0-1 loss than PA2DE and FA2DE on the ten largest data sets (W/D/L vs PA2DE = 8/0/2, p = 0.054; W/D/L vs FA2DE = 8/1/1, p = 0.019). This outcome is statistically significant at the 0.05 level with respect to FA2DE, but narrowly misses out on being significant with respect to PA2DE. A2DE achieves lower RMSE than both the alternatives on nine of the ten largest data sets, and draws on the remaining data set, p = 0.002. Hence, the evidence is strongly suggestive that the AnDE approach is preferable for large data.

4.5 Comparison with the state-of-the-art

In addition to the relative performance of these related algorithms, it is useful to understand how the performance compares to well known examples of the state-of-the-art. We choose Random Forests [32] as the comparator algorithm because it is relatively unparameterized and hence readily produces clearly understood performance outcomes. We use Random Forests with both the default setting of 10 trees (RF10) and with 100 trees (RF100), allowing us to explore the relative computational/accuracy trade-offs.

Table 6 shows the win/draw/loss results for each of A2DE, AODE and NB against RF10 and RF100 for each of zero-one loss, bias, variance and RMSE.

Table 6: Win/Draw/Loss: AnDE, n = 0, 1 and 2, vs RF10 and RF100 on all 62 data sets

                       AnDE vs RF10      AnDE vs RF100
                       W/D/L      p      W/D/L      p
  A2DE  Bias           18/1/43    0.001  22/2/38    0.026
        Variance       57/0/5     0.001  45/1/16    0.001
        Zero-one loss  42/0/20    0.004  36/3/23    0.059
        RMSE           40/0/22    0.015  35/0/27    0.187
  AODE  Bias           16/0/46    0.001  20/0/42    0.004
        Variance       57/1/4     0.001  47/0/15    0.001
        Zero-one loss  41/0/21    0.008  33/1/28    0.304
        RMSE           39/0/23    0.028  34/0/28    0.263
  NB    Bias           14/1/47    0.001  16/1/45    0.001
        Variance       56/0/6     0.001  51/0/11    0.001
        Zero-one loss  33/0/29    0.352  30/1/31    0.500
        RMSE           30/0/32    0.450  28/0/34    0.263

All three levels of AnDE have higher bias but lower variance than both levels of Random Forest. This trade-off delivers lower error significantly more often than not for both A2DE and AODE relative to RF10. Both deliver lower error more often than RF100, but not significantly so. Notably, NB achieves higher error almost as often as lower relative to both RF10 and RF100. This illustrates the weaknesses of such 'bake-offs' with respect to error. As we have argued above, low-variance algorithms such as NB will be advantaged by the relatively small data sets used in this study.

To assess this effect, we repeated the error comparisons using only the ten largest data sets, those containing more than 8000 examples. The results are shown in Table 7. For these larger data sets, both RF10 and RF100 achieve lower error more often than all three of A2DE, AODE and NB, significantly so with respect to AODE and NB and when comparing RF100 to A2DE on 0-1 loss.

Table 7: Win/Draw/Loss: AnDE, n = 0, 1 and 2, vs RF10 and RF100 on large data sets

                       AnDE vs RF10      AnDE vs RF100
                       W/D/L      p      W/D/L      p
  A2DE  Zero-one loss  2/0/8      0.055  1/1/8      0.020
        RMSE           3/0/7      0.172  2/0/8      0.055
  AODE  Zero-one loss  0/0/10     0.001  0/0/10     0.001
        RMSE           1/0/9      0.011  0/0/10     0.001
  NB    Zero-one loss  0/0/10     0.001  0/0/10     0.001
        RMSE           0/0/10     0.001  0/0/10     0.001
However, Random Forest's error advantage for large data comes at a cost in training time. Figure 6 shows the training and classification times for AODE, A2DE, RF10 and RF100. It is apparent that, overall, RF100 has very high training times. While A2DE's training time does approach RF100's for high-dimensional data, for small data and low-dimensional data its training times are competitive with RF10. On the other hand, A2DE requires substantially more classification time on average than Random Forest. This requirement grows greatly with high-dimensional data. A2DE will not be feasible for classification of large numbers of high-dimensional objects. In contrast, its classification time is very competitive with low-dimensional data.

Figure 6: Average per-example training and classification times for AODE, A2DE, RF10 and RF100. Training times are presented for all, the ten largest (excluding Census-Income, for which RF100 could not be executed on a machine for which reliable times could be obtained, thus 5,620-581,012 examples), the ten lowest-dimensional (5-7 attributes) and the ten highest-dimensional (43-70 attributes) data sets. Classification times are presented for all, the ten lowest-dimensional and the ten highest-dimensional data sets.

5 Conclusions and Directions for Future Research

AnDE provides an attractive framework for developing machine learning techniques. A single parameter n controls a bias-variance trade-off such that n = a provides a classifier whose asymptotic error is the Bayes optimal error rate. However, for high-dimensional data only very low-order forms of AnDE are feasible. Nonetheless, we have established that higher-order variants are likely to deliver greater accuracy than lower-order alternatives when the number of training examples is high. In consequence, a promising direction for future research is to develop computationally efficient techniques for approximating AnDE for high values of n.

A further unresolved issue is how to select an appropriate value of n for any specific data set D. Are there more computationally efficient approaches than a simple wrapper-based comparison of each possible value?

A number of techniques have been developed for extending AODE to handle numeric data [40]. There is a need to extend this work to the more general AnDE framework.
We have presented a strategy for learning without search. We do not argue, however, that search should necessarily be avoided. Indeed, it has been demonstrated that both appropriate feature selection [43, 44] and weighting of the submodels [27, 28, 29] can reduce the error of AODE. Therefore, it is likely to be worthwhile to explore efficient methods for each of these strategies for higher values of n. If fast classification is required, and time for training is less constrained, approaches that use search to select a small number of submodels from an AnDE model are likely to be desirable. Where there is sufficient training time available, search for appropriate submodel weights is also likely to be useful.

We have developed a generative learning algorithm that generalizes the principles that underlie AODE to ever higher levels of dependence. It has the following desirable features:

- computational complexity is linear with respect to the number of training examples;
- direct prediction of class probabilities;
- integrated handling of missing values;
- robustness in the face of noise;
- other than the choice of which instantiation (choice of n) and choice of smoothing technique, the approach uses no tuneable parameters; there is no model selection;
- a simple mechanism controls the bias/variance trade-off;
- incremental learning;
- learning and classification can readily utilize parallel computation; and
- there is a direct theoretical basis that provides optimal prediction except insofar as clearly specified assumptions are violated.

A single parameter n provides control over a bias-variance trade-off, such that higher values of n are appropriate for greater numbers of training cases. AnDE demonstrates that it is possible to develop competitive learners without using search. Of further interest, this family of algorithms shows that it is possible to develop low-bias algorithms in a generative framework. Finally, A2DE proves to be a computationally tractable version of AnDE that delivers strong classification accuracy for large data without any parameter tuning.

Acknowledgements

This research has been supported by Australian Research Council grant DP110101427. We are grateful to Joao Gama and Kevin Korb for insightful discussions on this research and feedback on drafts of this paper.

A Detailed Results

Detailed results for bias, variance, 0-1 loss, RMSE, training time and classification time are presented in Tables 8 to 13. The data sets are listed in ascending order on number of instances.

References

[1] Brain, D., Webb, G.I.: The need for low bias algorithms in classification learning from large data sets. In: Proceedings of the Sixth European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2002), Berlin, Springer-Verlag (2002) 62-73
[2] Webb, G.I., Boughton, J., Wang, Z.: Not so naive Bayes: Aggregating one-dependence estimators. Machine Learning 58(1) (2005) 5-24
[3] Nikora, A.P.: Classifying requirements: Towards a more rigorous analysis of natural-language specifications. In: Proceedings of the Sixteenth IEEE International Symposium on Software Reliability Engineering, Washington, DC, USA, IEEE Computer Society (2005) 291-300
[4] Camporelli, M.: Using a Bayesian classifier for probability estimation: Analysis of the AMIS score for risk stratification in myocardial infarction. Diploma Thesis, Department of Informatics, University of Zurich (2006)
Table8:BiasDatasetNBAODEA2DEPA2DEFA2DETANMAPLMGRF10RF100Contact-lenses0.18970.19700.19670.20180.20020.32450.19130.17100.1728LungCancer0.34130.34230.34430.37020.37260.31740.34280.34540.3739Labornegotiations0.05550.06080.06500.06620.06540.08070.06270.10090.1011PostoperativePatient0.27300.28930.29490.29410.29230.27170.28780.28090.2907Zoo0.03820.03080.03110.03170.03150.05180.03080.03020.0329PromoterGeneSequences0.06920.09020.22990.16240.17370.08080.09340.09010.0576Echocardiogram0.26530.26780.26280.26130.26370.24160.26160.25000.2543Lymphography0.12860.12020.11640.11530.11810.12560.11850.13070.1373IrisClassification0.03080.02860.02960.03070.02930.02700.02880.03060.0308TeachingAssistantEvaluation0.35670.30750.29970.29990.30660.32800.31090.29690.3001Hepatitis0.13680.12620.12300.12760.12670.10080.12440.13070.1327WineRecognition0.04300.03870.04050.03910.04130.04060.03880.04340.0435AutoImports0.26150.17750.16800.17240.17400.19960.17680.13800.1362SonarClassification0.22270.16100.14220.15010.15080.14390.15930.12210.1344GlassIdentification0.21330.21240.21100.20750.21050.21420.21260.20410.2074New-Thyroid0.09970.09440.09600.09330.09270.10260.09470.09810.0998Audiology0.21930.21960.21930.21790.21790.18170.21960.15680.1712Hungarian0.12320.12250.12320.12400.12370.12620.12190.14580.1496HeartDisease(Cleveland)0.14900.14700.14540.14830.14830.13970.14680.15140.1530Haberman'sSurvival0.24160.24080.24130.24250.24200.24060.24180.23420.2369PrimaryTumor0.38110.38190.38100.38270.38150.37640.38330.37190.3824LiverDisorders(Bupa)0.32690.32090.32040.32130.32010.31370.32020.32030.3242Ionosphere0.17580.06880.06150.06330.06360.04760.06720.05470.0585Dermatology0.01070.01200.01230.01130.01160.01300.01190.02190.0195HorseColic0.17350.15270.14570.14460.14480.14120.15160.12830.1343HouseVotes840.09170.04710.03670.04360.04080.04500.04150.02990.0339CylinderBands0.15480.13170.13150.13780.13740.33260.12930.18620.2007Syncon0.10180.05360.04360.07280.07400.03000.04860.03840.0382BalanceScale0.13810.14530.15050.15030.15080.14320.14540.14290.1436CreditScreening0.12660.11970.11910.12040.12090.11810.11880.11370.1165BreastCancer(Wisconsin)0.02550.02760.02630.02610.02550.02690.02720.03070.0314PimaIndiansDiabetes0.24990.24060.22760.22590.22950.21890.23930.21070.2122Vehicle0.38490.29480.28210.28760.28540.26780.29010.24040.2458Annealing0.09080.06760.06710.06470.06540.06250.06730.09690.0986Tic-Tac-ToeEndgame0.25090.20100.07190.10030.11140.16930.19610.04000.0275Vowel0.33080.13700.11940.12000.12440.16740.11890.10540.1056LED0.22610.22680.22700.22720.22660.22410.22690.22610.2277German0.21000.20100.19700.20230.20350.18400.20120.18260.1977ContraceptiveMethodChoice0.46830.43390.40740.41680.41050.38300.42470.36820.3754Volcanoes0.50340.46300.45770.45770.45830.45570.46560.45390.4577CarEvaluation0.10610.05910.04110.03790.04170.04820.05200.03950.0388Segment0.16050.12110.11620.11770.11780.11040.11560.09250.0933Splice-junctionGeneSequences0.03830.03040.02660.05810.05690.03760.03020.03550.0259King-rook-vs-king-pawn0.10740.07160.05410.07220.08580.06260.04580.00790.0062Hypothyroid(Garavan)0.07430.07380.07370.07400.07420.07460.07390.07390.0741Sick-euthyroid0.06630.06240.06130.06180.06230.05890.06030.05900.0590Abalone0.47370.39250.39020.38600.39030.39070.39210.37450.3769SPAME-mail0.33220.33240.33230.33220.33220.33100.33240.33080.3312Waveform-50000.20700.16920.16250.18540.18790.15690.16290.13120.1531Nettalk(Phoneme)0.19570.14760.15120.13220.13730.18780.13610.10630.1098PageBlocks0.07210.07710.07740.07670.07810.07950.07760.07750.0776OpticalDigits0.08040.02860.02290.04100.04
210.03820.02790.02430.0221Mushrooms0.02050.00020.00000.00000.00000.00020.00020.00000.0000PenDigits0.15620.06010.04740.05160.05080.07700.05310.03300.0341Sign0.45300.40010.36610.36850.36780.38590.38330.33780.3391Nursery0.09010.06550.04330.04540.04270.05460.06680.01050.0084MAGICGammaTelescope0.23870.21710.19920.20730.20570.21350.21660.20450.2049LetterRecognition0.44500.37400.34730.35680.36350.36680.36970.25640.2591Adult0.17770.16680.15790.16520.16610.15520.16000.13970.1416Connect-4Opening0.26460.22500.21070.24210.25840.22240.22030.13040.1427Census-Income(KDD)0.23580.08530.06040.13180.12240.06280.08060.04290.0447Covertype0.34440.31800.30750.33350.34440.32330.31140.23630.236922 Table9:VarianceDatasetNBAODEA2DEPA2DEFA2DETANMAPLMGRF10RF100Contact-lenses0.14190.14720.16750.14150.20980.09720.13790.18730.1647LungCancer0.17940.19210.19630.19670.19670.21570.19160.25960.2218Labornegotiations0.03930.04830.04700.04150.04330.09020.04680.10010.0827PostoperativePatient0.08590.07820.08760.08320.08220.10270.08100.10910.0961Zoo0.03430.03080.03020.03020.03240.04720.03080.04780.0386PromoterGeneSequences0.06930.11190.11800.09050.08970.15620.11400.16610.0909Echocardiogram0.06360.06970.08230.08580.08570.08680.07420.11090.1043Lymphography0.04100.04440.04790.04210.04430.10410.04570.08980.0622IrisClassification0.01120.01100.01190.01310.01520.01710.01160.01260.0127TeachingAssistantEvaluation0.16490.19580.19450.19800.19110.19980.19650.21510.2108Hepatitis0.02370.03270.03740.03330.03380.06220.03340.06050.0464WineRecognition0.01740.01830.02070.01990.01940.03640.01840.03890.0203AutoImports0.13710.13750.13780.13250.13110.12730.13750.15500.1289SonarClassification0.07560.08830.07810.08430.08470.11970.08650.13800.0904GlassIdentification0.04840.04000.04490.04660.04550.05650.04280.06180.0616New-Thyroid0.01920.02580.02690.02470.02420.03380.02560.02980.0287Audiology0.10140.09950.09860.09900.10080.15880.09970.16190.1104Hungarian0.02490.02910.03810.03700.03630.05960.03130.07100.0568HeartDisease(Cleveland)0.02090.02990.03950.03480.03430.06350.03160.08670.0708Haberman'sSurvival0.04110.05690.06390.06460.05950.06450.05050.08530.0808PrimaryTumor0.15940.15820.16070.15760.15890.20010.15800.23610.2029LiverDisorders(Bupa)0.09680.10750.11000.11110.11330.11860.10860.11830.1155Ionosphere0.03260.01170.01680.00960.00970.01980.01310.03780.0177Dermatology0.00950.01110.01120.01060.01020.02880.01110.03640.0179HorseColic0.02450.03690.04360.03350.03320.05870.03820.04990.0313HouseVotes840.00610.00960.01370.00980.01130.02040.01120.01730.0084CylinderBands0.10650.11120.10730.09540.09600.07030.11000.12720.0910Syncon0.00920.01990.01960.01660.01630.02590.01970.05820.0254BalanceScale0.05550.06060.07120.07240.07180.06720.06220.08280.0808CreditScreening0.02560.02700.02810.02780.02720.04510.02750.05780.0441BreastCancer(Wisconsin)0.00160.00950.01270.00760.00730.02410.01050.01380.0076PimaIndiansDiabetes0.02540.03490.05080.05130.04780.05760.03690.07720.0729Vehicle0.08570.09860.10490.09970.10230.12330.10240.14610.1305Annealing0.01170.01740.01690.01820.01790.02060.01620.04520.0383Tic-Tac-ToeEndgame0.04070.05080.05360.05620.05990.07870.05410.08090.0490Vowel0.22680.16370.14510.13980.14440.21530.15280.14240.1183LED0.03670.03710.03970.03980.03920.04650.03760.05510.0509German0.04810.05400.06020.05190.05070.09690.05460.09930.0689ContraceptiveMethodChoice0.07190.10360.13150.11900.12580.15920.11320.19820.1875Volcanoes0.08290.13330.14150.14150.14130.14130.13050.14690.1427CarEvaluation0.05060.04720.04360.04050.04520.03350.04560.05170.0389Segment0.01980.01840.01660.01840.01890.021
70.01780.02690.0245Splice-junctionGeneSequences0.00830.00970.01260.01750.01720.02320.00980.08240.0222King-rook-vs-king-pawn0.01990.02040.02000.01790.01880.01550.01280.01120.0067Hypothyroid(Garavan)0.00080.00060.00070.00060.00070.00160.00050.00760.0063Sick-euthyroid0.01450.00340.00280.00290.00330.00530.00330.00850.0073Abalone0.02340.08370.08660.09170.08590.08640.08390.10520.1032SPAME-mail0.00510.00500.00500.00500.00510.00740.00500.00670.0061Waveform-50000.01320.02920.03300.01660.01610.05870.03280.10710.0475Nettalk(Phoneme)0.10760.11670.12950.11320.12460.16020.09850.08650.0686PageBlocks0.01770.00760.00630.00700.00570.00430.00610.00510.0049OpticalDigits0.00990.00860.00760.00900.00910.01420.00890.04190.0118Mushrooms0.00290.00010.00000.00010.00010.00010.00010.00010.0000PenDigits0.00790.00640.00590.00540.00640.01700.00680.01860.0131Sign0.01130.01000.01600.01230.01890.01360.01740.01930.0176Nursery0.00770.00880.01130.01040.01170.01460.01230.02310.0160MAGICGammaTelescope0.00950.00950.01820.00800.01170.00790.00710.00980.0089LetterRecognition0.05200.04780.04730.04390.04690.07680.04770.10000.0917Adult0.00510.00850.01050.00790.00710.01400.00970.03700.0331Connect-4Opening0.01460.01890.02120.01460.01450.01440.01960.08310.0449Census-Income(KDD)0.00330.01210.01100.00550.00640.00770.01300.01390.0102Covertype0.00290.00280.00380.00310.00240.00350.00450.01490.014123 Table10:0-1LossDatasetNBAODEA2DEPA2DEFA2DETANMAPLMGRF10RF100Contact-lenses0.33170.34420.36420.34330.41000.42170.32920.35830.3375LungCancer0.52060.53440.54060.56690.56940.53310.53440.60500.5956Labornegotiations0.09470.10910.11190.10770.10880.17090.10950.20110.1839PostoperativePatient0.35890.36760.38240.37730.37440.37440.36890.39000.3869Zoo0.07250.06160.06140.06200.06400.09900.06160.07800.0715PromoterGeneSequences0.13850.20210.34790.25280.26340.23700.20740.25620.1485Echocardiogram0.32890.33760.34500.34720.34950.32840.33590.36090.3586Lymphography0.16960.16460.16430.15740.16240.22970.16420.22040.1995IrisClassification0.04200.03960.04150.04370.04450.04410.04040.04320.0435TeachingAssistantEvaluation0.52160.50330.49420.49790.49770.52780.50740.51190.5109Hepatitis0.16050.15880.16040.16090.16050.16300.15780.19120.1791WineRecognition0.06040.05700.06120.05900.06080.07700.05720.08240.0638AutoImports0.39860.31490.30590.30490.30510.32680.31430.29310.2651SonarClassification0.29830.24930.22040.23440.23550.26370.24580.26010.2248GlassIdentification0.26170.25240.25590.25410.25600.27070.25540.26590.2690New-Thyroid0.11900.12020.12290.11800.11690.13650.12040.12790.1285Audiology0.32060.31910.31790.31690.31880.34040.31930.31880.2816Hungarian0.14820.15160.16140.16100.15990.18580.15320.21670.2064HeartDisease(Cleveland)0.16990.17690.18490.18320.18260.20320.17840.23810.2238Haberman'sSurvival0.28270.29770.30520.30710.30140.30510.29230.31950.3178PrimaryTumor0.54050.54010.54170.54030.54050.57650.54120.60790.5853LiverDisorders(Bupa)0.42370.42830.43050.43230.43340.43230.42880.43860.4397Ionosphere0.20840.08060.07820.07290.07330.06740.08030.09250.0762Dermatology0.02030.02310.02350.02190.02180.04180.02300.05840.0374HorseColic0.19790.18970.18930.17810.17800.19980.18980.17820.1656HouseVotes840.09780.05670.05030.05340.05210.06540.05270.04710.0423CylinderBands0.26130.24300.23880.23310.23340.40290.23920.31340.2917Syncon0.11100.07350.06330.08940.09040.05590.06840.09660.0636BalanceScale0.19360.20590.22170.22280.22260.21040.20760.22580.2244CreditScreening0.15230.14670.14730.14820.14820.16320.14630.17140.1606BreastCancer(Wisconsin)0.02710.03710.03900.03370.03290.05090.03770.04450.0390PimaIndian
sDiabetes0.27520.27540.27840.27720.27720.27650.27630.28790.2851Vehicle0.47070.39330.38700.38730.38780.39110.39240.38650.3764Annealing0.10250.08500.08400.08290.08330.08310.08340.14200.1369Tic-Tac-ToeEndgame0.29160.25190.12550.15660.17130.24800.25020.12090.0766Vowel0.55770.30070.26450.25980.26870.38270.27170.24770.2239LED0.26270.26390.26670.26700.26590.27060.26440.28110.2786German0.25810.25500.25730.25420.25420.28090.25590.28190.2666ContraceptiveMethodChoice0.54020.53750.53890.53580.53630.54220.53790.56640.5630Volcanoes0.58630.59630.59920.59920.59950.59700.59620.60070.6004CarEvaluation0.15670.10630.08460.07850.08690.08180.09760.09120.0776Segment0.18030.13940.13280.13610.13670.13210.13340.11940.1178Splice-junctionGeneSequences0.04660.04010.03920.07550.07410.06080.04000.11790.0480King-rook-vs-king-pawn0.12730.09200.07410.09010.10460.07800.05860.01910.0129Hypothyroid(Garavan)0.07510.07440.07440.07460.07490.07610.07450.08150.0804Sick-euthyroid0.08080.06570.06410.06470.06560.06430.06360.06750.0664Abalone0.49710.47620.47690.47770.47620.47700.47600.47960.4801SPAME-mail0.33730.33730.33730.33720.33730.33840.33730.33750.3373Waveform-50000.22030.19840.19550.20200.20400.21550.19570.23830.2006Nettalk(Phoneme)0.30320.26430.28070.24540.26180.34800.23460.19270.1784PageBlocks0.08980.08470.08360.08370.08370.08390.08380.08260.0825OpticalDigits0.09030.03720.03050.05000.05120.05240.03670.06620.0339Mushrooms0.02340.00030.00000.00010.00010.00030.00030.00010.0000PenDigits0.16410.06650.05340.05700.05720.09400.05990.05170.0472Sign0.46430.41010.38210.38080.38670.39950.40060.35710.3567Nursery0.09790.07430.05460.05580.05440.06920.07910.03360.0244MAGICGammaTelescope0.24820.22650.21740.21530.21740.22150.22370.21430.2139LetterRecognition0.49690.42170.39460.40070.41040.44360.41730.35640.3508Adult0.18280.17530.16840.17310.17320.16920.16980.17670.1747Connect-4Opening0.27920.24400.23190.25670.27290.23680.24000.21350.1876Census-Income(KDD)0.23910.09740.07140.13730.12890.07060.09360.05690.0550Covertype0.34730.32080.31130.33660.34680.32680.31590.25120.251024 
Table11:RMSEDatasetNBAODEA2DEPA2DEFA2DETANMAPLMGRF10RF100Contact-lenses0.37820.39070.39720.39240.41090.44620.38370.40980.3941LungCancer0.56230.57050.57360.49660.49910.51690.57060.48750.4736Labornegotiations0.26840.28200.28400.29240.28570.35160.28220.37080.3592PostoperativePatient0.42050.42660.43550.42740.45100.42130.42740.44480.4377Zoo0.12450.11630.11400.11250.11490.14080.11600.13120.1248PromoterGeneSequences0.33420.39870.42650.41590.41680.44550.40190.42370.3946Echocardiogram0.46450.47430.48410.47950.52760.47690.47450.50410.4993Lymphography0.26010.25420.25570.24210.24620.28950.25460.28140.2710IrisClassification0.14500.14150.14410.14720.15260.14720.14240.15640.1559TeachingAssistantEvaluation0.51570.50600.50780.48760.51640.48750.50380.48680.4825Hepatitis0.36540.35480.35610.34270.34920.35430.35380.36420.3516WineRecognition0.17750.17210.17840.17730.17670.19520.17230.21000.1965AutoImports0.30530.27100.26860.24900.25150.27110.27060.24410.2330SonarClassification0.49540.44250.41760.39110.39320.44900.43840.41640.3935GlassIdentification0.36180.35680.35730.35410.38110.36030.35750.36780.3645New-Thyroid0.24270.24400.24630.24440.26620.25920.24380.25350.2518Audiology0.14630.14620.14610.14120.14980.14090.14620.13980.1333Hungarian0.34030.33740.34310.34200.35450.37100.33800.39110.3799HeartDisease(Cleveland)0.37140.36790.37180.36850.38270.38710.36870.41180.3973Haberman'sSurvival0.45520.46440.48010.48140.54220.47030.46260.48870.4853PrimaryTumor0.18100.18010.18010.17920.19350.18130.18020.19340.1890LiverDisorders(Bupa)0.49620.49620.49860.49940.61260.50240.49650.51930.5168Ionosphere0.42600.26730.25910.26060.26110.23690.26620.28220.2672Dermatology0.07250.07550.07670.07970.07860.10090.07540.14170.1306HorseColic0.40670.38810.38770.37400.37740.40570.38850.37530.3667HouseVotes840.29880.20920.19470.20080.20160.22770.20190.19360.1842CylinderBands0.46550.44370.44000.39740.40230.49000.44160.44400.4353Syncon0.18310.13730.12750.13840.13990.12030.13180.17080.1565BalanceScale0.31550.31080.30990.31030.34160.31480.31170.31770.3150CreditScreening0.34140.33580.33650.33260.34630.35720.33460.36280.3518BreastCancer(Wisconsin)0.15970.17770.17800.17780.17150.20020.17830.19150.1834PimaIndiansDiabetes0.43290.43090.43150.42970.47940.43090.43090.45420.4513Vehicle0.43140.35600.35060.34830.36840.35550.35420.36540.3583Annealing0.15350.14290.14110.14170.15140.13730.14160.18630.1843Tic-Tac-ToeEndgame0.43360.40440.31760.34550.33850.41080.40290.31320.2921Vowel0.25560.19450.18380.18550.18710.21780.18690.18320.1747LED0.19860.19930.20030.20070.21930.20330.19940.20990.2088German0.42040.41850.42230.41280.42830.44700.41930.43770.4218ContraceptiveMethodChoice0.46470.45360.45400.45370.51080.45860.45330.50290.4970Volcanoes0.41160.41250.41470.41490.54300.41350.41250.41620.4158CarEvaluation0.22920.20830.19220.19050.17860.18480.19630.18840.1782Segment0.19220.16820.16290.16640.18210.16280.16300.15380.1525Splice-junctionGeneSequences0.15390.14300.14280.19800.19510.17510.14270.28270.2601King-rook-vs-king-pawn0.30490.27190.25410.27120.26040.23920.23540.14490.1265Hypothyroid(Garavan)0.18650.18460.18430.18480.19210.18660.18480.19170.1909Sick-euthyroid0.23850.23020.22660.22810.24410.22640.22470.23170.2305Abalone0.48010.43470.43180.43250.51610.43280.43140.43660.4362SPAME-mail0.45240.45230.45220.45210.58000.45340.45230.45290.4526Waveform-50000.34150.30580.30220.30720.31600.32300.30230.33210.3146Nettalk(Phoneme)0.09530.08840.09090.08830.08860.09860.08440.07600.0731PageBlocks0.16140.15980.15930.15910.17530.15930.15950.15900.1588OpticalDigits0.12270.07670.06990.08530.0
8520.09210.07620.13010.1172
Mushrooms 0.1322 0.0153 0.0064 0.0295 0.0288 0.0165 0.0142 0.0123 0.0086
PenDigits 0.1652 0.1001 0.0895 0.0964 0.0966 0.1186 0.0958 0.0922 0.0886
Sign 0.4369 0.4169 0.4062 0.4050 0.4741 0.4150 0.4114 0.3911 0.3908
Nursery 0.1770 0.1583 0.1424 0.1428 0.1301 0.1425 0.1482 0.1092 0.1009
MAGICGammaTelescope 0.4083 0.3956 0.3917 0.3914 0.4184 0.3949 0.3942 0.3896 0.3893
LetterRecognition 0.1573 0.1453 0.1404 0.1421 0.1517 0.1489 0.1444 0.1359 0.1344
Adult 0.3664 0.3519 0.3442 0.3467 0.3649 0.3422 0.3460 0.3542 0.3516
Connect-4Opening 0.3591 0.3381 0.3303 0.3458 0.3783 0.3322 0.3351 0.3215 0.3057
Census-Income(KDD) 0.4628 0.2731 0.2308 0.3058 0.3085 0.2275 0.2660 0.2098 0.2048
Covertype 0.2604 0.2482 0.2445 0.2545 0.2940 0.2546 0.2460 0.2205 0.2204

Table 12: Per Instance Training Time
Dataset NB AODE A2DEP A2DEF A2DE TAN MAPLMG RF10 RF100
Contact-lenses 0.00571 0.00471 0.00554 0.00521 0.01246 0.00683 0.24567 0.01429 0.04650
LungCancer 0.00409 0.00553 0.71631 0.64828 0.69800 0.02534 0.43744 0.03013 0.25059
Labornegotiations 0.00212 0.00219 0.00349 0.00333 0.00295 0.00447 0.10225 0.01247 0.09011
PostoperativePatient 0.00151 0.00146 0.00166 0.00160 0.00124 0.00220 0.05910 0.00791 0.05682
Zoo 0.00131 0.00129 0.00216 0.00223 0.00196 0.00287 0.09247 0.00739 0.04889
PromoterGeneSequences 0.00192 0.00180 0.18512 0.15537 0.17721 0.01249 0.22453 0.01630 0.14082
Echocardiogram 0.00088 0.00096 0.00104 0.00111 0.00088 0.00157 0.04179 0.00626 0.05076
Lymphography 0.00091 0.00093 0.00205 0.00186 0.00355 0.00228 0.06530 0.00889 0.07238
IrisClassification 0.00102 0.00070 0.00099 0.00083 0.00071 0.00121 0.03442 0.00291 0.01227
TeachingAssistantEvaluation 0.00100 0.00083 0.00125 0.00104 0.00077 0.00427 0.03600 0.00742 0.07446
Hepatitis 0.00100 0.00079 0.00283 0.00174 0.00160 0.00192 0.05664 0.00864 0.07545
WineRecognition 0.00074 0.00070 0.00266 0.00106 0.00097 0.00146 0.04089 0.00509 0.04271
AutoImports 0.00074 0.00082 0.01486 0.00898 0.01964 0.00272 0.12048 0.01619 0.14782
SonarClassification 0.00075 0.00113 0.05417 0.05225 0.06712 0.00514 0.20219 0.01983 0.26942
GlassIdentification 0.00064 0.00053 0.00072 0.00072 0.00062 0.00100 0.02960 0.00514 0.04545
New-Thyroid 0.00057 0.00051 0.00066 0.00062 0.00052 0.00251 0.02459 0.00275 0.01636
Audiology 0.00068 0.00552 0.63403 0.58634 0.55959 0.01365 2.24021 0.04293 0.45907
Hungarian 0.00043 0.00041 0.00074 0.00079 0.00065 0.00121 0.02706 0.00712 0.06659
HeartDisease(Cleveland) 0.00045 0.00041 0.00074 0.00092 0.00071 0.00087 0.02678 0.00631 0.06091
Haberman'sSurvival 0.00041 0.00038 0.00048 0.00041 0.00039 0.00058 0.01670 0.00245 0.01966
PrimaryTumor 0.00039 0.00047 0.00130 0.00183 0.00239 0.00119 0.13380 0.01654 0.16519
LiverDisorders(Bupa) 0.00035 0.00030 0.00041 0.00080 0.00037 0.00059 0.01695 0.00328 0.02885
Ionosphere 0.00040 0.00056 0.00463 0.00556 0.00674 0.00175 0.07723 0.01104 0.10150
Dermatology 0.00040 0.00073 0.02099 0.01990 0.02824 0.00230 0.15539 0.01026 0.09502
HorseColic 0.00037 0.00040 0.00162 0.00161 0.00214 0.00100 0.04104 0.02723 0.25749
HouseVotes84 0.00029 0.00032 0.00086 0.00080 0.00075 0.00070 0.02766 0.00395 0.03675
CylinderBands 0.00030 0.00266 0.52135 0.47019 0.15990 0.01125 0.16161 0.16264 1.39439
Syncon 0.00029 0.00087 0.12126 0.11658 0.11387 0.00397 0.47564 0.01881 0.18113
BalanceScale 0.00023 0.00018 0.00023 0.00021 0.00020 0.00029 0.00980 0.00203 0.01860
CreditScreening 0.00023 0.00022 0.00065 0.00065 0.00063 0.00048 0.02263 0.00878 0.08624
BreastCancer(Wisconsin) 0.00019 0.00019 0.00036 0.00036 0.00084 0.00042 0.01505 0.00252 0.02392
PimaIndiansDiabetes 0.00019 0.00016 0.00024 0.00024 0.00023 0.00030 0.01179 0.00446 0.04201
Vehicle 0.00017 0.00020 0.00093 0.00092 0.00127 0.00052 0.03882 0.01148 0.11048
Annealing 0.00019 0.00036 0.01070 0.01043 0.01415 0.00082 0.18188 0.05706 0.62783
Tic-Tac-ToeEndgame 0.00014 0.00013 0.00024 0.00024 0.00022 0.00025 0.01190 0.00430 0.03970
Vowel 0.00014 0.00016 0.00062 0.00058 0.00071 0.00044 0.04711 0.00907 0.09508
LED 0.00012 0.00011 0.00017 0.00017 0.00018 0.00022 0.01648 0.00380 0.03551
German 0.00017 0.00019 0.00114 0.00113 0.00139 0.00051 0.03143 0.00958 0.09459
ContraceptiveMethodChoice 0.00010 0.00010 0.00020 0.00020 0.00018 0.00029 0.01211 0.00636 0.06713
Volcanoes 0.00009 0.00008 0.00009 0.00008 0.00008 0.00011 0.00551 0.00093 0.00834
CarEvaluation 0.00008 0.00008 0.00011 0.00012 0.00011 0.00014 0.01192 0.00244 0.02308
Segment 0.00008 0.00013 0.00125 0.00094 0.00111 0.00030 0.06186 0.00608 0.06088
Splice-junctionGeneSequences 0.00015 0.00058 0.06743 0.06842 0.06433 0.00200 0.23156 0.02216 0.21648
King-rook-vs-king-pawn 0.00010 0.00024 0.00440 0.00446 0.00463 0.00062 0.10376 0.01554 0.15649
Hypothyroid(Garavan) 0.00009 0.00017 0.00290 0.00265 0.00277 0.00041 0.09119 0.02219 0.22663
Sick-euthyroid 0.00009 0.00016 0.00244 0.00248 0.00254 0.00043 0.05123 0.01783 0.17659
Abalone 0.00005 0.00005 0.00012 0.00012 0.00011 0.00008 0.01016 0.00293 0.02821
SPAME-mail 0.00013 0.00049 0.01767 0.01750 0.01689 0.00141 0.14233 0.11317 1.15056
Waveform-5000 0.00010 0.00028 0.00724 0.00687 0.00710 0.00076 0.11678 0.02298 0.21661
Nettalk(Phoneme) 0.00004 0.00031 0.01817 0.01616 0.01508 0.00169 0.05430 0.01762 0.17862
PageBlocks 0.00004 0.00005 0.00017 0.00017 0.00017 0.00009 0.01504 0.00271 0.02672
OpticalDigits 0.00012 0.00040 0.04919 0.04899 0.04632 0.00126 0.45907 0.02529 0.21022
Mushrooms 0.00006 0.00012 0.00138 0.00148 0.00149 0.00028 0.02641 0.00457 0.04698
PenDigits 0.00004 0.00008 0.00059 0.00059 0.00070 0.00016 0.05931 0.00776 0.07362
Sign 0.00003 0.00003 0.00010 0.00010 0.00010 0.00006 0.01297 0.00360 0.03670
Nursery 0.00003 0.00003 0.00011 0.00010 0.00010 0.00006 0.01743 0.00307 0.02825
MAGICGammaTelescope 0.00003 0.00004 0.00015 0.00015 0.00015 0.00007 0.01042 0.00494 0.04798
LetterRecognition 0.00005 0.00008 0.00067 0.00063 0.00070 0.00017 0.13907 0.01397 0.13481
Adult 0.00004 0.00007 0.00043 0.00040 0.00042 0.00012 0.02277 0.01949 0.19112
Connect-4Opening 0.00010 0.00029 0.00722 0.00742 0.00725 0.00076 0.17049 0.03772 0.37281
Census-Income(KDD) 0.00010 0.00033 0.01304 0.01326 0.01279 0.00078 0.11065 0.06108 -
Covertype 0.00013 0.00044 0.01723 0.01756 0.01707 0.00118 0.44194 0.10864 0.87340

Table 13: Per Instance Classification Time
Dataset NB AODE A2DEP A2DEF A2DE TAN MAPLMG RF10 RF100
Contact-lenses 0.00025 0.00029 0.00071 0.00025 0.00033 0.00033 0.00063 0.00008 0.00021
LungCancer 0.00022 0.00416 0.14209 0.14331 0.14019 0.00041 0.00459 0.00009 0.00025
Labornegotiations 0.00009 0.00026 0.00072 0.00100 0.00072 0.00012 0.00065 0.00004 0.00074
PostoperativePatient 0.00014 0.00017 0.00058 0.00050 0.00050 0.00023 0.00044 0.00012 0.00058
Zoo 0.00018 0.00093 0.00446 0.00466 0.00437 0.00018 0.00166 0.00010 0.00039
PromoterGeneSequences 0.00013 0.00281 0.15011 0.13926 0.14086 0.00025 0.00318 0.00010 0.00034
Echocardiogram 0.00010 0.00015 0.00016 0.00015 0.00016 0.00008 0.00023 0.00009 0.00062
Lymphography 0.00023 0.00076 0.00473 0.00459 0.00416 0.00018 0.00112 0.00016 0.00049
IrisClassification 0.00015 0.00015 0.00023 0.00013 0.00011 0.00009 0.00023 0.00007 0.00029
TeachingAssistantEvaluation 0.00026 0.00013 0.00019 0.00025 0.00014 0.00017 0.00028 0.00015 0.00047
Hepatitis 0.00015 0.00046 0.00554 0.00317 0.00307 0.00012 0.00068 0.00012 0.00057
WineRecognition 0.00010 0.00042 0.00324 0.00184 0.00164 0.00012 0.00058 0.00013 0.00042
AutoImports 0.00019 0.00227 0.04120 0.02235 0.02201 0.00026 0.00296 0.00009 0.00068
SonarClassification 0.00016 0.00378 0.16124 0.16418 0.15268 0.00019 0.00338 0.00008 0.00067
GlassIdentification 0.00008 0.00027 0.00068 0.00068 0.00062 0.00008 0.00037 0.00012 0.00054
New-Thyroid 0.00009 0.00011 0.00017 0.00017 0.00018 0.00010 0.00021 0.00008 0.00036
Audiology 0.00073 0.04319 1.25924 1.32199 1.26469 0.00192 0.04573 0.00020 0.00169
Hungarian 0.00008 0.00022 0.00070 0.00079 0.00067 0.00009 0.00038 0.00006 0.00066
HeartDisease(Cleveland) 0.00006 0.00026 0.00127 0.00142 0.00126 0.00007 0.00040 0.00008 0.00054
Haberman'sSurvival 0.00005 0.00004 0.00006 0.00007 0.00005 0.00005 0.00014 0.00006 0.00030
PrimaryTumor 0.00024 0.00300 0.01350 0.02424 0.01361 0.00045 0.00479 0.00022 0.00217
LiverDisorders(Bupa) 0.00008 0.00012 0.00018 0.00028 0.00017 0.00005 0.00017 0.00009 0.00054
Ionosphere 0.00013 0.00132 0.02340 0.02868 0.02342 0.00012 0.00140 0.00011 0.00058
Dermatology 0.00017 0.00313 0.04434 0.04540 0.04470 0.00026 0.00373 0.00008 0.00052
HorseColic 0.00008 0.00041 0.00276 0.00279 0.00271 0.00011 0.00062 0.00030 0.00399
HouseVotes84 0.00009 0.00039 0.00245 0.00255 0.00243 0.00010 0.00053 0.00007 0.00040
CylinderBands 0.00010 0.00154 0.04088 0.04253 0.04065 0.00015 0.00203 0.00008 0.00076
Syncon 0.00025 0.01067 0.33385 0.34002 0.34039 0.00044 0.01024 0.00010 0.00069
BalanceScale 0.00003 0.00008 0.00008 0.00008 0.00007 0.00004 0.00016 0.00006 0.00042
CreditScreening 0.00006 0.00031 0.00192 0.00198 0.00192 0.00008 0.00046 0.00008 0.00078
BreastCancer(Wisconsin) 0.00005 0.00017 0.00042 0.00045 0.00041 0.00007 0.00028 0.00004 0.00028
PimaIndiansDiabetes 0.00004 0.00015 0.00035 0.00036 0.00032 0.00007 0.00021 0.00007 0.00067
Vehicle 0.00009 0.00085 0.00577 0.00607 0.00581 0.00011 0.00105 0.00009 0.00122
Annealing 0.00014 0.00109 0.00335 0.00349 0.00337 0.00016 0.00237 0.00050 0.01097
Tic-Tac-ToeEndgame 0.00005 0.00016 0.00045 0.00048 0.00045 0.00005 0.00026 0.00006 0.00059
Vowel 0.00013 0.00120 0.00484 0.00505 0.00482 0.00019 0.00175 0.00011 0.00144
LED 0.00009 0.00037 0.00058 0.00064 0.00057 0.00012 0.00077 0.00010 0.00076
German 0.00005 0.00049 0.00463 0.00476 0.00468 0.00007 0.00063 0.00007 0.00078
ContraceptiveMethodChoice 0.00004 0.00018 0.00053 0.00057 0.00053 0.00005 0.00036 0.00008 0.00134
Volcanoes 0.00004 0.00007 0.00005 0.00006 0.00005 0.00004 0.00018 0.00004 0.00035
CarEvaluation 0.00004 0.00014 0.00020 0.00021 0.00019 0.00005 0.00030 0.00005 0.00051
Segment 0.00010 0.00149 0.01394 0.01096 0.01072 0.00017 0.00178 0.00009 0.00069
Splice-junctionGeneSequences 0.00011 0.00464 0.20874 0.21927 0.21233 0.00024 0.00489 0.00008 0.00128
King-rook-vs-king-pawn 0.00006 0.00120 0.02491 0.02568 0.02505 0.00011 0.00145 0.00009 0.00106
Hypothyroid(Garavan) 0.00008 0.00153 0.01972 0.01805 0.01806 0.00014 0.00185 0.00016 0.00280
Sick-euthyroid 0.00005 0.00080 0.01135 0.01198 0.01144 0.00009 0.00098 0.00012 0.00128
Abalone 0.00003 0.00017 0.00042 0.00044 0.00042 0.00005 0.00029 0.00007 0.00064
SPAME-mail 0.00010 0.00343 0.11847 0.12060 0.11844 0.00016 0.00300 0.00031 0.00310
Waveform-5000 0.00009 0.00263 0.05807 0.05722 0.05632 0.00015 0.00248 0.00011 0.00293
Nettalk(Phoneme) 0.00030 0.00220 0.00313 0.00337 0.00297 0.00061 0.00626 0.00024 0.00239
PageBlocks 0.00005 0.00038 0.00118 0.00125 0.00118 0.00008 0.00057 0.00006 0.00053
OpticalDigits 0.00030 0.01246 0.28798 0.29286 0.28091 0.00055 0.01034 0.00015 0.00249
Mushrooms 0.00004 0.00054 0.00597 0.00601 0.00581 0.00007 0.00071 0.00004 0.00034
PenDigits 0.00012 0.00164 0.00876 0.00923 0.00881 0.00020 0.00204 0.00012 0.00240
Sign 0.00003 0.00016 0.00042 0.00044 0.00042 0.00004 0.00028 0.00007 0.00084
Nursery 0.00004 0.00024 0.00054 0.00058 0.00055 0.00006 0.00045 0.00007 0.00121
MAGICGammaTelescope 0.00003 0.00016 0.00061 0.00063 0.00060 0.00004 0.00025 0.00008 0.00094
LetterRecognition 0.00029 0.00417 0.02229 0.02370 0.02235 0.00048 0.00496 0.00032 0.00595
Adult 0.00004 0.00027 0.00169 0.00167 0.00160 0.00005 0.00037 0.00042 0.00803
Connect-4Opening 0.00011 0.00289 0.06617 0.06875 0.06519 0.00017 0.00256 0.00039 0.00682
Census-Income(KDD) 0.00009 0.00177 0.05312 0.05313 0.05193 0.00013 0.00180 0.00060 -
Covertype 0.00022 0.00926 0.24854 0.24937 0.24217 0.00045 0.00889 0.00040 0.01561

[5] Flikka, K., Martens, L., Vandekerckhove, J., Gevaert, K., Eidhammer, I.: Improving the reliability and throughput of mass spectrometry-based proteomics by spectrum quality filtering. Proteomics 6(7) (2006) 2086–2094
[6] Orhan, Z., Altan, Z.: Impact of feature selection for corpus-based WSD in Turkish. In: Proceedings of the Fifth Mexican International Conference on Artificial Intelligence, Springer Berlin/Heidelberg (2006) 868–878
[7] Lasko, T.A., Atlas, S.J., Barry, M.J., Chueh, K.H.C.: Automated identification of a physician's primary patients. Journal of the American Medical Informatics Association 13(1) (2006) 74–79
[8] Hunt, K.: Evaluation of Novel Algorithms to Optimize Risk Stratification Scores in Myocardial Infarction. PhD thesis, Department of Informatics, University of Zurich (2006)
[9] Ferrari, L.D., Aitken, S.: Mining housekeeping genes with a naive Bayes classifier. BMC Genomics 7(1) (2006) 277
[10] Birzele, F., Kramer, S.: A new representation for protein secondary structure prediction based on frequent patterns. Bioinformatics 22(21) (2006) 2628–2634
[11] Kuncheva, L.I., Vilas, V.J.D.R., Rodríguez, J.J.: Diagnosing scrapie in sheep: A classification experiment. Computers in Biology and Medicine 37(8) (2007) 1194–1202
[12] Lau, Q.P., Hsu, W., Lee, M.L., Mao, Y., Chen, L.: Prediction of cerebral aneurysm rupture. In: Proceedings of the Nineteenth IEEE International Conference on Tools with Artificial Intelligence, Washington, DC, USA, IEEE Computer Society (2007) 350–357
[13] Masegosa, A., Joho, H., Jose, J.: Evaluating query-independent object features for relevancy prediction. Advances in Information Retrieval (2007) 283–294
[14] Wang, H., Klinginsmith, J., Dong, X., Lee, A., Guha, R., Wu, Y., Crippen, G., Wild, D.: Chemical data mining of the NCI human tumor cell line database. Journal of Chemical Information and Modeling 47(6) (2007) 2063–2076
[15] Eduardo, A., Iakes, E., Beatriz, G., Alfonso, V., David, J.: EcID. A database for the inference of functional interactions in E. coli. Nucleic Acids Research (2008)
[16] Garcia, B., Aler, R., Ledezma, A., Sanchis, A.: Protein-protein functional association prediction using genetic programming. In: Proceedings of the Tenth Annual Conference on Genetic and Evolutionary Computation, New York, NY, USA, ACM (2008) 347–348
[17] Tian, Y., Chen, C., Zhang, C.: AODE for source code metrics for improved software maintainability. In: Fourth International Conference on Semantics, Knowledge and Grid (2008) 330–335
[18] Kurz, D., Bernstein, A., Hunt, K., Radovanovic, D., Erne, P., Siudak, Z., Bertel, O.: Simple point-of-care risk stratification in acute coronary syndromes: the AMIS model. British Medical Journal 95(8) (2009) 662
[19] Leon, A., et al.: EcID. A database for the inference of functional interactions in E. coli. Nucleic Acids Research 37(Database issue) (2009) D629
[20] Shahri, S., Jamil, H.: An Extendable Meta-learning Algorithm for Ontology Mapping. Flexible Query Answering Systems (2009) 418–430
[21] Simpson, M., Demner-Fushman, D., Sneiderman, C., Antani, S., Thoma, G.: Using non-lexical features to identify effective indexing terms for biomedical illustrations. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, Association for Computational Linguistics (2009) 737–744
[22] Affendey, L., Paris, I., Mustapha, N., Sulaiman, M., Muda, Z.: Ranking of Influencing Factors in Predicting Students' Academic Performance. Information Technology Journal 9(4) (2010) 832–837
[23] García-Jiménez, B., Juan, D., Ezkurdia, I., Andrés-León, E., Valencia, A.: Inference of Functional Relations in Predicted Protein Networks with a Machine Learning Approach. PLoS ONE (4) (2010) e9969
[24] Hopfgartner, F., Urruty, T., Lopez, P., Villa, R., Jose, J.: Simulated evaluation of faceted browsing based on feature selection. Multimedia Tools and Applications 47(3) (2010) 631–662
[25] Liew, C., Ma, X., Yap, C.: Consensus model for identification of novel PI3K inhibitors in large chemical library. Journal of Computer-Aided Molecular Design 24(2) (2010) 131–141
[26] Ting, K.M., Wells, J.R., Tan, S.C., Teng, S.W., Webb, G.I.: Feature-subspace aggregating: Ensembles for stable and unstable learners. Machine Learning (in press)
[27] Cerquides, J., Mantaras, R.L.D.: Robust Bayesian linear classifier ensembles. In: Proceedings of the Sixteenth European Conference on Machine Learning (2005) 70–81
[28] Jiang, L., Zhang, H.: Weightily averaged one-dependence estimators. PRICAI 2006: Trends in Artificial Intelligence (2006) 970–974
[29] Yang, Y., Webb, G.I., Cerquides, J., Korb, K.B., Boughton, J., Ting, K.M.: To select or to weigh: A comparative study of linear combination schemes for superparent-one-dependence estimators. IEEE Transactions on Knowledge and Data Engineering 19(12) (2007) 1652–1665
[30] Sahami, M.: Learning limited dependence Bayesian classifiers. In: Proceedings of the Second International Conference on Knowledge Discovery in Databases, Menlo Park, CA: AAAI Press (1996) 334–338
[31] Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Machine Learning 29(2) (1997) 131–163
[32] Breiman, L.: Random forests. Machine Learning 45 (2001) 5–32
[33] Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann (2005)
[34] Langley, P., Sage, S.: Induction of selective Bayesian classifiers. In: Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann (1994) 399–406
[35] Pazzani, M.J.: Constructive induction of Cartesian product attributes. ISIS: Information, Statistics and Induction in Science (1996) 66–77
[36] Domingos, P., Pazzani, M.J.: Beyond independence: Conditions for the optimality of the simple Bayesian classifier. In: Proceedings of the Thirteenth International Conference on Machine Learning, Morgan Kaufmann (1996) 105–112
[37] Zheng, Z., Webb, G.I.: Lazy learning of Bayesian rules. Machine Learning 41(1) (2000) 53–84
[38] Yang, Y., Webb, G., Cerquides, J., Korb, K., Boughton, J., Ting, K.M.: To select or to weigh: A comparative study of model selection and model weighing for SPODE ensembles. In: Proceedings of the Seventeenth European Conference on Machine Learning, Springer (2006) 533–544
[39] Webb, G.I.: MultiBoosting: A technique for combining boosting and wagging. Machine Learning 40(2) (2000) 159–196
[40] Flores, M., Gámez, J., Martínez, A., Puerta, J.: GAODE and HAODE: Two proposals based on AODE to deal with continuous variables. In: Proceedings of the 26th Annual International Conference on Machine Learning, ACM (2009) 313–320
[41] Fayyad, U.M., Irani, K.B.: Multi-interval discretization of continuous-valued attributes for classification learning. In: Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, Morgan Kaufmann (1993) 1022–1029
[42] Cestnik, B.: Estimating probabilities: A crucial task in machine learning. In: Proceedings of the Ninth European Conference on Artificial Intelligence, London: Pitman (1990) 147–149
[43] Zheng, F., Webb, G.I.: Efficient lazy elimination for averaged one-dependence estimators. In: Proceedings of the Twenty-third International Conference on Machine Learning, ACM Press (2006) 1113–1120
[44] Zheng, F., Webb, G.I.: Finding the right family: Parent and child selection for averaged one-dependence estimators. In: Proceedings of the Eighteenth European Conference on Machine Learning, Springer Berlin/Heidelberg (2007) 490–501