Bagging Predictors, by Leo Breiman. Technical Report No. 421 (September 1994).


Bagging Predictors

Leo Breiman
Technical Report No. 421, September 1994*
Department of Statistics
University of California
Berkeley, California 94720

Abstract

Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.

1. Introduction

A learning set L consists of data {(y_n, x_n), n = 1, ..., N} where the y's are either class labels or a numerical response. We have a procedure for using this learning set to form a predictor φ(x, L): if the input is x, we predict y by φ(x, L). Now, suppose we are given a sequence of learning sets {L_k}, each consisting of N independent observations from the same underlying distribution as L. Our mission is to use the {L_k} to get a better predictor than the single learning set predictor φ(x, L). The restriction is that all we are allowed to work with is the sequence of predictors {φ(x, L_k)}.

If y is numerical, an obvious procedure is to replace φ(x, L) by the average of the φ(x, L_k) over k, that is, by φ_A(x) = E_L φ(x, L), where E_L denotes the expectation over L and the subscript A in φ_A denotes aggregation.

* Partially supported by NSF grant DMS-9212419.
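The idealized aggregated predictor φ_A can be sketched in a few lines of code. This is an illustrative toy, not from the paper: the underlying distribution, the least-squares-through-the-origin base predictor, and all names are assumptions made for the sketch.

```python
import random

def draw_learning_set(n, rng):
    # Independent observations from one underlying distribution:
    # here y = 2x + noise, a stand-in for the unknown truth.
    xs = [rng.uniform(0, 1) for _ in range(n)]
    return [(2 * x + rng.gauss(0, 0.5), x) for x in xs]

def train(learning_set):
    # phi(x, L): least-squares slope through the origin.
    num = sum(y * x for y, x in learning_set)
    den = sum(x * x for y, x in learning_set)
    return lambda x, a=num / den: a * x

def aggregate(learning_sets):
    # phi_A(x): average the predictions over replicate learning sets.
    predictors = [train(L) for L in learning_sets]
    return lambda x: sum(p(x) for p in predictors) / len(predictors)

rng = random.Random(0)
phi_A = aggregate([draw_learning_set(10, rng) for _ in range(50)])
```

With replicate learning sets available, φ_A averages away the variability of the individual fits; the rest of the paper replaces the unavailable replicates with bootstrap samples.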
If φ(x, L) predicts a class j ∈ {1, ..., J}, then one method of aggregating the φ(x, L_k) is by voting. Let N_j = #{k; φ(x, L_k) = j} and take φ_A(x) = argmax_j N_j.

Usually, though, we have a single learning set L without the luxury of replicates of L. Still, an imitation of the process leading to φ_A can be done. Take repeated bootstrap samples {L^(B)} from L and form the predictors {φ(x, L^(B))}. If y is numerical, take φ_B(x) as the average of the φ(x, L^(B)); if y is a class label, let the φ(x, L^(B)) vote to form φ_B(x). We call this procedure "bootstrap aggregating" and use the acronym bagging. The L^(B) form replicate data sets, each consisting of N cases drawn at random, but with replacement, from L. Each (y_n, x_n) may appear repeated times or not at all in any particular L^(B). The L^(B) are replicate data sets drawn from the bootstrap distribution approximating the distribution underlying L. For background on bootstrapping, see Efron and Tibshirani [1993].

A critical factor in whether bagging will improve accuracy is the stability of the procedure for constructing φ. If changes in L, i.e. a replicate L^(B), produce small changes in φ, then φ_B will be close to φ(x, L). Improvement will occur for unstable procedures, where a small change in L can result in large changes in φ. Instability was studied in Breiman [1994], where it was pointed out that neural nets, classification and regression trees, and subset selection in linear regression are unstable, while k-nearest neighbor methods are stable.

For unstable procedures bagging works well. In Section 2 we bag classification trees on a variety of real and simulated data sets; the reduction in test set misclassification rates ranges from 20% to 47%. In Section 3 regression trees are bagged, with reductions in test set mean squared error ranging from 22% to 46%. Section 4 goes over some theoretical justification for bagging and attempts to understand when it will or will not work well. This is illustrated by the results of Section 5 on subset selection in linear regression using simulated data. Section 6 gives concluding remarks; these discuss how many bootstrap replications are useful, bagging nearest neighbor classifiers, and bagging class probability estimates.

The evidence, both experimental and theoretical, is that bagging can push a good but unstable procedure a significant step towards optimality. On the other hand, it can slightly degrade the performance of stable procedures.
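The bagging procedure itself is a short loop around any training routine. A minimal sketch (the helper names and the 1-nearest-neighbor base predictor are mine, not the paper's; nearest neighbors are in fact a stable procedure, as Section 6 notes, and are used here only because they fit in one line):

```python
import random
from collections import Counter

def bootstrap_sample(learning_set, rng):
    # L_B: N cases drawn at random, with replacement, from L.
    n = len(learning_set)
    return [learning_set[rng.randrange(n)] for _ in range(n)]

def bag(train, learning_set, n_boot, numeric, seed=0):
    # Bootstrap aggregating: one predictor per bootstrap replicate, then
    # average (numerical y) or plurality vote (class-label y).
    rng = random.Random(seed)
    preds = [train(bootstrap_sample(learning_set, rng)) for _ in range(n_boot)]
    if numeric:
        return lambda x: sum(p(x) for p in preds) / len(preds)
    return lambda x: Counter(p(x) for p in preds).most_common(1)[0][0]

# Toy base procedure: 1-nearest-neighbor on a single feature,
# with cases stored as (label, feature) pairs.
nn = lambda L: (lambda x, L=tuple(L): min(L, key=lambda p: abs(p[1] - x))[0])
clf = bag(nn, [(0, 0.0), (0, 0.2), (1, 0.8), (1, 1.0)], n_boot=50, numeric=False)
```

Note that the bagged predictor never touches the base procedure's internals: all that is added is the loop in front that selects the bootstrap sample and the back end that aggregates.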
There has been recent work in the literature with some of the flavor of bagging. In particular, there has been some work on averaging and voting over multiple trees. Buntine [1991] gave a Bayesian approach, Kwok and Carter [1990] used voting over multiple trees generated by using alternative splits, and Heath et al. [1993] used voting over multiple trees generated by oblique splits. Dietterich and Bakiri [1991] showed that a method for coding many-class problems into a large number of two-class problems increases accuracy. There is some commonality of this idea with bagging.

2. Bagging Classification Trees

Bagging was applied to classification trees using the following data sets: waveform (simulated), heart, breast cancer (Wisconsin), ionosphere, diabetes, glass, and soybean. All of these except the heart data are in the UCI repository (ftp ics.uci.edu, /machine-learning-databases). The data are briefly described in Section 2.2. Testing was done using random divisions of each data set into a learning set L and a test set T, constructing the usual tree classifier using the learning set, and bagging this tree using 50 bootstrap replicates. This was repeated 100 times for each data set (specifics are given in Section 2.3). The average test set misclassification rate using a single tree is denoted by ē_S and the bagging rate by ē_B. The results are:

Table 1: Misclassification Rates (%)  [numeric entries did not survive transcription]

Data Set        ē_S    ē_B    Decrease
waveform
heart
breast cancer
ionosphere
diabetes
glass
soybean

For the waveform data it is known that the minimal attainable rate (the Bayes rate) is 14.0%. Using this as a base, the excess error drops from 15.0% to [value not recovered].

2.2 Data Sets

Table 2 gives a summary of the data sets and the test set sizes used.

Table 2: Data Set Summary  [numeric entries did not survive transcription]

Data Set        # Samples    # Classes    Test Set
waveform
heart
breast cancer
ionosphere
diabetes
glass
soybean

The figures in parentheses are for the original data sets. These were modified, for reasons described below, to give the as-used numbers. In all but the simulated waveform data, the data set was randomly divided into a test set and a learning set. So, for instance, in the glass data, the size of the learning set in each iteration was 194 = 214 - 20. For the simulated waveform data, a learning set of 300 and a test set of 1500 were generated for each iteration. Brief descriptions of the data sets follow. More extended background is available in the UCI repository.
Waveform: This is simulated 21-variable data with 300 cases and 3 classes, each having probability 1/3. It is described in Breiman et al. [1984] (a C subroutine for generating the data is in the UCI repository subdirectory /waveform).

Heart: This is data from the study referred to in the opening paragraphs of the CART book (Breiman et al. [1984]). To quote: "At the University of California, San Diego Medical Center, when a heart attack patient is admitted, 19 variables are measured during the first 24 hours. These include blood pressure, age, and 17 other ordered and binary variables summarizing the medical symptoms considered as important indicators of the patient's condition. The goal of a recent medical study (see Chapter 6) was the development of a method to identify high risk patients (those who will not survive at least 30 days) on the basis of the initial 24-hour data."

The data base has also been studied in Olshen et al. [1985]. It was gathered on a project (SCOR) headed by John Ross Jr.; Elizabeth Gilpin and Richard Olshen were instrumental in my obtaining the data. The data used had 18 variables: two variables with high proportions of missing data were deleted, together with a few other cases that had missing values. This left 779 complete cases, 77 deaths and 702 survivors. To equalize class sizes, each case of death was replicated 9 times, giving 693 deaths for a total of 1395 cases.

Breast Cancer: This is data given to the UCI repository by William H. Wolberg, University of Wisconsin Hospitals, Madison (see Wolberg and Mangasarian [1990]). It is two-class data with 699 cases (458 benign and 241 malignant). It has 9 variables consisting of cellular characteristics. (subdirectory /breast-cancer-wisconsin)

Ionosphere: This is radar data gathered by the Space Physics Group at Johns Hopkins University (see Sigillito et al. [1989]). There are 351 cases with 34 variables, consisting of 2 attributes for each of 17 pulse numbers. There are two classes: good = some type of structure in the ionosphere (226); bad = no structure (125). (subdirectory /ionosphere)
Diabetes: This is a database gathered among the Pima Indians by the National Institute of Diabetes and Digestive and Kidney Diseases (see Smith et al. [1988]). The database consists of 768 cases, 8 variables and two classes. The variables are medical measurements on the patient plus age and pregnancy information. The classes are: tested positive for diabetes (268) or negative (500). To equalize class sizes, the diabetes cases were duplicated, giving a total sample size of 1036. (subdirectory /pima-indians-diabetes)

Glass: This database was created in the Central Research Establishment, Home Office Forensic Science Service, Aldermaston, Reading, Berkshire. Each case consists of 9 chemical measurements on one of 6 types of glass. There are 214 cases. (subdirectory /glass)

Soybean: The soybean learning set consists of 307 cases, 35 variables and 19 classes. The classes are various types of soybean diseases. The variables are observations on the plants together with some climatic variables. All are categorical. Some missing values were filled in. (subdirectory /soybean)

2.3 Computations

In all runs, the following procedure was used:

i) The data set was randomly divided into a test set T and a learning set L. The test set sizes selected in the real data sets are ad hoc, mostly chosen so that the learning set would be reasonably large. In simulated data, the test set size was chosen comfortably large.

ii) A classification tree was constructed from L, with selection done by 10-fold cross-validation. Running the test set T down this tree gives the misclassification rate e_S(L, T).

iii) A bootstrap sample L_B is selected from L, and a tree grown using L_B and 10-fold cross-validation. This is repeated 50 times, giving the tree classifiers φ_1(x), ..., φ_50(x).

iv) If (j_n, x_n) ∈ T, then the estimated class of x_n is that class having the plurality in φ_1(x_n), ..., φ_50(x_n). The proportion of times the estimated class differs from the true class is the bagging misclassification rate e_B(L, T).

v) The random division of the data is repeated 100 times and the reported ē_S, ē_B are the averages over the 100 iterations.
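Steps i) through iv) compress to a short loop. A sketch, with a 1-nearest-neighbor stand-in for the cross-validated CART tree and a toy one-feature data layout; all names are illustrative assumptions:

```python
import random
from collections import Counter

def nn_train(L):
    # Stand-in for the tree classifier of step ii); L holds (label, x) pairs.
    return lambda x: min(L, key=lambda p: abs(p[1] - x))[0]

def error_rate(classify, T):
    return sum(classify(x) != y for y, x in T) / len(T)

def one_iteration(data, rng, n_boot=50):
    # i) random division into test set T and learning set L
    data = data[:]
    rng.shuffle(data)
    cut = len(data) // 5
    T, L = data[:cut], data[cut:]
    # ii) single classifier and its test set error e_S
    e_S = error_rate(nn_train(L), T)
    # iii) n_boot classifiers grown on bootstrap samples of L
    boots = [nn_train([L[rng.randrange(len(L))] for _ in L])
             for _ in range(n_boot)]
    # iv) plurality vote over the bootstrap classifiers gives e_B
    vote = lambda x: Counter(b(x) for b in boots).most_common(1)[0][0]
    return e_S, error_rate(vote, T)

# v) would repeat one_iteration over 100 random divisions and average.
```

This mirrors the reported quantities: averaging the two returned errors over the repeated random divisions yields ē_S and ē_B.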
3. Bagging Regression Trees

Bagging trees was used on 5 data sets with numerical responses: Boston Housing, Ozone, Friedman #1, Friedman #2, and Friedman #3. The computing scheme was similar to that used in classification. Learning and test sets were randomly selected, 25 bootstrap replications were used, and 100 iterations were carried out. The results are:

Table 3: Mean Squared Test Set Error  [numeric entries did not survive transcription]

Data Set          ē_S    ē_B    Decrease
Boston Housing
Ozone
Friedman #1
Friedman #2
Friedman #3

Data Sets

Table 4: Summary of Data Sets  [numeric entries did not survive transcription]

Data Set          # Samples    Test Set
Boston Housing
Ozone
Friedman #1
Friedman #2
Friedman #3

Boston Housing: This data became well known through its use in the book by Belsley, Kuh, and Welsch [1980]. It has 506 cases corresponding to census tracts in the greater Boston area. The y variable is median housing price in the tract. There are 12 predictor variables, mainly socio-economic. The data has since been used in many studies. (UCI repository /housing)

Ozone: The ozone data consists of 366 readings of maximum daily ozone at a hot spot in the Los Angeles basin and 9 predictor variables, all meteorological, i.e. temperature, humidity, etc. It is described in Breiman and Friedman [1985] and has also been used in many subsequent studies. Eliminating one variable with many missing values and a few other cases leaves a data set with 330 complete cases and 8 variables.

Friedman #1: All three Friedman data sets are simulated data that appear in the MARS paper (Friedman [1991]). In the first data set, there are ten independent predictor variables x_1, ..., x_10, each of which is uniformly distributed over [0, 1]. The response is given by

y = 10 sin(π x_1 x_2) + 20 (x_3 - 1/2)² + 10 x_4 + 5 x_5 + ε,

where ε is N(0, 1). Friedman gives results for this model for sample sizes 50, 100, 200. We use sample size 200.
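For concreteness, a Friedman #1 generator is easy to reproduce: ten independent U[0,1] predictors, with y = 10 sin(π x1 x2) + 20 (x3 - 1/2)² + 10 x4 + 5 x5 + ε. This is a sketch; the function name and the (y, x) tuple layout are mine:

```python
import math
import random

def friedman1(n, rng):
    # Ten independent U[0,1] predictors; only the first five enter the
    # response, the remaining five are pure noise variables. eps ~ N(0,1).
    data = []
    for _ in range(n):
        x = [rng.uniform(0, 1) for _ in range(10)]
        y = (10 * math.sin(math.pi * x[0] * x[1])
             + 20 * (x[2] - 0.5) ** 2
             + 10 * x[3] + 5 * x[4]
             + rng.gauss(0, 1))
        data.append((y, x))
    return data
```

A learning set of 200 cases, the sample size used here, is `friedman1(200, random.Random(0))`.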
Friedman #2, #3: These two examples are taken to simulate the impedance and phase shift in an alternating current circuit. They are 4-variable data with

y = (x_1² + (x_2 x_3 - 1/(x_2 x_4))²)^(1/2) + ε   (#2)
y = tan⁻¹( (x_2 x_3 - 1/(x_2 x_4)) / x_1 ) + ε   (#3)

where x_1, x_2, x_3, x_4 are uniformly distributed over the ranges given in Friedman [1991]. The noise ε is distributed N(0, σ²) with σ selected to give 3:1 signal/noise ratios. In each example, the sample size is 200.

The two real data sets were divided at random into a learning set and a test set. For each of the simulated data sets, a learning set of 200 cases and a test set of 1000 cases were generated. A regression tree was grown using L and 10-fold cross-validation. The test set was run down this tree, giving the mean-squared-error e_S(L, T). Then 25 bootstrap replicates L_B were generated. For each one, a regression tree was grown using L_B and 10-fold cross-validation, giving the 25 predictors φ_1(x), ..., φ_25(x). For each (y_n, x_n) ∈ T, the predicted value ŷ_n was taken as the average of the φ_k(x_n). Then e_B(L, T) is the mean-squared-error between the ŷ_n and the true values y_n in T. This procedure was repeated 100 times and the errors averaged to give the single tree error ē_S and the bagged error ē_B.

4. Why Bagging Works

Numerical Prediction: Let each (y, x) case in L be independently drawn from the probability distribution P. Suppose y is numerical and φ(x, L) is the predictor. Then the aggregated predictor is φ_A(x) = E_L φ(x, L). Take Y, X to be random variables having the distribution P and independent of L. The average prediction error of φ(x, L) is

e = E_L E_{Y,X} (Y - φ(X, L))².

Define the error in the aggregated predictor φ_A to be

e_A = E_{Y,X} (Y - φ_A(X))².

Using the inequality (EZ)² ≤ EZ² gives

e = E Y² - 2 E[Y φ_A(X)] + E_X E_L φ²(X, L) ≥ E_{Y,X} (Y - φ_A(X))² = e_A.

Thus φ_A has lower mean-squared prediction error than φ. How much lower depends on how unequal the two sides of

[E_L φ(x, L)]² ≤ E_L φ²(x, L)

are. The effect of instability is clear. If φ(x, L) does not change too much with replicate L, the two sides will be nearly equal and aggregation will not help. The more highly variable the φ(x, L) are, the more improvement aggregation may produce. But φ_A always improves on φ.

Now, the bagged estimate is not φ_A(x) but rather φ_B(x) = E_{L_B} φ(x, L_B), where L_B is drawn from the distribution P_L that concentrates mass 1/N at each point (y_n, x_n) ∈ L; P_L is called the bootstrap approximation to P. Then φ_B is caught in two currents: on the one hand, if the procedure is unstable, it can give improvement through aggregation. On the other side, if the procedure is stable, then φ_B will not be as accurate for data drawn from P as φ(x, L). There is a cross-over point between instability and stability at which φ_B stops improving on φ(x, L) and does worse. This has a vivid illustration in the linear regression subset selection example in the next section.

There is another obvious limitation of bagging. For some data sets, it may happen that φ(x, L) is close to the limits of accuracy attainable on that data. Then no amount of bagging will do much improving. This is also illustrated in the next section.
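The inequality e ≥ e_A, and the role of the variance of φ(x, L) over learning sets, can be checked numerically with a deliberately unstable toy predictor. Everything below is illustrative, not from the paper:

```python
import random

rng = random.Random(0)

def phi(L):
    # Unstable toy predictor: the mean of a ten-case learning set,
    # so it varies noticeably from one L to the next.
    return sum(L) / len(L)

def draw_L():
    # Learning sets of 10 independent draws; Y itself is N(0, 1).
    return [rng.gauss(0, 1) for _ in range(10)]

preds = [phi(draw_L()) for _ in range(200)]   # phi(x, L) over many L
phi_A = sum(preds) / len(preds)               # aggregated predictor

ys = [rng.gauss(0, 1) for _ in range(2000)]   # independent test draws of Y
e = sum((y - p) ** 2 for y in ys for p in preds) / (len(ys) * len(preds))
e_A = sum((y - phi_A) ** 2 for y in ys) / len(ys)
# e exceeds e_A by exactly the sample variance of the predictions over
# learning sets (about 1/10 here), mirroring (E Z)^2 <= E Z^2.
```

Shrinking the variability of φ (say, learning sets of 1000 cases instead of 10) shrinks the gap, which is the stability effect described above.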
Classification: In classification, a predictor φ(x, L) predicts a class label j ∈ {1, ..., J}. If (Y, X) is drawn from the distribution P, with Y, X independent of L, then for fixed L the probability of correct classification is

r(L) = Σ_j P(φ(X, L) = j | Y = j) P(Y = j).

Then, averaged over L, the probability of correct classification is

r = Σ_j ∫ Q(j | x) P(j | x) P_X(dx),

where Q(j | x) = P_L(φ(x, L) = j) is the probability, over draws of the learning set, that φ classifies x as j. The aggregated predictor is φ_A(x) = argmax_j Q(j | x), and I(·) below is the indicator function. Consider the set

C = {x : argmax_j Q(j | x) = argmax_j P(j | x)},

the set of inputs at which aggregation picks the same class as the optimal rule, so that the correct classification rate of φ_A is

r_A = ∫_C max_j P(j | x) P_X(dx) + ∫_{C′} [Σ_j I(φ_A(x) = j) P(j | x)] P_X(dx).

The highest attainable correct classification rate is given by the predictor φ*(x) = argmax_j P(j | x), which has the correct classification rate

r* = ∫ max_j P(j | x) P_X(dx).

Even at an x where aggregation gets the class right, the sum Σ_j Q(j | x) P(j | x) can be less than max_j P(j | x). Then if P_X(C) is near 1, the aggregated predictor is nearly optimal, while the unaggregated predictor, whose rate r involves that sum, can be far from optimal. Aggregating can therefore transform good predictors into nearly optimal ones. On the other hand, unlike the numerical-response situation, poor predictors can be transformed into worse ones. The same behavior regarding stability holds: bagging unstable classifiers usually improves them; bagging stable classifiers is not a good idea.

5. A Linear Regression Illustration

Forward Variable Selection: Subset selection in linear regression gives an illustration of the points made in the previous section. With data of the form L = {(y_n, x_n), n = 1, ..., N}, where x = (x_1, ..., x_M) consists of M predictor variables, a popular prediction method consists of forming predictors φ_1(x), ..., φ_M(x), where φ_m is linear in x and depends on only m of the M variables. Then one of the φ_m is chosen as the designated predictor. For more background, see Breiman and Spector [1992].

A common method for constructing the φ_m, and the one used in our simulation, is forward variable entry. If the variables used in φ_m are x_{m1}, ..., x_{mm}, then for each m′ ∉ {m1, ..., mm} form the linear regression of y on (x_{m1}, ..., x_{mm}, x_{m′}), compute its residual sum-of-squares RSS(m′), and take the x_{m″} such that m″ minimizes RSS(m′); φ_{m+1}(x) is then the linear regression based on (x_{m1}, ..., x_{mm}, x_{m″}).

There are other forms of variable selection, i.e. best subsets, backwards elimination, and variants thereof. What is clear about all of them is that they are unstable procedures (see Breiman [1994]). The variables are competing for inclusion in the φ_m, and small changes in the data can cause large changes in the φ_m.
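Forward variable entry is straightforward to sketch. The helper names are mine, and least squares is solved through the normal equations with Gaussian elimination so the sketch stays dependency-free:

```python
import random

def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting for A beta = b.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[c][c] != 0:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def rss(rows, y, cols):
    # Residual sum-of-squares of the least-squares fit of y
    # on the selected columns.
    X = [[row[c] for c in cols] for row in rows]
    k, n = len(cols), len(rows)
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
         for a in range(k)]
    v = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(A, v)
    return sum((y[i] - sum(beta[a] * X[i][a] for a in range(k))) ** 2
               for i in range(n))

def forward_entry(rows, y, m_max):
    # At each step, add the variable whose inclusion minimizes RSS,
    # producing the nested variable subsets behind phi_1, ..., phi_m_max.
    chosen, subsets = [], []
    for _ in range(m_max):
        best = min((c for c in range(len(rows[0])) if c not in chosen),
                   key=lambda c: rss(rows, y, chosen + [c]))
        chosen.append(best)
        subsets.append(chosen[:])
    return subsets
```

Rerunning `forward_entry` on a bootstrap replicate of the rows typically changes the chosen subsets, which is exactly the instability that bagging exploits here.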
Simulation Structure: The simulated data used in this section are drawn from the model

y = Σ_m β_m x_m + ε,

where ε is N(0, 1). The number of variables is M = 30 and the sample size is 60. The {x_m} are drawn from a mean-zero joint normal distribution with a correlation structure governed by a parameter ρ; at each iteration, ρ is selected from a uniform distribution on [0, 1].

It is known that subset selection is nearly optimal if there are only a few large non-zero β_m, and that its performance is poor if there are many small but non-zero β_m. To bridge the spectrum, three sets of coefficients are used. Each set of coefficients consists of three clusters, one centered at m = 5, one at m = 15, and the other at m = 25. Each cluster is of the form

β_m = γ (h - |m - c|)²,  |m - c| < h,

where c is the cluster center and h = 1, 3, 5 for the first, second and third set of coefficients respectively. The normalizing constant γ is taken so that the R² for the data is 75%. Thus, for h = 1 there are only three non-zero β_m; for h = 3 there are 15 non-zero β_m; and for h = 5 there are 27, all relatively small.

For each set of coefficients, the following procedure was replicated 250 times:

i) Data L = {(y_n, x_n), n = 1, ..., 60} was drawn from the model y = Σ_m β_m x_m + ε, where the {x_m} were drawn from the joint normal distribution described above.

ii) Forward entry of variables was done using L to get the predictors φ_1(x), ..., φ_M(x). The mean-squared prediction error of each of these was computed, giving e_1, ..., e_M.

iii) Fifty bootstrap replicates {L_B} were generated. For each of these, forward stepwise regression was applied to construct the predictors φ_1(x, L_B), ..., φ_M(x, L_B). These were averaged over the L_B to give the bagged sequence φ_1^B(x), ..., φ_M^B(x), and the prediction errors e_1^B, ..., e_M^B for this sequence were computed.

These computed mean-squared-errors were averaged over the 250 repetitions to give two sequences {ē_m^S} and {ē_m^B}. For each set of coefficients, these two sequences are plotted against m in Figures 1a, 1b, 1c.

Discussion of Simulation Results: First and most obvious is that the best bagged predictor is always at least as good as the best subset predictor. When h = 1 and subset selection is nearly optimal, there is no improvement. For h = 3 and 5 there is substantial improvement. This illustrates the obvious: bagging can improve only if the unbagged predictor is not optimal.

The second point is less obvious. Note that in all three graphs there is a point past which the bagged predictors have larger prediction error than the unbagged ones. The explanation is this: linear regression using all variables is a fairly stable procedure. The stability decreases as the number of variables used in the predictor decreases. As noted in Section 4, for a stable procedure φ_B is not as accurate as φ(x, L). The higher values of ē_m^B for large m reflect this fact. As m decreases, the instability increases, and there is a cross-over point at which φ_m^B becomes more accurate than φ_m.
6. Concluding Remarks

Bagging Class Probability Estimates: Some classification methods estimate probabilities p̂(j | x) that an object with prediction vector x belongs to class j. Then the class corresponding to x is estimated as argmax_j p̂(j | x). For such methods, a natural competitor to bagging by voting is to average the p̂(j | x) over all bootstrap replications, getting p̂_B(j | x), and then use the estimated class argmax_j p̂_B(j | x). This estimate was computed in every classification example we worked on. The resulting misclassification rate was always virtually identical to the voting misclassification rate.

In some applications, estimates of class probabilities are required instead of, or along with, the classifications. The evidence so far indicates that bagged estimates are likely to be more accurate than the single estimates. To verify this, it would be necessary to compare both estimates with the true values p(j | x) over the x in the test set. For real data the true values are unknown, but they can be computed for the simulated waveform data, where they reduce to computing an expression involving error functions. Using the waveform data, we did a simulation similar to that in Section 2, with learning and test sets both of size 300 and 25 bootstrap replications. In each iteration, we computed the average over the test set and classes of |p̂(j | x) - p(j | x)|. This was repeated 50 times and the results averaged. The single tree estimates had an error of .189; the error of the bagged estimates was .124, a decrease of 34%.

How Many Bootstrap Replicates Are Enough? In our experiments, 50 bootstrap replicates were used for classification and 25 for regression. This does not mean that 50 or 25 are necessary or sufficient, but simply that they seemed reasonable. My sense of it is that fewer are required when y is numerical and more are required with an increasing number of classes.

The answer is not too important when procedures like CART are used, because running times, even for a large number of bootstraps, are very nominal. But neural nets progress much slower, and replications may require many days of computing. Still, bagging is almost a dream procedure for parallel computing: the construction of a predictor on each L_B proceeds with no communication necessary from the other CPUs.

To give some idea of how the results vary with the number of bootstrap replicates, we ran the waveform data using 10, 25, 50 and 100 replicates, using the same simulation scheme as in Section 2. The results are:

Table 5: Bagged Misclassification Rates (%)  [numeric entries did not survive transcription]

No. Bootstrap Replicates    Misclassification Rate
10
25
50
100

The unbagged rate is 29.0%, so it is clear that we are getting most of the improvement using only 10 bootstrap replicates. More than 25 bootstrap replicates is love's labour lost.
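The stability argument for nearest neighbor classifiers below turns on the probability that a given learning-set case is missing from at least half of the K bootstrap samples, i.e. P(Binomial(K, .632) < K/2). It can be evaluated exactly (a sketch; the function name is mine):

```python
from math import comb, exp

def p_change_possible(K, p=1 - exp(-1)):
    # Probability that a given learning-set case appears in fewer than
    # half of K bootstrap samples; p ~ .632 is the chance of appearing
    # in any one bootstrap sample (for large N).
    return sum(comb(K, h) * p ** h * (1 - p) ** (K - h)
               for h in range(0, (K + 1) // 2))
```

For K = 100 replicates this probability is below one percent, consistent with the claim that bagging leaves nearest neighbor classifications essentially unchanged.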
Bagging Nearest Neighbor Classifiers: Nearest neighbor classifiers were run on all the data sets described in Section 2 except for the soybean data, whose variables are categorical. The same random division into learning and test sets was used, with 100 bootstrap replicates and 100 iterations in each run. A Euclidean metric was used, with each coordinate standardized by dividing by its standard deviation over the learning set. See Table 6 for the results:

Table 6: Misclassification Rates for Nearest Neighbor (%)  [numeric entries did not survive transcription]

Data Set        ē_S    ē_B
waveform
heart
breast cancer
ionosphere
diabetes
glass

Nearest neighbor is more accurate than single trees in 5 of the 6 data sets, but bagged trees are more accurate in 5 of the 6 data sets. Cycles did not have to be expended to find that bagging nearest neighbors does not change things. Some simple computations show why. With N possible outcomes of a trial (the N cases (y_n, x_n) in the learning set) and N trials, the probability that the nth outcome is selected 0, 1, 2, ... times is approximately Poisson distributed with λ = 1 for large N. The probability that the nth outcome will occur at least once is 1 - 1/e ≈ .632.

If there are K bootstrap repetitions in a 2-class problem, then a test case may change classification only if its nearest neighbor in the learning set is not in the bootstrap sample in at least half of the K replications. This probability is given by the probability that the number of heads in K tosses of a coin with probability .632 of heads is less than K/2. As K gets larger, this probability gets very small. Analogous results hold for J-class problems. The stability of nearest neighbor classification methods with respect to perturbations of the data distinguishes them from competitors such as trees and neural nets.

Bagging goes a ways toward making a silk purse out of a sow's ear, especially if the sow's ear is twitchy. It is a relatively easy way to improve an existing method, since all that needs adding is a loop in front that selects the bootstrap sample and sends it to the procedure, and a back end that does the aggregation. What one loses, with the trees, is a simple and interpretable structure. What one gains is increased accuracy.

References

Belsley, D., Kuh, E., and Welsch, R. (1980) "Regression Diagnostics", John Wiley and Sons.

Breiman, L. (1994) Heuristics of instability in model selection, Technical Report, Statistics Department, University of California at Berkeley.

Breiman, L., Friedman, J., Olshen, R., and Stone, C. (1984) "Classification and Regression Trees", Wadsworth.
Breiman, L. and Friedman, J. (1985) Estimating optimal transformations in multiple regression and correlation (with discussion), Journal of the American Statistical Association, 580-619.

Breiman, L. and Spector, P. (1992) Submodel selection and evaluation in regression - the X-random case, International Statistical Review, 291-.

Buntine, W. (1991) "Learning classification trees", Artificial Intelligence Frontiers in Statistics, ed. D. J. Hand, Chapman and Hall, London, 182-201.

Dietterich, T. G. and Bakiri, G. (1991) Error-correcting output codes: A general method for improving multiclass inductive learning programs, Proceedings of the Ninth National Conference on Artificial Intelligence (AAAI-91), Anaheim, CA: AAAI Press.

Efron, B. and Tibshirani, R. (1993) "An Introduction to the Bootstrap", Chapman and Hall.

Friedman, J. (1991) Multivariate adaptive regression splines (with discussion), Annals of Statistics, 1-141.

Heath, D., Kasif, S., and Salzberg, S. (1993) k-dt: a multi-tree learning method, Proceedings of the Second International Workshop on Multistrategy Learning, 1002-1007, Chambery, France, Morgan Kaufman.

Kwok, S. and Carter, C. (1990) Multiple decision trees, Uncertainty in Artificial Intelligence 4, ed. Shachter, R., Levitt, T., Kanal, L., and Lemmer, J., North-Holland, 327-335.

Olshen, R., Gilpin, A., Henning, H., LeWinter, M., Collins, D., and Ross, J. (1985) Twelve-month prognosis following myocardial infarction: Classification trees, logistic regression, and stepwise linear discrimination, Proceedings of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, L. Le Cam and R. Olshen (eds.), Wadsworth, 245-267.

Smith, J., Everhart, J., Dickson, W., Knowler, W., and Johannes, R. (1988) Using the ADAP learning algorithm to forecast the onset of diabetes mellitus, in Proceedings of the Symposium on Computer Applications and Medical Care, 261-265, IEEE Computer Society Press.

Sigillito, V. G., Wing, S. P., Hutton, L. V., and Baker, K. B. (1989) Classification of radar returns from the ionosphere using neural networks, Johns Hopkins APL Technical Digest, 262-266.

Wolberg, W. and Mangasarian, O. (1990) Multisurface method of pattern separation for medical diagnosis applied to breast cytology, Proceedings of the National Academy of Sciences, U.S.A., Volume 87, December 1990.