Bagging Predictors

Leo Breiman

Technical Report No. 421, September 1994
Partially supported by NSF grant DMS-9212419
Department of Statistics, University of California, Berkeley, California 94720
Abstract

Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.

1. Introduction

A learning set L consists of data {(y_n, x_n), n = 1, ..., N} where the y_n are either class labels or a numerical response. We have a procedure for using this learning set to form a predictor phi(x, L): if the input is x, we predict y by phi(x, L). Now, suppose we are given a sequence of learning sets {L_k}, each consisting of N independent observations from the same underlying distribution as L. Our mission is to use the {L_k} to get a better predictor than the single-learning-set predictor phi(x, L). The restriction is that all we are allowed to work with is the sequence of predictors {phi(x, L_k)}.

If y is numerical, an obvious procedure is to replace phi(x, L) by the average of the phi(x, L_k) over k, i.e. by

phi_A(x) = E_L phi(x, L),

where E_L denotes the expectation over L, and the subscript A denotes aggregation. If phi(x, L)
predicts a class j in {1, ..., J}, then one method of aggregating the phi(x, L_k) is by voting. Let N_j = #{k : phi(x, L_k) = j} and take phi_A(x) = argmax_j N_j.

Usually, though, we have a single learning set L without the luxury of replicates of L. Still, an imitation of the process leading to phi_A can be done. Take repeated bootstrap samples {L^(B)} from L and form the predictors {phi(x, L^(B))}. If y is numerical, take phi_B(x) to be the average of the phi(x, L^(B)); if y is a class label, let the phi(x, L^(B)) vote to form phi_B(x). We call this procedure "bootstrap aggregating" and use the acronym bagging.

The {L^(B)} form replicate data sets, each consisting of N cases, drawn at random, but with replacement, from L. Each (y_n, x_n) may appear repeated times or not at all in any particular L^(B). The {L^(B)} are replicate data sets drawn from the bootstrap distribution approximating the distribution underlying L. For background on bootstrapping, see Efron and Tibshirani [1993].

A critical factor in whether bagging will improve accuracy is the stability of the procedure for constructing phi. If changes in L, i.e. a replicate L, produce small changes in phi, then phi_B will be close to phi. Improvement will occur for unstable procedures, where a small change in L can result in large changes in phi. Instability was studied in Breiman [1994], where it was pointed out that neural nets, classification and regression trees, and subset selection in linear regression are unstable, while k-nearest-neighbor methods are stable.

For unstable procedures bagging works well. In Section 2 we bag classification trees on a variety of real and simulated data sets. The reduction in test set misclassification rates ranges from 20% to 47%. In Section 3 regression trees are bagged, with reductions in test set mean squared error on data sets ranging from 22% to 46%. Section 4 goes over some theoretical justification for bagging and attempts to understand when it will or will not work well. This is illustrated by the results of Section 5 on subset selection in linear regression using simulated data. Section 6 gives concluding remarks. These discuss how many bootstrap replications are useful, bagging nearest neighbor classifiers, and bagging class probability estimates.

The evidence, both experimental and theoretical, is that bagging can push a good but unstable procedure a significant step towards optimality. On the other hand, it can slightly degrade the performance of stable procedures.
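The procedure described above (bootstrap replicates in, an average or a vote out) is easy to put into code. The following is a minimal sketch in plain Python, not the software used in this report: the base predictor is a one-split regression tree (a "stump"), chosen only because tree procedures are among the unstable ones named above, and the toy data, sample sizes, and helper names are all assumptions of the sketch.

```python
import random

random.seed(0)

# Toy problem: y = x^2 + noise on x in [0, 1].
def make_set(n):
    return [(x, x * x + random.gauss(0, 0.1))
            for x in (random.random() for _ in range(n))]

def fit_stump(data):
    """One-split regression tree: predict the mean response on each side."""
    xs = sorted({x for x, _ in data})
    if len(xs) < 2:                       # degenerate bootstrap sample
        m = sum(y for _, y in data) / len(data)
        return lambda x: m
    best = None
    for s in xs[1:]:
        left = [y for x, y in data if x < s]
        right = [y for x, y in data if x >= s]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - ml) ** 2 for y in left)
               + sum((y - mr) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, s, ml, mr)
    _, s, ml, mr = best
    return lambda x: ml if x < s else mr

L, test = make_set(30), make_set(200)
B = 25
# One predictor per bootstrap replicate of L (N draws with replacement).
preds = [fit_stump(random.choices(L, k=len(L))) for _ in range(B)]

def mse(f):
    return sum((y - f(x)) ** 2 for x, y in test) / len(test)

phi_B = lambda x: sum(p(x) for p in preds) / B   # aggregate by averaging
e_single = sum(mse(p) for p in preds) / B        # average single-replicate error
e_bagged = mse(phi_B)
assert e_bagged <= e_single
```

On any fixed test set, the squared error of the averaged predictor can never exceed the average of the individual squared errors (convexity of squared loss), which is the inequality made precise in Section 4; the simulation only makes the size of the gap visible.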
There has been recent work in the literature with some of the flavor of bagging. In particular, there has been some work on averaging and voting over multiple trees. Buntine [1991] gave a Bayesian approach, Kwok and Carter [1990] used voting over multiple trees generated by using alternative splits, and Heath et al. [1993] used voting over multiple trees generated by alternative oblique splits. Dietterich and Bakiri [1991] showed that a method for coding many-class problems into a large number of two-class problems increases accuracy. There is some commonality of this idea with bagging.

2. Bagging Classification Trees

Bagging was applied to classification trees using the following data sets:

waveform (simulated)
heart
breast cancer (Wisconsin)
ionosphere
diabetes
glass
soybean

All of these except the heart data are in the UCI repository (ftp: ics.uci.edu, directory /machine-learning-databases). The data are briefly described in Section 2.2.

Testing was done using random divisions of each data set into a learning set L and a test set T, constructing the usual tree classifier using the learning set, and bagging this tree using 50 bootstrap replicates. This was repeated 100 times for each data set (specifics are given in Section 2.3). The average test set misclassification rate using a single tree is denoted by e_S, and the bagging rate by e_B. The results are given in Table 1.

Table 1: Misclassification Rates (%): the single-tree rate e_S and the bagged rate e_B for each of the seven data sets.

For the waveform data it is known that the minimal attainable rate (the Bayes rate) is 14.0%. Using this as a base, the excess error drops from 15.0% for the single tree to a markedly smaller value for the bagged classifier.

2.2 Data Sets

Table 2 gives a summary of the data sets and the test set sizes used.

Table 2: Data Set Summary: number of samples, number of classes, number of variables, and test set size for each data set.

The figures in parentheses in Table 2 are for the original data sets. These were modified, for reasons described below, to give the as-used numbers. In all but the simulated waveform data, the data set was randomly divided into a test set and a learning set. So, for instance, in the glass data, the size of the learning set in each iteration was 194 = 214 - 20. For the simulated waveform data, a learning set of 300 and a test set of 1500 were generated for each iteration.

Brief descriptions of the data sets follow. More extended background is available in the UCI repository.
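The testing loop just described (bootstrap sample, grow a classifier, plurality vote) can be sketched generically. This is a hypothetical illustration in plain Python: the base learner is a trivial nearest-centroid rule rather than a cross-validated CART tree, and the helper names, the toy data, and the vote count are assumptions of the sketch, not the report's code.

```python
import random
from collections import Counter

random.seed(1)

def fit_centroid(data):
    """Toy base classifier: predict the class with the nearest class centroid."""
    groups = {}
    for x, y in data:
        groups.setdefault(y, []).append(x)
    cent = {y: sum(v) / len(v) for y, v in groups.items()}
    return lambda x: min(cent, key=lambda y: abs(cent[y] - x))

def bag_classify(L, fit, x, B=50):
    """Grow one classifier per bootstrap replicate of L; classify x by vote."""
    votes = Counter()
    for _ in range(B):
        L_B = random.choices(L, k=len(L))   # N draws with replacement
        votes[fit(L_B)(x)] += 1
    return votes.most_common(1)[0][0]

# Two well-separated classes on the line.
L = ([(random.gauss(0, 1), "a") for _ in range(20)]
     + [(random.gauss(10, 1), "b") for _ in range(20)])
print(bag_classify(L, fit_centroid, x=-0.3))
```

The same wrapper works for any base learner `fit` that maps a learning set to a predictor, which is the sense in which bagging only adds "a loop in front" of an existing procedure.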
Waveform. This is simulated 21-variable data with 300 cases and 3 classes, each having probability 1/3. It is described in Breiman et al. [1984] (a C subroutine for generating the data is in a UCI repository subdirectory).

Heart. This is data from the study referred to in the opening paragraphs of the CART book (Breiman et al. [1984]). To quote:

"At the University of California, San Diego Medical Center, when a heart attack patient is admitted, 19 variables are measured during the first 24 hours. These include blood pressure, age, and 17 other ordered and binary variables summarizing the medical symptoms considered as important indicators of the patient's condition. The goal of a recent medical study (see Chapter 6) was the development of a method to identify high risk patients (those who will not survive at least 30 days) on the basis of the initial 24-hour data."

The database has also been studied in Olshen et al. [1985]. It was gathered on a project (SCOR) headed by John Ross Jr.; Elizabeth Gilpin and Richard Olshen were instrumental in my obtaining the data. The data used had 18 variables. Two variables with high proportions of missing data were deleted, together with a few other cases that had missing values. This left 779 complete cases: 77 deaths and 702 survivors. To equalize class sizes, each case of death was replicated 9 times, giving 693 deaths for a total of 1395 cases.

Breast Cancer. This is data given to the UCI repository by William H. Wolberg, University of Wisconsin Hospitals, Madison (see Wolberg and Mangasarian [1990]). It is two-class data with 699 cases (458 benign and 241 malignant). It has 9 variables consisting of cellular characteristics. (subdirectory /breast-cancer-wisconsin)

Ionosphere. This is radar data gathered by the Space Physics Group at Johns Hopkins University (see Sigillito et al. [1989]). There are 351 cases with 34 variables, consisting of 2 attributes for each of 17 pulse numbers. There are two classes: good = some type of structure in the ionosphere (226); bad = no structure (125). (subdirectory /ionosphere)
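The class-size equalization used for the heart data (and again for the diabetes data below) is plain replication of the minority-class cases. A sketch with hypothetical case records; `equalize` is a made-up helper, not part of the study's software.

```python
from collections import Counter

def equalize(cases, minority_label, k):
    """Replace each minority-class case with k copies, leaving the rest alone,
    so the two class sizes come out roughly equal."""
    out = []
    for case in cases:
        out.extend([case] * (k if case[1] == minority_label else 1))
    return out

# Stand-in for the heart data: 77 deaths and 702 survivors.
cases = [("case", "death")] * 77 + [("case", "survived")] * 702
balanced = equalize(cases, "death", 9)
counts = Counter(label for _, label in balanced)
# 77 deaths replicated 9 times gives 693 deaths, 1395 cases in all.
assert counts["death"] == 693 and len(balanced) == 1395
```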
Diabetes. This is a database gathered among the Pima Indians by the National Institute of Diabetes and Digestive and Kidney Diseases (see Smith et al. [1988]). The database consists of 768 cases, 8 variables and two classes. The variables are medical measurements on the patient plus age and pregnancy information. The classes are: tested positive for diabetes (268) or negative (500). To equalize class sizes, the diabetes cases were duplicated, giving a total sample size of 1036. (subdirectory /pima-indians-diabetes)

Glass. This database was created in the Central Research Establishment, Home Office Forensic Science Service, Aldermaston, Reading, Berkshire. Each case consists of 9 chemical measurements on one of 6 types of glass. There are 214 cases. (subdirectory /glass)

Soybean. The soybean learning set consists of 307 cases, 35 variables and 19 classes. The classes are various types of soybean diseases. The variables are observations on the plants together with some climatic variables. All are categorical. Some missing values were filled in. (subdirectory /soybean)

2.3 The Computing Procedure

In all runs, the following procedure was used:

i) The data set was randomly divided into a test set T and a learning set L. The test set sizes selected in the real data sets are ad hoc, mostly chosen so that the learning sets would be reasonably large. In simulated data, the test set size was chosen comfortably large.

ii) A classification tree was constructed from L, with selection done by 10-fold cross-validation. Running the test set T down this tree gives the misclassification rate e_S(L, T).

iii) A bootstrap sample L_B is selected from L, and a tree grown using L_B and 10-fold cross-validation. This is repeated 50 times, giving tree classifiers phi_1(x), ..., phi_50(x).

iv) If (j_n, x_n) is in T, then the estimated class of x_n is that class having the plurality in phi_1(x_n), ..., phi_50(x_n). The proportion of times the estimated class
differs from the true class is the bagging misclassification rate e_B(L, T).

v) The random division of the data is repeated 100 times and the reported rates are the averages over the 100 iterations.

3. Bagging Regression Trees

Bagging trees was used on 5 data sets with numerical responses:

Boston Housing
Ozone
Friedman #1
Friedman #2
Friedman #3

The computing scheme was similar to that used in classification. Learning and test sets were randomly selected, 25 bootstrap replications used, and 100 iterations. The results are given in Table 3.

Table 3: Mean Squared Test Set Error: the single-tree error e_S and the bagged error e_B for each of the five data sets.

Data Sets

Table 4: Summary of Data Sets: sample size, number of predictor variables, and test set size for each data set.

Boston Housing. This data became well known through its use in the book by Belsley, Kuh, and Welsch [1980]. It has 506 cases corresponding to census tracts in the greater Boston area. The y variable is median housing price in the tract. There are 12 predictor variables, mainly socio-economic. The data has since been used in many studies. (UCI repository /housing)

Ozone. The ozone data consists of 366 readings of maximum daily ozone at a hot spot in the Los Angeles basin and 9 predictor variables, all meteorological, i.e. temperature, humidity, etc. It is described in Breiman and Friedman [1985] and has also been used in many subsequent studies. Eliminating one variable with many missing values and a few other cases leaves a data set with 330 complete cases and 8 variables.

Friedman #1. All three Friedman data sets are simulated data that appear in the MARS paper (Friedman [1991]). In the first data set, there are ten independent predictor variables x_1, ..., x_10, each of which is uniformly distributed over [0, 1]. The response is given by

y = 10 sin(pi x_1 x_2) + 20 (x_3 - .5)^2 + 10 x_4 + 5 x_5 + epsilon,

where epsilon is N(0, 1). Friedman gives results for this model for sample sizes 50, 100, 200. We use sample size 200.

Friedman #2, #3. These two examples are taken to simulate the impedance and phase shift in an alternating current circuit. They are 4-variable data with

y = (x_1^2 + (x_2 x_3 - 1/(x_2 x_4))^2)^(1/2) + epsilon    (Friedman #2)
y = arctan((x_2 x_3 - 1/(x_2 x_4)) / x_1) + epsilon        (Friedman #3)

where x_1, x_2, x_3, x_4 are uniformly distributed over the ranges 0 <= x_1 <= 100, 40 pi <= x_2 <= 560 pi, 0 <= x_3 <= 1, 1 <= x_4 <= 11.
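Because the Friedman data sets are fully specified by their formulas, they are easy to regenerate. A sketch of a generator for Friedman #1 in Python; the function name and the seed are assumptions of the sketch, and the N(0, 1) noise follows the standard Friedman #1 specification.

```python
import math
import random

random.seed(2)

def friedman1(n):
    """Simulate Friedman #1: ten U[0,1] predictors, of which only the
    first five enter the response; the noise epsilon is N(0, 1)."""
    data = []
    for _ in range(n):
        x = [random.random() for _ in range(10)]
        y = (10 * math.sin(math.pi * x[0] * x[1])
             + 20 * (x[2] - 0.5) ** 2
             + 10 * x[3] + 5 * x[4]
             + random.gauss(0, 1))
        data.append((x, y))
    return data

learning_set = friedman1(200)   # the sample size used in this section
test_set = friedman1(1000)
assert len(learning_set) == 200 and all(len(x) == 10 for x, _ in learning_set)
```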
The noise epsilon is distributed as N(0, sigma^2), with sigma selected to give 3:1 signal/noise ratios. In each example, the sample sizes are 200.

The two real data sets were divided at random into a learning set and a test set. For each of the simulated data sets, a learning set of 200 cases and a test set of 1000 cases were generated. A regression tree was grown using the learning set L and 10-fold cross-validation. The test set was run down this tree to give the mean squared error e_S(L, T). Then 25 bootstrap replicates L_B were generated. For each one, a regression tree was grown using L_B and 10-fold cross-validation, giving the 25 predictors phi_1(x), ..., phi_25(x). For each x_n in T, the predicted value y-hat_n was taken as the average of phi_1(x_n), ..., phi_25(x_n). Then e_B(L, T) is the mean squared error between the y-hat_n and the true values of y in T. This procedure was repeated 100 times and the errors averaged to give the single tree error e_S and the bagged error e_B.

4. Why Bagging Works

Let each (y, x) case in L be independently drawn from the probability distribution P. Suppose y is numerical and phi(x, L) is the predictor. Then the aggregated predictor is phi_A(x) = E_L phi(x, L). Take Y, X to be random variables having the distribution P and independent of L. The average prediction error e in phi(x, L) is

e = E_L E_{Y,X} (Y - phi(X, L))^2.

Define the error in the aggregated predictor to be

e_A = E_{Y,X} (Y - phi_A(X))^2.

Using the inequality (EZ)^2 <= EZ^2 gives

e = EY^2 - 2 E[Y phi_A(X)] + E_X E_L phi^2(X, L)
  >= EY^2 - 2 E[Y phi_A(X)] + E_X phi_A^2(X) = e_A.

Thus phi_A has lower mean squared prediction error than phi. How much lower depends on how unequal the two sides of

[E_L phi(x, L)]^2 <= E_L phi^2(x, L)

are. The effect of instability is clear. If phi(x, L) does not change too much with replicate L, the two sides will be nearly equal, and aggregation will not help. The more highly variable the phi(x, L) are, the more improvement aggregation may produce. But phi_A always improves on phi.

Now the bagged estimate is not phi_A(x), but rather

phi_B(x) = E_{L_B} phi(x, L_B),

where the L_B are drawn from P_L, the distribution that concentrates mass 1/N at each point (y_n, x_n) in L; P_L is called the bootstrap approximation to P. Then phi_B is caught in two currents: on the one hand, if the procedure is unstable, it can give improvement through aggregation. On the other side, if the procedure is stable, then phi_B = phi(x, P_L) will not be as accurate for data drawn from P as phi(x, L). There is a cross-over point between instability and stability at which phi_B stops improving on phi(x, L) and does worse. This has a vivid illustration in the linear regression subset selection example in the next section.

There is another obvious limitation of bagging. For some data sets, it may happen that phi(x, L) is close to the limits of accuracy attainable on that data. Then no amount of bagging will do much improving. This is also illustrated in the next section.

In classification, a predictor phi(x, L) predicts a class label j in {1, ..., J}. Denote

Q(j|x) = P(phi(x, L) = j),

the over-L probability that phi predicts class j at x. If (Y, X) is drawn from the distribution P, independent of L, then the probability of correct classification for fixed x is sum_j Q(j|x) P(j|x), where P(j|x) = P(Y = j | X = x). Then, averaged over x, the probability of correct classification is

r = Integral [ sum_j Q(j|x) P(j|x) ] P_X(dx).

The aggregated predictor is phi_A(x) = argmax_j Q(j|x), where argmax selects the maximizing class. Consider the set

C = {x : argmax_j Q(j|x) = argmax_j P(j|x)},

the set of x at which phi is order-correct. At any x in C the aggregated predictor attains the maximal possible probability max_j P(j|x) of correct classification, so that the correct classification rate of phi_A is

r_A = Integral_C max_j P(j|x) P_X(dx) + Integral_{C'} [ sum_j I(phi_A(x) = j) P(j|x) ] P_X(dx),

where I(.) is the indicator function and C' the complement of C. The highest attainable correct classification rate is given by the Bayes predictor phi*(x) = argmax_j P(j|x) and has the correct classification rate

r* = Integral max_j P(j|x) P_X(dx).

Even at x in C, the sum sum_j Q(j|x) P(j|x) can be less than max_j P(j|x). Then, if P_X(C) is close to 1, phi_A is nearly optimal while the unaggregated predictor can be far from optimal. Aggregating can therefore transform good predictors into nearly optimal ones. On the other hand, unlike the numerical-y situation, poor predictors can be transformed into worse ones. The same behavior regarding stability holds. Bagging unstable classifiers usually improves them; bagging stable classifiers is not a good idea.

5. A Linear Regression Illustration

Forward Variable Selection

Subset selection in linear regression gives an illustration of the points made in the previous section. With data of the form L = {(y_n, x_n), n = 1, ..., N}, where x = (x_1, ..., x_M) consists of M predictor variables, a popular prediction method consists of forming predictors phi_1(x), ..., phi_M(x), where phi_m is linear in x and depends on only m of the M variables. Then one of the phi_m is chosen as the designated predictor. For more background, see Breiman and Spector [1992].

A common method for constructing the phi_m, and one that is used in our simulation, is forward variable entry. If the variables used in phi_m are x_{m_1}, ..., x_{m_m}, then for each j not among the m_1, ..., m_m, form the linear regression of y on (x_{m_1}, ..., x_{m_m}, x_j), compute the residual sum-of-squares RSS(j), and take the x_j that minimizes RSS(j). Then phi_{m+1} is the linear regression based on the m+1 variables so selected.

There are other forms of variable selection, i.e. best subsets, backwards elimination, and variants thereof. What is clear about all of them is that they are unstable procedures (see Breiman [1994]). The variables are competing for inclusion in the phi_m, and small changes in the data can cause large changes in the phi_m.

Simulation Structure

The simulated data used in this section are drawn from the model

y = sum_m beta_m x_m + epsilon,

where epsilon is N(0, 1). The number of variables is M = 30 and the sample size is 60. The x_m are drawn from a mean-zero joint normal distribution whose covariance depends on a correlation parameter rho, and at each iteration rho is selected from a uniform distribution on [0, 1].

It is known that subset selection is nearly optimal if there are only a few large non-zero beta_m, and that its performance is poor if there are many small but non-zero beta_m. To bridge the spectrum, three sets of coefficients are used. Each set of coefficients consists of three clusters; one is centered at m = 5, one at m = 15, and the other at m = 25. Each cluster is of the form
beta_{k+j} = c (h - |j|)^2, |j| < h, where k is the cluster center, and h = 1, 3, 5 for the first, second and third set of coefficients respectively. The normalizing constant c is taken so that the R^2 for the data is .75. Thus, for h = 1 there are only three non-zero beta_m; for h = 3 there are 15 non-zero beta_m; and for h = 5 there are 27, all relatively small.

For each set of coefficients, the following procedure was replicated 250 times:

i) Data L = {(y_n, x_n), n = 1, ..., 60} was drawn from the model, where the x were drawn from the joint normal distribution described above.

ii) Forward entry of variables was done using L to get the predictors phi_1(x), ..., phi_M(x). The mean-squared prediction error of each of these was computed, giving e_1, ..., e_M.

iii) Fifty bootstrap replicates {L_B} were generated. For each of these, forward stepwise regression was applied to construct predictors {phi_1(x, L_B), ..., phi_M(x, L_B)}. These were averaged over the L_B to give the bagged sequence phi_1^B(x), ..., phi_M^B(x). The prediction errors e_1^B, ..., e_M^B for this sequence were computed.

These computed mean-squared-errors were averaged over the 250 repetitions to give two sequences {e_m^S} and {e_m^B}. For each set of coefficients, these two sequences are plotted vs. m in Figure 1a, b, c.

Discussion of Simulation Results

First and most obvious is that the best bagged predictor is always at least as good as the best subset predictor. When h = 1 and subset selection is nearly optimal, there is no improvement. For h = 3 and 5 there is substantial improvement. This illustrates the obvious: bagging can improve only if the unbagged predictor is not optimal.

The second point is less obvious. Note that in all three graphs there is a point past which the bagged predictors have larger prediction error than the unbagged ones. The explanation is this: linear regression using all variables is a fairly stable procedure. The stability decreases as the number of variables used in the predictor decreases. As noted in Section 4, for a stable procedure phi_B = phi(x, P_L) is not as accurate as phi(x, L). The higher values of e_m^B for larger m reflect this fact. As m decreases, the instability increases, and there is a cross-over point at which the bagged predictor becomes more accurate than the unbagged one.

6. Concluding Remarks

Bagging Class Probability Estimates

Some classification methods estimate probabilities p-hat(j|x) that an object with prediction vector x belongs to class j. Then the class corresponding to x is estimated as argmax_j p-hat(j|x). For such methods, a natural competitor to bagging by voting is to average the p-hat(j|x) over all bootstrap replications, getting p-hat_B(j|x), and then use the estimated class argmax_j p-hat_B(j|x). This estimate was computed in every classification example we worked on. The resulting misclassification rate was always virtually identical to the voting misclassification rate.

In some applications, estimates of class probabilities are required instead of, or along with, the classifications. The evidence so far indicates that bagged estimates are likely to be more accurate than the single estimates. To verify this, it would be necessary to compare both estimates with the true values P(j|x) over the x in the test set. For real data the true values are unknown. But they can be computed for the simulated waveform data, where they reduce to computing an expression involving error functions.

Using the waveform data, we did a simulation similar to that in Section 2, with learning and test sets both of size 300, and 25 bootstrap replications. In each iteration, we computed the average over the test set and classes of the deviation between the estimated and true probabilities. This was repeated 50 times and the results averaged. The single tree estimates had an error of .189. The error of the bagged estimates was .124, a decrease of 34%.

How Many Bootstrap Replicates Are Enough?

In our experiments, 50 bootstrap replicates were used for classification and 25 for regression. This does not mean that 50 or 25 were necessary or sufficient, but simply that they seemed reasonable. My sense of it is that fewer are required when y is numerical and more are required with an increasing number of classes.

The answer is not too important when procedures like CART are used, because running times, even for a large number of bootstraps, are very nominal. But neural nets progress much slower, and replications may require many days of computing. Still, bagging is almost a dream procedure for parallel computing. The construction of a predictor on each L_B proceeds with no communication necessary from the other CPUs.

To give some idea of how the results vary with the number of bootstrap replicates, we ran the waveform data using 10, 25, 50 and 100 replicates, using the same simulation scheme as in Section 2.

Table 5.1: Bagged Misclassification Rates (%) on the waveform data for 10, 25, 50 and 100 bootstrap replicates.

The unbagged rate is 29.0, so it is clear that we are getting most of the improvement using only 10 bootstrap replicates. More than 25 bootstrap replicates is love's labor lost.

Bagging Nearest Neighbor Classifiers

Nearest neighbor classifiers were run on all the data sets described in Section 2 except for the soybean data, whose variables are categorical. The same random divisions into learning and test sets were used, with 100 bootstrap replicates and 100 iterations in each run. A Euclidean metric was used, with each coordinate standardized by dividing by its standard deviation over the learning set.

Table 5: Misclassification Rates (%) for nearest neighbor and bagged nearest neighbor on the waveform, heart, breast cancer, ionosphere, diabetes, and glass data.

Nearest neighbor is more accurate than single trees in 5 of the 6 data sets, but bagged trees are more accurate than bagged nearest neighbors in 5 of the 6 data sets. Cycles did not have to be expended to find that bagging nearest neighbors does not change things. Some simple computations show why. Given N possible outcomes of a trial (the cases (y_n, x_n) in the learning set) and N trials, the number of times that the n-th outcome is selected is approximately Poisson distributed with lambda = 1, for large N. The probability that the n-th outcome will occur at least once is 1 - (1/e) = .632.

If there are N_B bootstrap repetitions in a 2-class problem, then a test case may change classification only if its nearest neighbor in the learning set is not in the bootstrap sample in at least half of the N_B replications. This probability is given by the probability that the number of heads in N_B tosses of a coin with probability .632 of heads is less than N_B/2. As N_B gets larger, this probability gets very small. Analogous results hold for J-class problems.

The stability of nearest neighbor classification methods with respect to perturbations of the data distinguishes them from competitors such as trees and neural nets.

Bagging goes a ways toward making a silk purse out of a sow's ear, especially if the sow's ear is twitchy. It is a relatively easy way to improve an existing method, since all that needs adding is a loop in front that selects the bootstrap sample and sends it to the procedure, and a back end that does the aggregation. What one loses, with the trees, is a simple and interpretable structure. What one gains is increased accuracy.

References

Belsley, D., Kuh, E., and Welsch, R. (1980) "Regression Diagnostics", John Wiley and Sons.

Breiman, L. (1994) Heuristics of instability in model selection, Technical Report, Statistics Department, University of California at Berkeley.

Breiman, L., Friedman, J., Olshen, R., and Stone, C. (1984) "Classification and Regression Trees", Wadsworth.

Breiman, L. and Friedman, J. (1985) Estimating optimal transformations in multiple regression and correlation (with discussion), Journal of the American Statistical Association 80, 580-619.

Breiman, L. and Spector, P. (1992) Submodel selection and evaluation in regression: the X-random case, International Statistical Review 60, 291-319.

Buntine, W. (1991) "Learning classification trees", Artificial Intelligence Frontiers in Statistics, ed. D.J. Hand, Chapman and Hall, London, 182-201.

Dietterich, T.G. and Bakiri, G. (1991) Error-correcting output codes: A general method for improving multiclass inductive learning programs, Proceedings of the Ninth National Conference on Artificial Intelligence (AAAI-91), Anaheim, CA: AAAI Press.

Efron, B. and Tibshirani, R. (1993) "An Introduction to the Bootstrap", Chapman and Hall.

Friedman, J. (1991) Multivariate adaptive regression splines (with discussion), Annals of Statistics 19, 1-141.

Heath, D., Kasif, S., and Salzberg, S. (1993) k-dt: a multi-tree learning method, Proceedings of the Second International Workshop on Multistrategy Learning, 1002-1007, Chambery, France, Morgan Kaufman.

Kwok, S. and Carter, C. (1990) Multiple decision trees, Uncertainty in Artificial Intelligence 4, ed. Shachter, R., Levitt, T., Kanal, L., and Lemmer, J., North-Holland, 327-335.

Olshen, R., Gilpin, A., Henning, H., LeWinter, M., Collins, D., and Ross, J. (1985) Twelve-month prognosis following myocardial infarction: Classification trees, logistic regression, and stepwise linear discrimination, Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, L. Le Cam and R. Olshen (eds.), Wadsworth, 245-267.

Smith, J., Everhart, J., Dickson, W., Knowler, W., and Johannes, R. (1988) Using the ADAP learning algorithm to forecast the onset of diabetes mellitus, Proceedings of the Symposium on Computer Applications and Medical Care, 261-265, IEEE Computer Society Press.

Sigillito, V.G., Wing, S.P., Hutton, L.V., and Baker, K.B. (1989) Classification of radar returns from the ionosphere using neural networks, Johns Hopkins APL Technical Digest, 262-266.

Wolberg, W. and Mangasarian, O. (1990) Multisurface method of pattern separation for medical diagnosis applied to breast cytology, Proceedings of the National Academy of Sciences, U.S.A., Volume 87, December 1990.