Package 'Cubist'
January 10, 2020

Type: Package
Title: Rule- And Instance-Based Regression Modeling
Version: 0.2.3
Maintainer: Max Kuhn <mxkuhn@gmail.com>
Description: Regression modeling using rules with added instance-based corrections.
Depends: lattice
Imports: reshape2, utils
Suggests: mlbench, caret, knitr, modeldata, dplyr (>= 0.7.4), rlang, tidyrules
URL: https://topepo.github.io/Cubist
BugReports: https://github.com/topepo/Cubist/issues
License: GPL-3
LazyLoad: yes
RoxygenNote: 7.0.2.9000
VignetteBuilder: knitr
Encoding: UTF-8
NeedsCompilation: yes
Author: Max Kuhn [aut, cre], Steve Weston [ctb], Chris Keefer [ctb], Nathan Coulter [ctb], Ross Quinlan [aut] (Author of imported C code), Rulequest Research Pty Ltd. [cph] (Copyright holder of imported C code)
Repository: CRAN
Date/Publication: 2020-01-10 17:50:23 UTC

R topics documented:

    cubist.default
    cubistControl
    dotplot.cubist
    exportCubistFiles
    predict.cubist
    summary.cubist


cubist.default    Fit a Cubist model

Description

This function fits the rule-based model described in Quinlan (1992) (aka M5) with additional corrections based on nearest neighbors in the training set, as described in Quinlan (1993a).

Usage

    ## Default S3 method:
    cubist(x, y, committees = 1, control = cubistControl(), weights = NULL, ...)

Arguments

    x           a matrix or data frame of predictor variables. Missing data are allowed but (at this time) only numeric, character and factor values are allowed.
    y           a numeric vector of outcome
    committees  an integer: how many committee models (e.g., boosting iterations) should be used?
    control     options that control details of the cubist algorithm. See cubistControl()
    weights     an optional vector of case weights (the same length as y) for how much each instance should contribute to the model fit. From the RuleQuest website: "The relative weight assigned to each case is its value of this attribute divided by the average value; if the value is undefined, not applicable, or is less than or equal to zero, the case's relative weight is set to 1."
    ...         optional arguments to pass (not currently used)

Details

Cubist is a prediction-oriented regression model that combines the ideas in Quinlan (1992) and Quinlan (1993a). Although it initially creates a tree structure, it collapses each path through the tree into a rule. A regression model is fit for each rule based on the data subset defined by the rules. The set of rules is pruned or possibly combined, and the candidate variables for the linear regression models are the predictors that were used in the parts of the rule that were pruned away. This part of the algorithm is consistent with the "M5" or Model Tree approach.

Cubist generalizes this model to add boosting (when committees > 1) and instance-based corrections (see predict.cubist()). The number of instances is set at prediction time by the user and is not needed for model building.

This function links R to the GPL version of the C code given on the RuleQuest website.

The RuleQuest code differentiates missing values from values that are not applicable. Currently, this package does not make such a distinction (all values are treated as missing). This will produce slightly different results.

To tune the cubist model over the number of committees and neighbors, the caret::train() function in the caret package has bindings to find appropriate settings of these parameters.

Value

an object of class cubist with elements:

    data, names, model                 character strings that correspond to their counterparts for the command-line program available from RuleQuest
    output                             basic cubist output captured from the C code, including the rules, their terminal models and variable usage statistics
    control                            a list of control parameters passed in by the user
    composite, neighbors, committees   mirrors of the values to these arguments that were passed in by the user
    dims                               the output of dim(x)
    splits                             information about the variables and values used in the rule conditions
    call                               the function call
    coefs                              a data frame of regression coefficients for each rule within each committee
    vars                               a list with elements all and used listing the predictors passed into the function and used by any rule or model
    fitted.values                      a numeric vector of predictions on the training set.
    usage                              a data frame with the percent of models where each variable was used. See summary.cubist() for a discussion.

Author(s)

R code by Max Kuhn, original C sources by R Quinlan and modifications by Steve Weston

References

Quinlan. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference On Artificial Intelligence (1992) pp. 343-348
Quinlan. Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (1993a) pp. 236-243
Quinlan. C4.5: Programs For Machine Learning (1993b) Morgan Kaufmann Publishers Inc. San Francisco, CA
Wang and Witten. Inducing model trees for continuous classes. Proceedings of the Ninth European Conference on Machine Learning (1997) pp. 128-137
http://rulequest.com/cubist-info.html

See Also

cubistControl(), predict.cubist(), summary.cubist(), dotplot.cubist(), caret::train()

Examples

    library(mlbench)
    data(BostonHousing)

    ## 1 committee, so just an M5 fit:
    mod1 <- cubist(x = BostonHousing[, -14], y = BostonHousing$medv)
    mod1

    ## Now with 10 committees
    mod2 <- cubist(x = BostonHousing[, -14], y = BostonHousing$medv, committees = 10)
    mod2


cubistControl    Various parameters that control aspects of the Cubist fit.

Description

Most of these values are discussed at length in http://rulequest.com/cubist-unix.html

Usage

    cubistControl(unbiased = FALSE, rules = 100, extrapolation = 100,
                  sample = 0, seed = sample.int(4096, size = 1) - 1L,
                  label = "outcome")

Arguments

    unbiased       a logical: should unbiased rules be used?
    rules          an integer (or NA): define an explicit limit to the number of rules used (NA lets Cubist decide).
    extrapolation  a number between 0 and 100: since Cubist uses linear models, predictions can fall outside of the range seen in the training set. This parameter controls how much rule predictions are adjusted to be consistent with the training set.
    sample         a number between 0 and 99.9: this is the percentage of the data set to be randomly selected for model building (not for out-of-bag type evaluation).
    seed           an integer for the random seed (in the C code)
    label          a label for the outcome (when printing rules)

Value

A list containing the options.

Author(s)

Max Kuhn

References

Quinlan. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference On Artificial Intelligence (1992) pp. 343-348
Quinlan. Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (1993) pp. 236-243
Quinlan. C4.5: Programs For Machine Learning (1993) Morgan Kaufmann Publishers Inc. San Francisco, CA
http://rulequest.com/cubist-info.html

See Also

cubist(), predict.cubist(), summary.cubist(), dotplot.cubist()

Examples

    cubistControl()

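The case-weight rule quoted from the RuleQuest website under the weights argument of cubist() can be illustrated with a few lines of base R. This is only a sketch of the quoted sentence, not the C code that the package calls: the helper name relative_case_weights is hypothetical, and taking the average over the valid (non-missing, positive) weights only is an assumption, since the quotation does not say which cases enter the average.

```r
# Hypothetical helper (not part of the Cubist package): turn raw case
# weights into relative weights following the quoted RuleQuest rule.
relative_case_weights <- function(w) {
  # A weight counts as valid when it is present and strictly positive
  valid <- !is.na(w) & w > 0
  # Assumption: the "average value" is taken over the valid weights only
  avg <- mean(w[valid])
  # Valid weights are divided by the average; all others are set to 1
  ifelse(valid, w / avg, 1)
}

raw_weights <- c(1, 2, 3, NA, -1)
rel_weights <- relative_case_weights(raw_weights)
print(rel_weights)
```

Under this reading, a case with twice the average weight counts twice as much as an average case, while a missing or non-positive weight leaves the case at the neutral value of 1.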

dotplot.cubist    Visualization of Cubist Rules and Equations

Description

Lattice dotplots of the rule conditions or the linear model coefficients produced by cubist() objects

Usage

    ## S3 method for class 'cubist'
    dotplot(x, data = NULL, what = "splits", committee = NULL, rule = NULL, ...)

Arguments

    x          a cubist() object
    data       not currently used (here for lattice compatibility)
    what       either "splits" or "coefs"
    committee  which committees to plot
    rule       which rules to plot
    ...        options to pass to lattice::dotplot()

Details

For the splits, a panel is created for each predictor. The x-axis is the range of the predictor scaled to be between zero and one and the y-axis has a line for each rule (within each committee). Areas are colored based on their region. For example, if one rule has var1 <= 10, the line for this rule would be colored. If another rule had the complementary region of var1 > 10, it would be on another line and shaded a different color.

For the coefficient plot, another dotplot is made. The layout is the same except that the x-axis is in the original units and has a dot if the rule used that variable in a linear model.

Value

a lattice::dotplot() object

Author(s)

R code by Max Kuhn, original C sources by R Quinlan and modifications by Steve Weston

References

Quinlan. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference On Artificial Intelligence (1992) pp. 343-348
Quinlan. Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (1993) pp. 236-243
Quinlan. C4.5: Programs For Machine Learning (1993) Morgan Kaufmann Publishers Inc. San Francisco, CA
http://rulequest.com/cubist-info.html

See Also

cubist(), cubistControl(), predict.cubist(), summary.cubist(), lattice::dotplot()

Examples

    library(mlbench)
    data(BostonHousing)

    ## 1 committee and no instance-based correction, so just an M5 fit:
    mod1 <- cubist(x = BostonHousing[, -14], y = BostonHousing$medv)
    dotplot(mod1, what = "splits")
    dotplot(mod1, what = "coefs")

    ## Now with 10 committees
    mod2 <- cubist(x = BostonHousing[, -14], y = BostonHousing$medv, committees = 10)
    dotplot(mod2, scales = list(y = list(cex = .25)))
    dotplot(mod2, what = "coefs", between = list(x = 1, y = 1),
            scales = list(x = list(relation = "free"), y = list(cex = .25)))


exportCubistFiles    Export Cubist Information To the File System

Description

For a fitted cubist object, text files consistent with the RuleQuest command-line version can be exported.

Usage

    exportCubistFiles(x, neighbors = 0, path = getwd(), prefix = NULL)

Arguments

    x          a cubist() object
    neighbors  how many, if any, neighbors should be used to correct the model predictions
    path       the path to put the files
    prefix     a prefix (or "filestem") for creating files

Details

Using the RuleQuest specifications, model, names and data files are created for use with the command-line version of the program.

Value

No value is returned. Three files are written out.

Author(s)

Max Kuhn

References

Quinlan. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference On Artificial Intelligence (1992) pp. 343-348
Quinlan. Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (1993) pp. 236-243
Quinlan. C4.5: Programs For Machine Learning (1993) Morgan Kaufmann Publishers Inc. San Francisco, CA
http://rulequest.com/cubist-info.html

See Also

cubist(), predict.cubist(), summary.cubist()

Examples

    library(mlbench)
    data(BostonHousing)

    mod1 <- cubist(x = BostonHousing[, -14], y = BostonHousing$medv)
    exportCubistFiles(mod1, neighbors = 8, path = tempdir(), prefix = "BostonHousing")


predict.cubist    Predict method for cubist fits

Description

Predictions using the parametric model are calculated using the method of Quinlan (1992). If neighbors is greater than zero, these predictions are adjusted by training set instances nearby using the approach of Quinlan (1993).

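The neighbor-based adjustment described above can be pictured with a small base-R sketch. It follows the idea in Quinlan (1993) as it is commonly summarized: each neighbor's observed outcome is shifted by the difference between the model's prediction for the new case and its prediction for that neighbor, and the shifted values are averaged. The helper name adjust_with_neighbors, the toy model model_fn, the Manhattan distance and the unweighted average are all assumptions made for illustration; this is not the C code that predict.cubist() calls.

```r
# Hypothetical helper (not part of the Cubist package): correct a model
# prediction for one new case using its k nearest training cases.
adjust_with_neighbors <- function(model_fn, train_x, train_y, new_x, k = 5) {
  # Manhattan distance from the new case to every training row (assumption)
  distances <- rowSums(abs(sweep(train_x, 2, new_x)))
  nearest <- order(distances)[seq_len(k)]
  # Model predictions for the neighbors and for the new case
  neighbor_pred <- apply(train_x[nearest, , drop = FALSE], 1, model_fn)
  new_pred <- model_fn(new_x)
  # Shift each neighbor's observed outcome by the difference in predictions,
  # then average the shifted values
  mean(train_y[nearest] + (new_pred - neighbor_pred))
}

# Toy data: the "model" y = 2 * x under-predicts every training case by 0.5
toy_model <- function(x) 2 * x[1]
toy_x <- matrix(c(1, 2, 3, 10), ncol = 1)
toy_y <- c(2.5, 4.5, 6.5, 20.5)

corrected <- adjust_with_neighbors(toy_model, toy_x, toy_y, new_x = 2.2, k = 3)
print(corrected)
```

In this toy setting the uncorrected model would predict 4.4 for the new case; the correction recovers the constant local bias of 0.5 that the three nearest training cases reveal.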
Usage

    ## S3 method for class 'cubist'
    predict(object, newdata = NULL, neighbors = 0, ...)

Arguments

    object     an object of class cubist
    newdata    a data frame of predictors (in the same order as the original training data)
    neighbors  an integer from 0 to 9: how many instances to use to correct the rule-based prediction?
    ...        other options to pass through the function (not currently used)

Details

Note that the predictions can fail for various reasons. For example, as shown in the examples, if the model uses a qualitative predictor and the prediction data has a new level of that predictor, the function will throw an error.

Value

a numeric vector is returned

Author(s)

R code by Max Kuhn, original C sources by R Quinlan and modifications by Steve Weston

References

Quinlan. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference On Artificial Intelligence (1992) pp. 343-348
Quinlan. Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (1993) pp. 236-243
Quinlan. C4.5: Programs For Machine Learning (1993) Morgan Kaufmann Publishers Inc. San Francisco, CA
http://rulequest.com/cubist-info.html

See Also

cubist(), cubistControl(), summary.cubist(), dotplot.cubist()

Examples

    library(mlbench)
    data(BostonHousing)

    ## 1 committee and no instance-based correction, so just an M5 fit:
    mod1 <- cubist(x = BostonHousing[, -14], y = BostonHousing$medv)
    predict(mod1, BostonHousing[1:4, -14])

    ## now add instances
    predict(mod1, BostonHousing[1:4, -14], neighbors = 5)

    # Example error
    iris_test <- iris
    iris_test$Species <- as.character(iris_test$Species)

    mod <- cubist(x = iris_test[1:99, 2:5], y = iris_test$Sepal.Length[1:99])

    # predict(mod, iris_test[100:151, 2:5])
    # Error:
    # *** line 2 of `undefined.cases':
    # bad value of 'virginica' for attribute 'Species'


summary.cubist    Summarizing Cubist Fits

Description

This function echoes the output of the RuleQuest C code, including the rules, the resulting linear models as well as the variable usage summaries.

Usage

    ## S3 method for class 'cubist'
    summary(object, ...)

Arguments

    object  a cubist() object
    ...     other options (not currently used)

Details

The Cubist output contains variable usage statistics. It gives the percentage of times where each variable was used in a condition and/or a linear model. Note that this output will probably be inconsistent with the rules shown above. At each split of the tree, Cubist saves a linear model (after feature selection) that is allowed to have terms for each variable used in the current split or any split above it. Quinlan (1992) discusses a smoothing algorithm where each model prediction is a linear combination of the parent and child model along the tree. As such, the final prediction is a function of all the linear models from the initial node to the terminal node. The percentages shown in the Cubist output reflect all the models involved in prediction (as opposed to the terminal models shown in the output).

Value

an object of class summary.cubist with elements

    output  a text string of the output
    call    the original call to cubist()

Author(s)

R code by Max Kuhn, original C sources by R Quinlan and modifications by Steve Weston

References

Quinlan. Learning with continuous classes. Proceedings of the 5th Australian Joint Conference On Artificial Intelligence (1992) pp. 343-348
Quinlan. Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (1993) pp. 236-243
Quinlan. C4.5: Programs For Machine Learning (1993) Morgan Kaufmann Publishers Inc. San Francisco, CA
http://rulequest.com/cubist-info.html

See Also

cubist(), cubistControl(), predict.cubist(), dotplot.cubist()

Examples

    library(mlbench)
    data(BostonHousing)

    ## 1 committee and no instance-based correction, so just an M5 fit:
    mod1 <- cubist(x = BostonHousing[, -14], y = BostonHousing$medv)
    summary(mod1)

    ## example output:

    ## Cubist [Release 2.07 GPL Edition]  Sun Apr 10 17:36:56 2011
    ## ---------------------------------
    ##
    ##     Target attribute `outcome'
    ##
    ## Read 506 cases (14 attributes) from undefined.data
    ##
    ## Model:
    ##
    ##   Rule 1: [101 cases, mean 13.84, range 5 to 27.5, est err 1.98]
    ##
    ##     if
    ##       nox > 0.668
    ##     then
    ##       outcome = -1.11 + 2.93 dis + 21.4 nox - 0.33 lstat + 0.008 b
    ##                 - 0.13 ptratio - 0.02 crim - 0.003 age + 0.1 rm
    ##
    ##   Rule 2: [203 cases, mean 19.42, range 7 to 31, est err 2.10]
    ##
    ##     if
    ##       nox <= 0.668
    ##       lstat > 9.59
    ##     then
    ##       outcome = 23.57 + 3.1 rm - 0.81 dis - 0.71 ptratio - 0.048 age
    ##                 - 0.15 lstat + 0.01 b - 0.0041 tax - 5.2 nox + 0.05 crim
    ##                 + 0.02 rad
    ##
    ##   Rule 3: [43 cases, mean 24.00, range 11.9 to 50, est err 2.56]
    ##
    ##     if
    ##       rm <= 6.226
    ##       lstat <= 9.59
    ##     then
    ##       outcome = 1.18 + 3.83 crim + 4.3 rm - 0.06 age - 0.11 lstat - 0.003 tax
    ##                 - 0.09 dis - 0.08 ptratio
    ##
    ##   Rule 4: [163 cases, mean 31.46, range 16.5 to 50, est err 2.78]
    ##
    ##     if
    ##       rm > 6.226
    ##       lstat <= 9.59
    ##     then
    ##       outcome = -4.71 + 2.22 crim + 9.2 rm - 0.83 lstat - 0.0182 tax
    ##                 - 0.72 ptratio - 0.71 dis - 0.04 age + 0.03 rad - 1.7 nox
    ##                 + 0.008 zn
    ##
    ##
    ## Evaluation on training data (506 cases):
    ##
    ##     Average  |error|          2.07
    ##     Relative |error|          0.31
    ##     Correlation coefficient   0.94
    ##
    ##
    ##     Attribute usage:
    ##       Conds  Model
    ##
    ##        80%   100%    lstat
    ##        60%    92%    nox
    ##        40%   100%    rm
    ##              100%    crim
    ##              100%    age
    ##              100%    dis
    ##              100%    ptratio
    ##               80%    tax
    ##               72%    rad
    ##               60%    b
    ##               32%    zn
    ##
    ##
    ## Time: 0.0 secs
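
The three figures in the "Evaluation on training data" block of the example output can be approximated by hand from observed and predicted values. The sketch below assumes that the average |error| is the mean absolute error, that the relative |error| is that value divided by the mean absolute error obtained by always predicting the mean outcome, and that the correlation coefficient is the Pearson correlation. These definitions are assumptions rather than statements taken from this manual, and the helper name evaluation_metrics is hypothetical.

```r
# Hypothetical helper (not part of the Cubist package): summary statistics in
# the spirit of the "Evaluation on training data" block.
evaluation_metrics <- function(observed, predicted) {
  # Mean absolute error of the model
  average_error <- mean(abs(observed - predicted))
  # Mean absolute error of a baseline that always predicts the mean outcome
  baseline_error <- mean(abs(observed - mean(observed)))
  c(
    average_error  = average_error,
    relative_error = average_error / baseline_error,
    correlation    = cor(observed, predicted)
  )
}

observed  <- c(1, 2, 3, 4)
predicted <- c(1.5, 2, 2.5, 4)
metrics <- evaluation_metrics(observed, predicted)
print(metrics)
```

With a fitted model, the same helper could be applied to the training outcome and the result of predict() on the training predictors; under the assumed definitions, a relative error below 1 means the model does better than always predicting the mean.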