/
ModelAveragingandModelSelectionafterMultipleImputationusingtheRpackag ModelAveragingandModelSelectionafterMultipleImputationusingtheRpackag

ModelAveragingandModelSelectionafterMultipleImputationusingtheRpackag - PDF document

ximena
ximena . @ximena
Follow
345 views
Uploaded On 2021-06-19

ModelAveragingandModelSelectionafterMultipleImputationusingtheRpackag - PPT Presentation

NEWspeedupestimationusingparallelizationseeSection61 Contents1Background211Installation212Citation213Motivation ID: 845711

add criterion mami bic criterion add bic mami factor prostate model stage hospital period org method var cd4 2012

Share:

Link:

Embed:

Download Presentation from below link

Download Pdf The PPT/PDF document "ModelAveragingandModelSelectionafterMult..." is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.


Presentation Transcript

1 ModelAveragingandModelSelectionafterMult
ModelAveragingandModelSelectionafterMultipleImputationusingtheR-packageMAMIVersion:May6,2019Author:MichaelSchomaker1,withsupportfromChristianHeumann NEW:speedupestimationusingparallelization,seeSection6.1. Contents1Background21.1Installation......................................21.2Citation........................................21.3Motivation......................................31.3.1ModelAveraging...............................31.3.2MultipleImputation.............................31.3.3ModelAveraging(orModelSelection)afterMultipleImputation....42ModelChoice52.1ImputationModel..................................52.2AnalysisModel....................................63ChoiceofModelAveraging(orSelection)Method73.1ModelAveraging...................................73.1.1CriterionBasedModelAveraging......................73.1.2Mallow'sModelAveraging..........................83.1.3LassoAveraging...............................103.1.4JackknifeModelAveraging.........................113.2ModelSelection....................................123.2.1CriterionBasedModelSelection......................123.2.2LASSOandShrinkageBasedModelSelection...............134Inference135AnalysisandInterpretation145.1Example........................................145.2SuggestedReportinginPublications........................176Miscellaneous206.1ComputationTime..................................206.2LimitationsoftheApproach.............................226.3OptimalModelAveragingforPrediction:SuperLearning............226.4Miscellaneous.....................................23Index23 1UniversityofCapeTown,CentreforInfectiousDiseaseEpidemiologyandResearch,Observatory,7925;CapeTown,SouthAfrica,michael.schomaker@uct.ac.za1 1BackgroundTheR-packageMAMIo ersanimplementationofthemethodologyproposedinSchomaker,M.andC.Heumann(2014).Modelselectionandmodelaveragingaftermul-tipleimputation.ComputationalStatisticsandDataAnalysis71,758{770.Itisessentiallyasummaryoffunctionsusedinthepaper,withsomeusefulextensions.Themainfunctionofthepackage(mami())performsmodelselection/averagingonmultiplyimputeddatasetsandcombinestheresultingestimates.Thepackagealsoprovidesaccesstolessfrequentlyusedmodelaveragingtechniquesando ersintegr

2 atedbootstrapestimation.Thepackageisusef
atedbootstrapestimation.Thepackageisusefulifonewantstoperformmodelselectionormodelaveragingonmultiplyimputeddataandtheanalysismodelofinterestiseitherthelinearmodel,thelogisticmodel,thePoissonmodel,ortheCoxproportionalhazardsmodel,possiblywitharandomintercept.onewantstoobtainbootstrapcon denceintervalsformodelselectionormodelaverag-ingestimators(withorwithoutmissingdata/imputation){toaddressmodelselectionuncertaintyandtodiscoverrelationshipsofsmalle ectsize,seeTable1inSchomakerandHeumann(2014).onewantstocomparedi erentmodelselectionandaveragingtechniques,easilywiththesamesyntax.Thepackageisoflimiteduseunderthefollowingcircumstances:ifoneisinterestedinmodelselectionoraveragingformodelsotherthanthoselistedabove,forexampleparametricsurvivalmodels,additivemodels,time-seriesanalysis,andmanyothers.ifonedecidesforaspeci cmodelselectionoraveragingtechniquenotprovidedbythepackage,seeSection3formoredetails.ifthemodelselection/averagingproblemiscomputationallytoointensive,seeSection6.1formoredetails.1.1InstallationThepackagecanbedownloadedatR�forge:http://mami.r-forge.r-project.org/https://r-forge.r-project.org/R/?group_id=2152Or,simplytype: install.packages("MAMI",repos=c("http://R-Forge.R-project.org","http://cran.at.r-project.org"),dependencies=TRUE) Thelatteroptionisrecommendedasmami()dependsonacoupleofotherpackages.1.2CitationIfyouuseMAMI,pleaseciteit.Forsuggestedcitationsuse citation("MAMI")print(citation("MAMI"),bibtex=T) 2 Itisalwaysrecommendedtocitethemethodologicalreference:Schomaker,M.andC.Heumann(2014).Modelselectionandmodelaveragingaftermul-tipleimputation.ComputationalStatisticsandDataAnalysis71,758{770.Additionally,thepackageandthismanualcanbecitedaswell:Schomaker,M.(2017).MAMI:Modelaveraging(andmodelselection)aftermultipleImpu-tation.Rpackageversion0.9.12.;http://mami.r-forge.r-project.org/Schomaker,M.andC.Heumann(2017).ModelaveragingandmodelselectionaftermultipleimputationusingtheR-packageMAMI;http://mami.r-forge.r-project.org/1.3Motivation1.3.1ModelAveragingThemotivationforvariableselectioninregressionmodelsisbasedontherationalethatassoci-ationalrelationshipsbetweenvariablesarebestunderstoodbyreducingthemodel'sdim

3 ension.Theproblemwiththisapproachisthat(
ension.Theproblemwiththisapproachisthat(i)regressionparametersaftermodelselectionareof-tenbiasedand(ii)therespectivestandarderrorsaretoosmallbecausetheydonotre ecttheuncertaintyrelatedtothemodelselectionprocess(LeebandPotscher,2005;BurnhamandAnderson,2002).Ithasbeenproposed(Chat eld,1995;Draper,1995;Hoetingetal.,1999)thatthedrawbackofmodelselectioncanbeovercomebymodelaveraging.Withmodelaveraging,onecalculatesaweightedaverage^ =Pw^ fromthekparameterestimates^ (=1;:::;k)ofasetofcandidate(regression)modelsM=fM1;:::;Mkg,wheretheweightsarecalculatedinawaysuchthat`better'modelsreceiveahigherweight.ApopularweightchoicewouldbebasedontheexponentialAIC,wAIC=exp(�1 2AIC) Pk=1exp(�1 2AIC);(1)whereAICistheAICvaluerelatedtomodelM2M(Bucklandetal.,1997)andPwAIC=1.Ithasbeensuggested2toestimatethevarianceofthescalar^ j2^ viadVar(^ j)=(kX=1wq dVar(^ j;jM)+(^ j;�^ j)2)2;(2)where^ j;isthejthregressioncoecientofthethcandidatemodel.Thisapproachtacklesproblem(ii),theincorporationofmodelselectionuncertaintyintothestandarderrorsoftheregressionparameters;butitmaynotnecessarilytackleproblem(i)astheregressionparam-etersmaystillbebiased.Therearemultipledi erentsuggestionsonhowtheweightscanbecalculated,andthoseimplementedinmami()areexplainedinSection3.Notethatmodelselectioncanbeviewedasaspecialcaseofmodelaveragingwherethe\best"modelreceivesweight1(andallothersaweightof0).AllimplementedmodelselectionoptionsarelistedinSection3too.1.3.2MultipleImputationMultipleimputation(MI)isapopularmethodtoaddressmissingdata.Basedonassumptionsaboutthedatadistribution(andthemechanismwhichgivesrisetothemissingdata)missing 2Whileformula(2)fromBucklandetal.(1997)isthemostpopularchoicetocalculatestandarderrorsinmodelaveraging,ithasalsobeencriticizedthatthecoverageprobabilityofintervalestimatesbasedon(2)canbebiased(HjortandClaeskens,2003).3 valuescanbeimputedbymeansofdrawsfromtheposteriorpredictivedistributionoftheunobserveddatagiventheobserveddata.ThisprocedureisrepeatedtocreateMimputeddatasets,the(regression)analysisisthenconductedone

4 achofthesedatasetsandtheMresults(Mpointa
achofthesedatasetsandtheMresults(MpointandMvarianceestimates)arecombinedbyasetofsimplerules:^ MI=1 MMXm=1^ (m)(3)anddCov(^ MI)=cW+M+1 M^B=1 MMXm=1dCov(^ (m))+M+1 M(M�1)MXm=1(^ (m)�^ MI)(^ (m)�^ MI)0(4)where^ (m)referstotheestimateof inthemthimputedsetofdataD(m),m=1;:::;M,cW=M�1PmdCov(^ (m))istheaveragewithinimputationcovariance,and^B=(M�1)�1Pm(^ (m)�^ MI)(^ (m)�^ MI)0thebetweenimputationcovariance.Con denceintervalsareconstructedonatR-distributionwithapproximatelyR=(M�1)[1+fM^W=(M+1)^Vg]2degreesoffreedom(RubinandSchenker,1986),thoughtherearealternativeapproximations,especiallyforsmallsamples(Lipsitzetal.,2002).MoredetailsonimputationcanbefoundinRubin(1996)andWhiteetal.(2011),amongothers.1.3.3ModelAveraging(orModelSelection)afterMultipleImputationHowcanmodelaveragingandmodelselectionbeappliedtomultiplyimputeddata?ThedetailedmotivationcanbefoundinSchomakerandHeumann(2014).Thebasicresultsformodelaveragingare^ MI=1 MMXm=1^ (m)with^ (m)=kX=1w(m)^ (m)(5)andappliestoanyweightchoice.Ifthevarianceofthemodelaveragingestimatorisestimatedvia(2),theoverallvarianceoftheestimatoraftermultipleimputationrelatestodVar(^ j;MI)=1 MMXm=1(kX=1wq dVar(^ (m)j;)+(^ (m)j;�^ (m)j)2)2+M+1 M(M�1)MXm=1(^ (m)j�^ j;MI)2:(6)Con denceintervalscouldthenagainbeestimatedbasedonatRdistribution(asexplainedabove)or,alternatively,viabootstrapping{seeSection4,andTable1inSchomakerandHeumann(2014),formoredetails.Modelselectionafterimputationworksessentiallythesame,exceptthatparametersasso-ciatedwithvariableswhichhavenotbeenselectedareassumedtobe0.Withthisassumption,avariablewillbeformallyselectedifitisselectedinatleastoneimputedsetofdata,butitsoverallimpactwilldependonhowoftenitischosen.Here,con denceintervalswillalmostal-waysbetoonarrowif(6)isapplied(becauseofmodelselectionuncertainty)andbootstrappingisrecommended.Asaconsequenceformodelselection(andmodelaveraging),e ectsofvariableswhicharenotsupportedthroughoutimputeddatasets(andcandidatemodels)willsimplybelesspro-nounced.4 MAMIestimatesthepo

5 intestimates(5),togetherwithcon denc
intestimates(5),togetherwithcon denceintervalsthatareeitherbasedon(6),oraBayesianvariationthereof(Section3.1.1),orbootstrapping(preferredoption,Section4).Inaddition,avariableimportancemeasure(averagedovertheMimputeddatasets)willbecalculated:thismeasure,seealsoBurnhamandAnderson(2002),simplysumsuptheweightswofthosecandidatemodelsMthatcontaintherelevantvariable,andliesbetween0(unimportant)and1(veryimportant).ItissimilartotheBayesianposteriore ectprobability.ResultscouldbeinterpretedandreportedassuggestedinSection5. MAMI'smainfunctionismami().Itisrecommendedtogetfamiliarwiththefunction'ssyntaxbytyping?mamiandrunningtheexamplesatthebottomofthehelppage.Brie y,thesyntaxismami(X,missing.data=c("imputed","none","CC"),model=c("gaussian","binomial","poisson","cox"),outcome=1,id=NULL,method=c("MA.criterion","MS.criterion","LASSO","LAE","MMA"),criterion=c("AIC","BIC","BIC+","CV","GCV"),kfold=5,cvr=F,inference=c("standard","+boot"),B=20,X.org=NULL,CI=0.95,var.remove=NULL,aweights=NULL,add.factor=NULL,add.interaction=NULL,add.transformation=NULL,add.stratum=NULL,ncores=1,candidate.models=c("all","restricted","veryrestricted"),screen=0,report.exp=FALSE,print.time=FALSE,print.warnings=TRUE,...) ?mamiexample(mami) Usingtheaboveframeworkrequiresdecisionswithrespecttothefollowing:Adecisionabouttheimputationprocedure,seeSection2.1.{thisdecisionisbeingmadebeforeapplyingmamiAdecisionaboutthe(regression)analysismodel,seeSection2.2.{thisisbeingspeci edbymeansofthefollowingoptions:model,idaswellassome-timesvar.remove,outcome,aweights,add.factor,add.interaction,add.-transformation,add.stratum,screen.Adecisionaboutthemodelselectionormodelaveragingapproach,i.e.thedeterminationoftheweightsandthesetofcandidatemodels,seeSection3.{thisisbeingspeci edbymeansofthefollowingoptions:method,criterionaswellassometimeskfold,candidate.models.Adecisionaboutcon denceintervalcalculation,seeSection4.{thisisbeingspeci edbymeansofthefollowingoptions:inferenceaswellassometimesB,X.org,CI.2ModelChoice2.1ImputationModelMAMIdoesnotimpute.TheuserwouldtypicallyprovidethedataXinoneofthefollowingformats:5 Anobjectofclassamelia,generatedbyimputationwithAmeliaII[auth

6 or'srecom-mendation].SeeHonakerandKing(2
or'srecom-mendation].SeeHonakerandKing(2010)andHonakeretal.(2011)formoredetails.Anobjectofclassmids,generatedbyimputationwithmice.See,forexample,vanBuurenandGroothuis-Oudshoorn(2011)formoredetails.Alistofdatasetsimputedwithanyotherpackageorsoftware.Withthischoicebootstrapcon denceintervals,asexplainedinSection4,cannotbecalculated.IftheuserhasusedothersoftwarethanRtogeneratetheimputations,oneoptionwouldbetoi)savetheimputeddatasetsas.csv les,ii)importthemtoRwithread.csv(),andtheniii)createalistofthesedatasetswhichcanthenbeusedbymami().Asingledataframe.Thiscanbeofinterestwhen{oneisinterestedinacomparisonwithresultsfromacompletecaseanalysis(missing.data="CC"),{orifonehasnomissingdata(missing.data="none")butisinterestedinpostmodelselection/averagingintervalsusingbootstrapping.2.2AnalysisModelTheanalysismodelcan(currently)beanyofthefollowing:thelinearregressionmodel,thelogisticregressionmodel,thePoissonregressionmodel,ortheCoxproportionalhazardsre-gressionmodel.Thus, referstotheparametervectorofageneralizedlinearmodelorsurvivalmodel.Itisbeingspeci edwiththeoptionmodelandcanbeeither"gaussian","binomial","poisson",or"cox".Ifthedataarenotcross-sectional,butlongitudinal,itispossibletospecifythiswithavariablewhichindicatesthecross-sectionalunit,i.e.viatheoptionid.Then,a(generalized)linearmixedmodelwithrandominterceptis t[oraCoxmodelwithfrailty].Whendealingwithlongitudinaldatathefollowingneedstobeconsidered:theimputationmodelneedstoaccountforthelongitudinalstructureofthedata.Forlongitudinaldata,thereareonlysemi-satisfactoryimputationproceduresavailableatthemoment.Thebestmightbetousethe\timeseries-crosssection"facilitiesinAmeliaII,speci edwiththets,csandintercsoptions;seeHonakeretal.(2011,Section4.5)formoredetails.Thisapproachoftenworkswell,thoughitdoesnotexplicitlytakeintoaccountthetime-orderinginthedata.Alternatively,iftherearenotmanytimepoints,reshapingthedatawithreshape(),fromlongtowideformat,couldbeanoptioninsomeapplications.Thebootstrapneedstobefacilitatedontheid-level.mami()doesthis. Thefullanalysismodel(beforemodelselectionoraveraging)istheonespeci edviamodelandid.ItincludesallvariablesinthedatasetX,andusesthe rstcol

7 umnastheoutcomevariable,unlessanyofthebe
umnastheoutcomevariable,unlessanyofthebelowoptionsarespeci ed. var.remove{eitheravectorofcharacterstringsorintegers,specifyingthevariablesorcolumnswhicharepartofthedatabutshouldnotbeconsideredinthemodelselec-tion/averagingprocedure.screen{numberofvariableswhichshouldberemovedinaninitialscreeningstepwithLASSO.6 outcome{acharactervectororintegerspecifyingthevariableorcolumnwhichshouldbetreatedasoutcomevariable.Forsurvivalmodels,twovariables(timetoevent,event)needtobespeci ed.add.transformation{avectorofcharacterstrings,specifyingtransformationsofvari-ableswhichshouldbeaddedtotheanalysismodels.add.interaction{alistofeithercharacterstringsorintegers,specifyingthevariableswhichshouldbeaddedasinteractionsintheanalysismodel.add.factor{alistofeithercharacterstringsorintegerswhichindicatesthe(categorical)variablesthatshouldbetreatedasafactor.Rarelyneededasmami()alreadysearchesforexistingfactorvariables.aweights{aweightvectortobeusedintheanalysismodel.add.stratum{acharactervectororintegerspecifyingthevariableusedasastratuminCoxregressionanalyses.Example2in?mamishowshowtheseoptionscouldbeused.FuturereleasesofMAMIarelikelygoingtocontaintheoptiontospecifyquasi-likelihoodmodels.Otheranalysismodels,likeGEEsorparametricsurvivalmodelsmaybeo eredtoo;however,mixedmodelswithrandomslopeoradditivemodelsnot,becausethebestuseofmodelaveraginginthiscontextisnotclearyet.3ChoiceofModelAveraging(orSelection)MethodThemodelaveraging(selection)methodischosenbythecombinationofthemethodandcriterionoptions.3.1ModelAveraging3.1.1CriterionBasedModelAveragingCriterionbasedmodelaveragingmeansessentiallyusingtheweights(1),withanyinformationtypecriterion.Thiscanbeutilizedbypicking:method="MA.criterion"andeithercriterion="AIC"orcriterion="BIC"orcriterion="BIC+".ModelaveragingisutilizedwiththepackageMuMIn(Barton,2017)forcriterion="AIC/BIC"andwithBMA(Rafteryetal.,2017)forcriterion="BIC+".MuMInevaluates(bydefault)allpossiblecandidatemodelsM(i.e.2pforpvariables),whereasBMAusesasubsetofmodelsbasedonaleapsandboundsalgorithminconjunctionwith\Occam'srazor",seeHoetingetal.(1999)formoredetails.Thereareseveralimplicationsbyusingtheabovemethodology:Withmanyvariable

8 s,thecomputationtimecanbe(too)largeif\al
s,thecomputationtimecanbe(too)largeif\all"candidatemod-elsareevaluated.Asolutiontothisisrestrictingthenumberofcandidatemodelsand/orparallelization.Section6.1givesanoverviewonhowthiscanbefacilitated.Brie y,byi)restrictingcandidatemodelsbyaccessingdredgefromMuMIn,ii)orusingcriterion="BIC+"[thenumberofcandidatemodelsisprinted],oriii)usingcriterion="BIC+"andaccessingbic.survorbic.glmfromBMAtoadjustOccam'swindow,iv)usingtheop-tioncandidate.models,byv)screeningvariableswithoptionscreen,orvi)byparallelcomputingusingoptionncores.7 Ifthereisnoproperlikelihoodfunction,thenitcanbearguedthatusinganinformationtypecriterion,andthereforecriterionbasedmodelaveraging,doesnotmakesense.Inparticular,survivalmodels,suchasCox'sproportionalhazardsmodel,useapartiallikelihood,duetothecensoringinherentinsurvivaldata.Nevertheless,apragmaticsolutionistosimplyadoptthepartiallikelihood;ifBICisused,onehastodecidewhatnmeans,i.e.thenumberofsubjectsorthenumberofuncensoredsubjects,seeHoetingetal.(1999)andVolinskyetal.(1997)formoredetails.Similarly,thede nitionofinformationcriteriainmixedmodelsisnotentirelyclearandweusetheimplementationfrompackagelme4.FromtheBayesianperspectivethequalityofaregressionmodelM2Mmaybejudgeduponitsestimatedposteriorprobabilitythatthismodeliscorrect,thatisPr(Mjy)/Pr(M)ZPr(yjM; )Pr( jM)d ;wherePr(M)isthepriorprobabilityforthemodelMtobecorrect,Pr(yjM; )=L( )representsthemaximizedlikelihood,andPr( jM)re ectsthepriorof formodelM.Since,foralargesamplesize,Pr(Mjy)canbeapproximatedviatheBayes-CriterionofSchwarz(BCS,BIC),itisoftensuggestedthattheweight(1)isusedfortheconstructionoftheBayesianModelAveragingestimator,butwithBIC,insteadofAIC.TheBCScorrespondsto�2L(^ )+lnnp,wherepcorrespondstothenumberofparameters.Notethefollowing:{TheBMAimplementationdoesnotallowthespeci cationofPr(M)andassumesequalpriorprobabilitiesforeachmodeltobecorrect.{Broadly,usingoptionBIC+usesvarianceestimationbasedonvariancedecomposi-tion(Draper,1995)suchasthelawoftotalvariance,i.e.usingdVar(^ j)=bEM(dVar(^ j;jy;M))+dVarM(bE(^ j;jy;M

9 ))(7)whichissimilar,butnotidentical
))(7)whichissimilar,butnotidenticalto(2).Thismeansstandarderrorsandcon denceintervalsarenotidenticalwhencomparingoptionsBICandBIC+{evenifthecandidatemodelsarethesame(whichpracticallyneverhappens).However,inmostsituationsthecon denceintervalswillbeveryclose.{UsingBIC+andpassingontheoptionOR=Inf(tobic.glmorbic.surv)meanscon-sideringallcandidatemodels.CanbeofinterestwhenconsideringafullBayesianapproach.{ForBIC+,thevariableimportancemeasuredisplayedbymamiessentiallyreferstotheposteriore ectprobability(averagedovertheimputeddatasets)calculatedbyBMA;seeHoetingetal.(1999),amongothers,formoredetails.Thenumberofcandidatemodelsis2p�1iftheCoxmodelisttheanalysismodelofinterest.Thisisbecausethe\interceptonly"modelisnotevaluated,asthereisnonaturalintercept(estimationofthebaselinehazardistreatedasanuisanceparameter).3.1.2Mallow'sModelAveragingMallow'smodelaveraging(MMA)referstotheapproachdescribedbyHansen(2007).Thisestimatorisanexampleofoptimalmodelaveraging,whichmaybeofparticularinterestfromapredictivepointofview,ratherthanexplanatorypointofview.Itisimplementedinthefunctionsmma()andjma()3andcanbeusedwithinmami()whentheoptionmethod="MMA"ischosen.Itcanonlybeusedforthelinearmodel. 3Theoriginalimplementationinmmaisalmostidenticaltothemorerecentversioninjma,wherethelatterisarobustexpansionofBruceHansen's leathttp://www.ssc.wisc.edu/~bhansen/progs/ecnmt_07.htmlandallowsforanon-nestedmodelsetup(wherecomputationallyfeasible).However,theformerisabitmorestableandallowsvarianceestimationaccordingto(2)whenusingoptionvariance="BA".8 Hansenconsidersasituationofknestedlinearregressionmodelsforkvariables.Let^ =Pk=1w^ beamodelaveragingestimatorwith^w=Xk^ .BasedonsimilarthoughtsasintheconstructionofMallow'sCp,Hansensuggeststominimizethemeansquared(prediction)errorbyminimizingthefollowingcriterion:~Cp=(y�Xk^ )0(y�Xk^ )+22Kw;(8)whereKw=tr(Pw),Pw=Pk=1wP,P=X(X0X)�1X0,and2isthevariancewhichneedstobeestimatedfromthefullmodel.Consequently,theweightsarechosensuchthat~CpisminimizedwMMA=argminw2H~Cp;(9)withH=f(w1;:::;wk)2[0;1]k:Pk=1w=1g.Sinceth

10 e rstpartof(8)isquadraticinwand
e rstpartof(8)isquadraticinwandthesecondonelinear,onecanobtainthemodelaveragingestimatorbymeansofquadraticprogramming(i.e.thepackagequadprog).Theassumptionsofadiscreteweightsetandnestedregressionmodelssoundrestrictive,butithasbeenshownthatbothassumptionsarenotnecessarilyrequiredandMMAcanbeappliedtonon-nestedregressionmodelsaswell;giventhatthisiscomputationallyfeasible(Wanetal.,2010).Notethefollowing:TheMMAestimatorisbasedontheweights(9)ratherthan(1).Usingmamiwithmethod="MMA"usesthefunctionjma4.Bydefault,no standarderrorsandcon denceintervalsarecalculated.ThisisbecausethemotivationforMallow'smodelaveragingisprediction(SchomakerandHeumann,2019).Ifoneisinterestedinbootstrapstandarderrors(whichmaynotachievenominalcoverage),thentheoptionscalc.var="boot"andbsa(forthenumberofbootstrapsamples)canbepassedontojma.Forexample,inExample1inthemamihelp leonecouldspecify: mami(freetrade imp,method="MMA",outcome="tariff",add.factor=c("country"),calc.var="boot",bsa=200) Themmaandjmafunctionscanbecomparedeasily.Anexampleis data(Prostate)jma(y=Prostate[,9],x=Prostate[,-9],ma.method="MMA")mma(Prostate,formula=lpsa.,ycol="lpsa") IntheabovedescribednestedmodelsetuptherearekcandidatemodelsandtheMMAestimatesdependupontheorderingoftheseregressors.Thenon-nestedsetupallowsfortheevaluationall2kcandidatemodelsandisnota ectedbytheorderingoftheregressors;however,MMAismuchmoreunstableundersuchasetupandforlargekestimationmaynotbecomputationallyfeasible.Thedefaultinjmaistousethenestedsetup(model.subset="nested"),butitisalsopossibletouseallcandidatemodelsforweightcalculation(model.subset="nested"): jma(y=Prostate[,9],x=Prostate[,-9],ma.method="MMA",model.subset="all") Thefunctionmmaallowsonlyevaluationofthenestedsetup.Eveninthissimplersetting,matriceswhicharenotpositivede niteyieldtothefailureoftheoptimizationproblem. 4sinceversion0.9.139 Bothmmaandjmapreventthisbylookingfora\close"positivede nitematrixthatcanbeused,basedonmake.positive.definitefrompackagecorpcor,andawarningisprinted.3.1.3LassoAveragingShrinkageestimation,forexampleviatheLASSO(Tibshirani,1996),canbeusedformodelselection.Thisrequiresthechoiceofatuningparameterwhichcomeswithtuningparameters

11 electionuncertainty.LASSOaveragingestima
electionuncertainty.LASSOaveragingestimation(LAE),ormoregeneralshrinkageaveragingestimation(Schomaker,2012),isawaytocombineshrinkageestimatorswithdi erenttuningparameters.ThisisimplementedinMAMIinthelaefunctionandcanbeusedinmami()bycallingtheoptionmethod="LAE".ConsidertheLASSOestimatorforasimplelinearmodel:^ LE()=argmin8:nXi=1(yi� 0�pXj=1xij j)2+pXj=1j jj9=;:(10)Thecomplexityparameter0tunestheamountofshrinkageandistypicallyestimatedviathegeneralizedcrossvalidationcriterion(GCV)oranyothercrossvalidationcriterion(CV).Thelargerthevalueof,thegreatertheamountofshrinkagesincetheestimatedcoecientsareshrunktowardszero.Considerasequenceofcandidatetuningparameters=f1;:::;Lg.Ifeachestimator^ LE(i)obtainsaspeci cweightwi,thenaLASSOaveragingestimatortakestheform^ LAE=LXi=1wi^ LE(i)=w^BLE;(11)wherei2[0;c],c&#x]TJ ;� -1; .63; Td;&#x [00;0isasuitableconstant,^BLE=(^ LE(1);:::;^ LE(L))0istheLpmatrixoftheLASSOestimators,w=(w1;:::;wL)isan1Lweightvector,w2WandW=fw2[0;1]L:10w=1g.Onecouldchoosetheweights^wOCV=argminw2WOCVk(12)withOCVk=1 n~(w)0~(w)/wE0kEkw0;(13)referringtoanoptimalcrossvalidation(OCV)basedcriterionandEk=(~k(1);:::;~k(L))isthenLmatrixofthecross-validationresidualsfortheLcompetingtuningparameters(givenaspeci clossfunction).MoredetailscanbefoundinSchomaker(2012).UsingLAEwithmethod="LAE"hasthefollowingimplications:LAEcanbeusedforthelinearmodel,thelogisticmodelandthePoissonmodel{butnotforlongitudinalorsurvivaldata.Themodelaveragingestimatorisbasedontheweights(12)ratherthan(1).ThevarianceoftheLAEestimatoriscalculatedaccordingto(2),butthecandidatemodelsrefertothedi erentchoicesofthetuningparameter.ThevarianceofeachsingleLASSOestimator^ LE(i)isbasedonbootstrapping.Thedefaultis100bootstrapsamplesasnotforallproblemscon denceintervalsoftheLASSOareofinterest.TheoptionB.var(inlae,orcalledfrommami)canbeusedtoadjustthenumberofbootstrap10 samples.NotethattheLASSOestimatorisashrinkageestimatorwhichtradesdecr

12 easedvariancewithincreasedbias.Therefore
easedvariancewithincreasedbias.Thereforesimplebootstrapbasedcon denceintervalswillnotachievenominalcoverage(awarningisprinted),andusingitformodelaveraging(anothershrinkageestimator)willnotchangethis.TheLASSOimplementationcv.glmnetfrompackageglmnetisused.Therefore,anyargumentscanpassedtothisfunction:forexamplealpha(tocalltheElasticNetorRidgeestimator[seeExample4,?mami];orthesequence(nlambda).Thekofk-foldcross-validation(forthede nitionofOCVk)canbespeci edwiththeoptionkfold[bothinlaeandmami].Thedefaultsplitofthedataforcrossvalidationisbasedonthesequence1;2;:::;k;1;2;:::.Theoptioncvr=Trandomlyshuesthesplit.Thedefaultcvr=Fismeanttomakeresultsreproducibleandcomparablewhenusingbothmethod="LAE"andmethod="LASSO".NotethatthedevianceisusedasthelossfunctiontocalculatethecrossvalidationresidualsforlogisiticandPoissonregressionmodels.However,theweightsforLassoaveraginghavetobedeterminedbasedonasquaredlossfunction.Example3under?mamiaswellastheexamplesunder?laegiveexamplesoftheuseofLAE.3.1.4JackknifeModelAveragingJackknifeModelAveraging(JMA),assuggestedinHansenandRacine(2012)forlinearmodels,hasbeenimplementedinthefunctionjmaandbuildsonleave-one-out(LOO)crossvalidation.JMAisimplementedinMAMIinthejmafunctionandcanbeusedinmami()bycallingtheoptionmethod="JMA".JMAworksasfollows:forModelMtheLOOresidualvectoris~=y�^y,with^yi=xi(X0(�i)X(�i))�1X0(�i)y(�i)wheretheindex(�i)describesthattherespectivematrixexcludesobservationi,i=1;:::;n.ItcanbeshownthatthereisasimplealgebraicrelationshipwhichallowsthecomputationoftheLOOresidualsinoneratherthannoperations:~=D^(14)where^isthestandardleastsquaresresidualvectory�PywiththehatmatrixP=X(X0X)�1X0;andDisanndiagonalmatrixwithDii;=(1�Pii;)�1,i=1;:::;n.Forkcandidatemodels,whichmaybenestedornot(asintheMMA-setupdescribedinSection3.1.2),thelinearweightedLOOresidualsare~w=Pw~,=1;:::;k.AnestimateofthetrueexpectedsquarederrorisCVw=n�1~0w~wandanappropriateweightchoicewouldthusbewJMA=argminw2HCVw;(15)AswithMMA,theweightscanbeobtainedwithquadraticp

13 rogramming.Similarconsid-erationsaswithM
rogramming.Similarconsid-erationsaswithMMAapply:TheJMAestimatorisbasedontheweights(15)ratherthan(1).Usingmamiwithmethod="JMA"usesthefunctionjma.Bydefault,no standarderrorsandcon denceintervalsarecalculated.ThisisbecausethemotivationforJackknifemodelaveragingisprediction(SchomakerandHeumann,2019).Ifoneisinterestedinbootstrapstandarderrors(whichmaynotachievenominalcoverage),thentheoptionscalc.var="boot"andbsa(forthenumberofbootstrapsamples)canbepassedontojma(seeexampleinSection3.1.2).11 InthenestedmodelsetuptherearekcandidatemodelsandtheJMAestimatesdependupontheorderingoftheseregressors.Thenon-nestedsetupallowsfortheevaluationall2kcandidatemodelsandisnota ectedbytheorderingoftheregressors;however,JMAismuchmoreunstableundersuchasetupandforlargek(say�20)estimationmaynotbecomputationallyfeasible.Thedefaultinjmaistousethenestedsetup(model.subset="nested"),butitisalsopossibletouseallcandidatemodelsforweightcalculation(model.subset="nested"),seeexampleinSection3.1.2.3.2ModelSelectionModelselectionmeansassigningaweightof1tothemodelwhichisoptimalwithrespecttoaspeci ccriterion.Inmanysituationsitislikelythatthereismodelselectionuncertaintyinthesensethatdi erentsampleswouldleadtodi erentmodelchoices:sometimesavariableisincluded,andsometimesnot.Sometimes,thiscanmakethedistributionofparameterestimatesbimodal(SchomakerandHeumann,2014){andbootstrapbasedcon denceintervals(Section4)togetherwithgraphicalsummaries(Section5)canthereforebeagoodchoicewhenapplyingmodelselection.3.2.1CriterionBasedModelSelectionMAMIo ersmodelselectionbasedonthefollowingoptions:method="MS.criterion"[orsometimesmethod="MA.criterion"]andcriterion="AIC"orcriterion="BIC"orcriterion="GCV"orcriterion="CV".criterion="AIC"choosesthemodelwiththesmallestAICbasedonastepwisesearchwithstepAICfromMASS.Notethatthenumberofcandidatemodelsisthereforenot2paswithmethod="MA.criterion"[wheremodelselectionresultsaredisplayedinad-ditiontomodelaveragingresults].Thismeansthatmodelselectionresultscanpo-tentiallydi erbetweenthetwoapproaches.Thelatterispreferredifcomputationallyfeasible.Forlongitudinaldata,AICbasedmodelselectioncanonlybeutilizedwithmethod="MA.criterion"andnotwithth

14 equickermethod="MS.criterion".crite
equickermethod="MS.criterion".criterion="BIC"choosesthemodelwiththesmallestBICbasedonastepwisesearchwithstepAICfromMASS.Thesameconsiderationsasaboveapplyhereaswell,i.e.withrespecttolongitudinaldataandpotentiallydi eringresultstothosedisplayedaftermodelaveraging.ForbothAICandBICtheconsiderationsfromSection3.1.1apply.Forexample,thecritiqueofthesecriteriainsurvivalanalysisandmixedmodels.criterion="CV"togetherwithkfoldusesthek-foldcrossvalidationerrorformodelselectionbasedonasquaredlossfunction.ThisapproachisimplementedusingdredgeinMuMIn,andthereforeall2pmodelsarebeingevaluated.Thismaybetime-consumingandcriterion="GCV"(seebelow)isanalternative.Thedefaultsplitofthedataforcrossvalidationisbasedonthesequence1;2;:::;k;1;2;:::.Theoptioncvr=Trandomlyshuesthesplit.Crossvalidationcannotbeusedforsurvivalmodels.Itcanbeusedforlongitudinaldata,i.e.inthecontextofmixedmodels,butoneneedstobeawarethatthecrossvalidationpredictionsareusingtheaverageinterceptwithineachsubject.criterion="GCV"usesgeneralizedcrossvalidation,i.e.anapproximationtoleave-one-outcross-validation,formodelselectionandismuchquickerthancriterion="CV".Itcan'tbeusedforsurvivaldata,butforlongitudinaldata.Again,theimplementationutilizesdredgefromMuMInandthusall2pcandidatemodelsarebeingevaluated.12 3.2.2LASSOandShrinkageBasedModelSelectionModelselectioncanbedonewiththeLASSOestimatorasintroducedin(10).Thiscanbeachievedwiththefollowingoption:method="LASSO"andkfoldAsopposedtoLASSOaveraging,LASSObasedmodelselectioncanbeusedfornotonlythelinear,logisticandPoissonmodel,butalsofortheCoxproportionalhazardsmodel.Theimplementationisbasedoncv.glmnetfrompackageglmnet.EssentiallythesameconsiderationsasinSection3.1.3apply.Brie y:ThevarianceofeachsingleLASSOestimatorisbasedonbootstrapping,withadefaultofmax(B;100)bootstrapsamples.Thiscanbeloweredinfuturereleases,asthestandarderrormaynotnecessarilybeofinterest.Onemightarguethatcon denceintervalsfortheLASSOestimatorarenotmeaningful.Forsure,theyarenotcorrectedforthebiasintroducedbyshrinkageandwon'tachievenominalcoverage.Thedataissplitintoasequenceof1;2;:::;k;1;2;:::andthetuningparameterwhichminimizesthek-foldcrossvalidationerror(chosenviakfold)is

15 beingused.Theoptioncvr=Trandomlyshu
beingused.Theoptioncvr=Trandomlyshuesthesplit.Thelossfunctiontocalculatethecrossvalidationerroristhesquarederrorlossforlinearmodels,thedevianceforlogisticandPoissonmodels,andthepartiallikelihoodfortheCoxmodel.TousegeneralElasticNettypemodelselectionpassonalphatocv.glmnet,e.g.alpha=0.5.4InferenceIngeneral,con denceintervalsbasedon(6)arecalculated(ortheyarebasedontheverysimilar(7)ifcriterion="BIC+")5.SchomakerandHeumann(2014,Table3)showthattheseintervalscanworkquitewellforcriterionbasedmodelaveragingestimators;however,theywillunderestimatethevarianceformodelselectionestimators.Bootstrapcon denceinter-valscanhelptoimprovethecoverageofmodelselectionestimators.Moreover,distributionspostmodelselectionandpostmodelaveragingareoftennon-symmetric(HjortandClaeskens,2003,SchomakerandHeumann,2014,Fig.2-4)whichisanothermotivationforbootstrap-ping.Whenapplyingbootstrappingonehasofcoursetoimputethedataineachbootstrapsample.Bootstrapmodelselection/averagingcon denceintervals,afterimputation,canthusbegeneratedasfollows: 1)CreateBbootstrapsamplesoftheoriginaldata(includingmissingobservations)2)GenerateMimputedsetsofdataforeachbootstrapsample3)Ineachbootstrapsamplecalculateamodelaveraging(selection)estimatoroftheregressionparametersusingequation(5){basedontheapplicationofaparticularmodelaveraging/selectionschemeonthemultiplyimputeddata4)Construct1� con denceintervalsbasedonthe =2and1� =2percentilesoftheempiricaldistributionoftheBpointestimatesproducedinstep3 5Ifmethod="MMA"ormethod="JMA"nocon denceintervalsarecalculatedbydefault;seeSections3.1.2and3.1.4ondetailshowtocalculatecon denceintervalswiththesemethods.Notethatalsoforthesemethods(additional)bootstrapcon denceintervalsasdiscussedinthisSectioncan,inprinciple,becalculated(thoughtheymaynotalwaysbemeaningfulasMMAandJMAarepurepredictionmethods).13 Iftheoptioninference="+boot"ischosen,con denceintervalsaccordingtotheabovealgorithmarebeinggeneratedinadditiontothestandardcon denceintervalsfrom(6).Ifcomputationallyfeasible,itisrecommendedtoimplementthisapproachandplottheresults(seealsoSection5.1).Theoptioninference="+boot"needstobecomplementedwiththeoriginalunimputedd

16 ata(optionX.org)[becauseofstep2]andthenu
ata(optionX.org)[becauseofstep2]andthenumberofbootstrapsamplestobedrawn(optionB).Thepointestimatesreportedfrommamiarestillthepointestimatesaccordingto(5),andnotthearithmeticmeanofthebootstrapsamplesassuggestedbyTable1inSchomakerandHeumann(2014),thoughthelatterarereportedbyprint.mamiaswell.Bydefaulta95%con denceintervalisreported,butthiscanbechangedwiththeoptionCI.Con denceintervalsarebasedonatRdistributionasexplainedinSection1.3.2.IfM=1,forexampleundernomissingdataoracompletecaseanalysis,Risassumedtobein nityundermodelaveragingandthusastandardnormaldistributionisused.Formodelselection,anappropriatet-distributionisused,ifmeaningfulfortherespectivemodelclass.MoreaboutcombiningbootstrappingandmultipleimputationcanbefoundinSchomakerandHeumann(2018).5AnalysisandInterpretation5.1ExampleForareasonablyrealisticexampleonecouldlookatthehypotheticalHIVdatasetprovidedbyMAMI.ItcontainstypicalvariablesofHIVtreatmentresearchsuchasfollow-uptime(futime),eventofdeath(dead),baselinevariablessuchasCD4count(cd4),WHOstage(stage),weight(weight),andmanyothers.Thedatalooksasfollows:patienthospitalfutimedeadsexagecd4cd4slope6weightperiodhaemstagetbcm135800391281.888765200413.2184200231717113877-6.5082NA200110.726130NA331941013412412.0466622001NA3NA04351201221478.9213632004NA2005376601281874.6059NA200412.707630NA6322420136108.8271NA20019.48033NANASincedataofCD4count,haemoglobin,WHOstage,tuberculosisandcryptococcalmenin-gitis(allatbaseline)aremissing,onecouldimputethedataunderamissingatrandomassumption.ThiscouldbedonewithAmeliaIIinR,seeHonakeretal.(2011)fordetails. library(MAMI)library(Amelia)data(HIV)HIV impamelia(HIV,m=5,idvars="patient",logs=c("futime","cd4"),noms=c("hospital","sex","dead","tb","cm"),ords=c("period","stage"),bounds=matrix(c(3,7,9,11,0,0,0,0,3000,5000,200,150),ncol=3,nrow=4)) ToselectamodelwithAIConthemultiplyimputeddata,wecouldsimplypassittomami.Pickingtheoptionsmodel="cox"andoutcome=c("futime","dead")makesitclearthatweareinterestedinriskfactorsforthehazardofdeath,modeledbyaCoxproportionalhazardsmodel.Withmethod="MS.criterion"andcriterion="AIC"wemakeclearthat14 weareinterestedinmodelselectionwithAIC.Toensurethatthecategoricalvariablesaretreated

17 assuchweuseadd.factor.Now,wemayalreadykn
assuchweuseadd.factor.Now,wemayalreadyknowthatthein uenceofCD4countonmortalityistypicallysquared,andthathaemoglobinlevelsmeandi erentthingsformenandwomen,andthereforeaddtherespectivetransformationsandinteractionswithadd.transformationandadd.interaction.Twovariables(patientidenti er,andaverageCD4slope)maynotbeofinterestandberemoved.Wewanttheresultstobereportedashazardratios(ratherthancoecients),andthususereport.exp=T.Thecodelooksasfollows. #Modelselectionafterimputationmami(HIV imp,model="cox",outcome=c("futime","dead"),method="MS.criterion",criterion="AIC",add.factor=c("period","hospital","stage"),add.transformation=c("cd4^2"),add.interaction=list(c("haem","sex")),report.exp=TRUE,var.remove=c("patient","cd4slope6")) TogetasensehowresultswouldvaryifweselectedthemodelwithLASSOratherthanAIC,wesimplyreplacemethod="MS.criterion"withmethod="LASSO". #ModelselectionafterimputationwithLASSOmami(HIV imp,model="cox",outcome=c("futime","dead"),method="LASSO",add.factor=c("period","hospital","stage"),add.transformation=c("cd4^2"),add.interaction=list(c("haem","sex")),report.exp=TRUE,var.remove=c("patient","cd4slope6")) Bothmethodssuggesttopickallvariables,butatthesametimecon denceintervalsforsomevariablesareverywide.Let'ssayweareinterestedinmodelselectionuncertainty,there-foreusemodelaveraging,butwiththemoreparsimoniousBICinstead.WemaypickBIC+forareducedsetofcandidatemodelsandthereforequickerresults. #Modelaveragingafterimputationmami(HIV imp,model="cox",outcome=c("futime","dead"),method="MA.criterion",criterion="BIC+",add.factor=c("period","hospital","stage"),add.transformation=c("cd4^2"),add.interaction=list(c("haem","sex")),report.exp=TRUE,var.remove=c("patient","cd4slope6")) Theoutputlooksasfollows:Estimatesformodelaveraging:EstimateLowerCIUpperCIfactor.hospital..4------factor.hospital..5------sex0.9148770.758951.102839age1.0240.9774671.072749cd40.9980270.9941041.001965wt0.9749760.9272721.025134factor.period..20040.9992130.9936141.004844factor.period..20070.9997640.9980831.001448haem0.8907710.7072111.121974factor.stage..31.4250540.6511793.118619factor.stage..42.409720.42005713.823716tb1.4459790.6000623.484396cm1.0112040.9577091.067687I.cd4.2.10.9999991.000002s

18 ex.haem0.9888290.9644191.013856Estimates
ex.haem0.9888290.9644191.013856Estimatesformodelselection:15 EstimateLowerCIUpperCIfactor.hospital..4------factor.hospital..5------sex0.948120.6135471.46514age1.0238150.9775241.072298cd40.998020.9941241.001931wt0.975070.9275271.025051factor.period..2004------factor.period..2007------haem0.8926260.7104381.121534factor.stage..31.4140490.6413433.117733factor.stage..42.3854340.42387413.424494tb1.443050.5163334.033044cm------I.cd4.2.------sex.haem0.9840110.9259891.045668Posterioreffectprobabilities:agefactor.stage.wthaemcd4tbsex.haem1.001.001.001.000.900.820.42sexI.cd4.2.cmfactor.period.factor.hospital.0.350.140.030.000.00Onecanseefromtheresultsofbothmodelselectionandaveragingthatthevariablehospitalisnotpickedandhasavariableimportanceof0.Someofthevariableswhichhadwidercon denceintervalswithAIC,arenownotbeingselectedanymorebyBIC(e.g.cryp-tococcalmeningitisandCD42).Thevariableimportance(i.e.posteriore ectprobabilities)suggestthatthereissomeuncertaintyaroundvariablessuchastheaddedinteraction,orsex.So,whattodo?Wecouldrepeattheanalysiswithbootstrapcon denceintervalstogetabettersenseofthemodelselectionandmodelaveragingresults.Thereforeweusetheoptionsinference="+boot",B=200,X.org=HIV.Weknowthatbiologicallyitmakessensethatco-infectionsarerelevanttopredictmortalitybutthattheassociationsinthedataaremaybenotstrongenoughtoimmediatelypickthem.Thus,wecouldopttodomodelaveragingwithAICratherthanBIC,becausethelatterhasamoreparsimoniousapproach.KnowingthatmamiutilizesdredgefromMuMIn,wecouldpassontheoptionsubsettospecifythatcandidatemodelsshouldcontainsquaredCD4countonlyiflinearCD4countiscontained,andthattheinteractionofsexandhaemoglobinshouldonlybecontainedinthemodelswherethemaine ectiscontainedtoo.Thiscanbeutilizedwithdependencychains(dc),see?dredgeforhelp.Then,thesyntaxlooksasfollows: #Modelaveragingafterimputation+bootstrappingm1mami(HIV imp,model="cox",outcome=c("futime","dead"),method="MA.criterion",criterion="AIC",add.factor=c("period","hospital","stage"),add.transformation=c("cd4^2"),add.interaction=list(c("haem","sex")),report.exp=TRUE,var.remove=c("patient","cd4slope6"),inference="+boot",B=200,X.org=HIV,subset=dc("cd4","I(cd4^2)")&&dc("sex","sex:haem")&&dc("haem","s

19 ex:haem"),print.time=TRUE)summary(m1)plo
ex:haem"),print.time=TRUE)summary(m1)plot(m1) 16 �summary(m1)...Estimatesformodelselection(basedon200bootstrapsamples):EstimateLCIUCIBootLCIBootUCIVIage1.0239070.9775611.0724491.0116441.0342061cd40.9974170.9918871.0029790.9949140.9993470.99cm1.0784470.5818981.9987140.7969851.8816470.36factor(hospital)4------0.6784801.0000000.23factor(hospital)5------0.7833001.143629--factor(period)20040.7606420.4444891.3016630.5861181.0000000.7factor(period)20070.9085180.7408731.1140980.5451001.424140--factor(stage)31.3824790.6476522.9510410.9614861.9617631factor(stage)42.2860760.44072911.8579591.5202323.193898--haem0.8794140.6809121.1357840.8303030.9531941haem:sex------0.8987161.0164000.31I(cd4^2)1.0000020.9999941.0000091.0000001.0000050.58sex0.7635310.4489491.2985430.5915032.2689330.94tb1.5149130.6365573.6052741.0440471.9030040.94wt0.9754240.928471.0247530.9655030.9859131Theresultswegotanowmuchmorenuanced:estimatesofcoecientsfromvariableswewereunsureabout,havenon-normaldistributions.Forexample,the95%bootstrapcon denceintervalofthehazardratioofcryptococcalmeningitisisbimodal,seealsoFigures1aand1b[whichareprovidedbyplot(m1)].ComprehensivemodelselectionandaveragingwithAICsuggestsamoderatee ectofvariablessuchasthetransformation,butnomajorroleoftheinteraction.Thiswouldhavepotentiallyremainedundiscoveredwithoutbootstrapping.Figure1showsthatfromanexplorativeperspectivepost-modelaveraging/selectionplotscanhelpustogetusabettersenseofwhichvariablesarepotentiallyrelevantandwhichnot.Giventhevarietyofoptionshowtoperformandreportmodelselectionandaveragingresults,wegivesomeguidanceinthenextsectiononhowtoreportresults.5.2SuggestedReportinginPublicationsBydefault,print.mamilistsabigvarietyofresultsandallofthemcouldbereported.How-ever,amoreconciseapproachmaybebetterunderstood. Sincemodelaveragingisnotalwayswell-known,asopposedtomodelselection,agoodoptioncouldbetoreporti)theresultsofthefullmodel(withoutmodelselection,asestimatesareconsistent),ii)theresultsafterthepreferredmodelselectionprocedurehasbeenapplied{togetherwithbootstrapcon denceintervals(crucial,suchthattheintervalsre ectmodelselectionuncertainty),andiii)thevariableimportancemeasurerelatedtotheweightsofmodelaveraging

20 (togetasenseofmodelselectionuncertainty)
(togetasenseofmodelselectionuncertainty).AtemplateisgiveninTable1(usesummary()). Forexample,theresultsreportedinPorteretal.(2015){seeFigure2{areverysimilartothesuggestedtemplate,exceptthatcon denceintervalsarenotbasedonbootstrappingandunivariateresultsarereportedaswell.Karamchandetal.(2016)donotreportresultsfrommodelaveraging,e.g.viavariableimportancemeasures,butareotherwisesimilarintheirapproachonhowtoreportresultsofmodelselectionafterimputation;seeFigure3.AnothergoodexampleisthetablefromVisseretal.(2012)[Figure4],whereBayesianposteriore ectprobabilities(basedonBayesianModelAveragingafterimputation)aregiven,butnoresultsofmodelselection.17 (a)cryptococcalmeningitis (b)cryptococcalmeningitis (c)squaredCD4 (d)sex:haemoglobin(interaction)Figure1:Bootstrapdistribution,aftermodelselectionandmodelaveragingTable1:Atemplatetoreportresultsfrommami(usesummary()) FullModel ModelSelection Variable 95%CI 95%CI VI Variable1 Variable2 Variable3 Variable4 Variable5 ... ...... ...... ... yFootnotesonscreening,modelassumptionsetc.18 Figure2:Table3fromPorteretal.(2015) Figure3:Table3fromKaramchandetal.(2016)19 Figure4:Table2fromVisseretal.(2012)6Miscellaneous6.1ComputationTimeInmanysettings,thecomputationtimecanbelong;particularlywhenperformingmodelaveraging.Whenapplyingbootstrappingtoobtaincon denceintervals,estimationcantakeevenlonger.Todecreasecomputationtimeseveraloptionsarepossible.eachoftheseoptionseitherdecreasesthenumberofcandidatemodelsand/orparallelizethecomputations.(i)Ifmodelaveragingormodelselectionisbasedonacriterion,andmamiaccessesdredgefrompackageMuMInasdescribedinSection3.1.1andSection3.2.1,thenrestrictionsonthenumberofcandidatemodelscanbepassedontodredge.Type?dredgetolearnmore,andpossiblyseeExample4in?mamiandthelastexampleinSection5.1.(ii)Forcriterionbasedmodelaveraging(orselectionwithcrossvalidation)asdescribedin(i),theoptioncandidate.modelscanbeusedaswell.Thisisasimplewrappertoreducethemaximumamountofvariablesinthecandidatemodelstohalfofallvariables(candidate.models="restricted")orafourthofallvariables(candidate.models="veryrestricted").Shouldbeusedonlyifoneissurethatthisisappropriate.(iii)Asindicatedabove,theoptioncriterion="BIC+"utilizesBayesi

21 anModelAveragingbycallingthepackageBMA.T
anModelAveragingbycallingthepackageBMA.Thisimplementationdoesnotevaluateallcandidatemod-els,butusesabranch-and-boundalgorithmtoreducethenumberofcandidatemodels.Thenumberofcandidatemodels(ineachimputeddataset)isprintedbymami.Thus,criterion="BIC+"isaviablealternativetocriterion="BIC"ifthenumberofcandidatemodelsislarge.(iv)Similarly,whenusingcriterion="BIC+"andthereforeaccessingbic.survorbic.glmfromBMA,onecanadjustOccam'sWindow(Hoetingetal.,1999)usingtheoptionOR,i.e.basedontheratiobetweentheposteriormodelprobabilitiesofthe\best"modelandthecandidatemodelunderconsideration,modelswithratiosgreaterthanCarerejected.20 SincewithinBMA'sbranch-and-boundalgorithmrejectionofamodelmeansalsothatnestedsubmodelscanberejected,thisleadstoreducedcomplexity.ThedefaultofCis20,andareductiondownto5canpossiblybeo.k.incomplexsettings.SeeHoetingetal.(1999,Section3)formoredetails,andExample4from?mamiforapracticalexample.(v)Foraverylargeamountofvariablesapre-screeningstepcanbeanoption.Thisimpliesthatthemodelselection(averaging)estimatoristhenconditionalonthispre-screeningstep.AnecientwaytoscreenvariablesistousetheLASSOestimatorasintroducedin(10).Thisisimplementedinthepackagebytheoptionscreen.Onehastospecifytheamountofvariablesthatshouldbeexcludedbeforemodelselection/averagingafterim-putationisutilized.Then,basedontheLASSOpathanappropriateamountofvariablesareexcludedandmamireportswhichone.Screeningisdoneonthe rstimputeddataset.mamidealsautomaticallywiththeconsequencesrelatedtosuggestedinteractions,transformationsetc.Screeningworksforallmodelclasses,butonlyforcross-sectionaldata.Screeningmightnotworkinconjunctionwithcomplexsub-options,forexamplewhenaccessingsubsetindredge.Screeningisrecommendediftheamountofvariablesisverylarge.Forexample,whendealingwith100variables,andhavingtheknowledgethatonlyasmallsubsetislikelyrelevant,onecouldscreenaway60variablesandthenusecriterion="BIC+"afterwards.Ifscreeningisutilized,itissuggestedtoreporttheexcludedvariablesinafootnote.(vi)Thebestoptiontosavetimeisparallelizationusingtheoptionncores.Onehastosimplysupplythenumberofavailablecores(threads)andmamiautomaticallyparallelizespartsofthecodetospeedupcalculations.Notethatthi

22 s(experimental)optionsisstillinitsinfanc
s(experimental)optionsisstillinitsinfancy.Ithasbeentestedheavily,butmostlyonWindowsmachines.Pleasereportanybugs.Therewillbenospeedupforverysimpleproblems,orthecalculationmayeventakelonger.However,youcanexpectaconsiderablespeedupifi)youhavemultiplyimputeddatasets(i.e.M��1)and/orii)youareusingoptioninference="+boot"forbootstrapcon denceintervals(exceptinconjunctionwithmice)oriii)M=1andmodelaveragingiscriterionbased("MA.criterion")andverycomplex(manymodelsand/orcomplexmodels).IntheexamplefromSection5.1,utilizing7coresonaWindow'smachineimprovedthecomputationtimebyafactorof4:2.IfaCoxmodelis titmaybenecessarytoloadlibrary(survival) rst;ifboot-strappingisuseditmaybeneededtoloadlibrary(boot) rst;andifM=1(com-pletecases,nomissingdata,oneimputation),andalsomethod="MA.criterion"ormethod="MS.criterion"andcriterionis"CV"or"GCV",itisneededto rstloadlibrary(snow).Thesameappliesinthecontextofmixedmodels.mamiwillnotifytheuserifaneededpackagehasnotbeenloaded.Currentlytheclusterforparallelizationissetup(andstopped)automatically,mostlyforconvenience.Infuturereleasestherewillbemore exibleoptionsforadvancedusersanddi erentcomputingenvironments.set.seed()willnotworkwithparallelization.Thus,bootstrapcon denceintervalsarenotfullyreproducible.Therearewaysaroundthis,andguidancewillbegiveninfuture.Usingparallelizationmeansthatlesswarningmessagesareprintedfrommami.21 6.2LimitationsoftheApproachModelSelectionandmodelaveragingestimatorsaregenerallybiased.Improvingthecoverageprobabilitywiththeapproachimplementedinmamidoesn'talterthisconclu-sion.Ifthenumberofvariablesisreasonable, ttingthefullmodelinadditiontotheselected/averagedmodelisrecommended.Theimplementationinmamiismeanttobeuseful,stableand exible.However,forextremelycomplexsituations,ormorecomplexmodels,mamidoesnotwork.Here,amanualimplementationisneeded.Re-samplinghaslimitations,isnotalwaysvalid,andthisappliestomodelaveragingtoo.GoodreferencesareLeebandPotscher(2005,2006,2008)andPotscher(2006).Therearesimplepragmaticalternativestooursuggestedapproach:forexampleselectingonlyvariableswhichareselectedineachimputeddataset;orstackingtheimputeddata,see

23 Woodetal.(2008).6.3OptimalModelAveraging
Woodetal.(2008).6.3OptimalModelAveragingforPrediction:SuperLearningThemotivationforoptimalmodelaveraging,i.e.Mallow'sModelAveraging,JackknifeModelAveraging,andShrinkageAveragingEstimation,isessentiallyprediction.Forthisreasonpredictmethodsareavailable:predict.mma,predict.jma,predict.lae.Forexample: data(Prostate)m1mma(Prostate,lpsa.,ycol='lpsa')predict(m1)predict(m1,newdata=Prostate[1,]) Dependingonthespeci cproblem,optimalmodelaveragingmaybeagoodpredictionalgorithmornot.Tochooseandcombinethebestpredictionmethods,superlearningcanbeused.Superlearningmeansconsideringasetofpredictionalgorithms,forexampleregressionmodels,shrinkageestimatorsorregressiontrees.Insteadofchoosingthealgorithmwiththesmallestcrossvalidationerror,superlearningchoosesaweightedcombinationofdi erentalgorithms,thatistheweightedcombinationwhichminimizesthecrossvalidationerror.Itcanbeshownthatthisweightedcombinationwillperformatleastasgoodasthebestalgorithm,ifnotbetter(VanderLaanetal.,2008).Onemayinterpretthisprocedureasmodelaveraginginabroadersense.TheinterestedreaderisreferredtoVanderLaanandPetersen(2007)andVanderLaanandRose(2011),andthereferencestherein,formoredetails.Brie y,MAMIcontainsseveralwrappersthatcanbeusedforsuperlearning.Theyarelistedandexplainedbytyping: listSLWrappers() NotethatthepackageSuperLearner(Polleyetal.,2017)isrequired.Asimpleexamplefrom?listSLWrapperswouldbe:22 library(SuperLearner)#needstobeinstalledSL.libraryc('SL.glm','SL.stepAIC','SL.mean','SL.mma.int','SL.jma')SL.library2c('SL.glm','SL.stepAIC','SL.mean','SL.lae2')data(Prostate)P1SuperLearner(Y=Prostate[,9],X=Prostate[,-9],SL.library=SL.library,verbose=T)P2SuperLearner(Y=Prostate[,5],X=Prostate[,-5],family='binomial',SL.library=SL.library2,verbose=T)P2$coefP2$SL.predict Intheaboveexamplebothacontinuousoutcome(P1)andabinaryoutcome(P2)ispre-dicted{usinggeneralizedlinearmodels,thearithmeticmean,modelsselectedbyAIC,LASSOaveragingincludingsquaredvarriables(`SL.lae2'),amongothers.Toseetheweighteachalgo-rithmcontributestothepredictiontypeP2$coef.ThepredictionitselfisP2$SL.predict.Superlearningisoftenusedforestimationofcausalestimandswithtargetedmaximumlikelihoodestimation.Theimplementedwrapperscanbeused,andhavebeentes

24 ted,withpackagetmle(GruberandvanderLaan,
ted,withpackagetmle(GruberandvanderLaan,2012)andltmle(Lendleetal.,2017).6.4MiscellaneousOtheroptionsinmami.report.exp:stronglyrecommendedtosetreport.exp=Twhen ttingalogisticmodel,aPoissonmodelorCoxmodeltoobtaintheoddsratio,incidencerateratioandhazardratiorespectively.Canalsobeofinterestwhentheoutcomeofalinearmodelislog-transformedandthusalog-linearmodelisinterpreted.print.warnings:ifsetasTRUE,mamiprintsnotonlywarningsbutalsoplentyofinforma-tionontheprogressofthe ttingprocedureandonimpliedassumptions.ThedefaultisTRUEtobeastransparentaspossible,butmaybesetasFALSEinsimulations.print.time:primarilyintendedtoforecastthetimewheninference="+boot".Simplymultipliesthetimeofthemodelselection/averagingprocedureontheoriginaldatatimestheintendedbootstrapruns.23 IndexBMA,7bic.glmOR,20MAMIjma,8,11bsa,9,11calc.var,9,11model.setup,9,12lae,10B.var,10mami,5B,13,14CI,14X.org,14add.factor,7add.interaction,7add.stratum,7add.transformation,7aweights,7candidate.models,20criterion,7cvr,11{13id,6inference,14kfold,11,13method,7,8,10,11,13model,6outcome,7print.time,23print.warnings,23report.exp,23screen,6,21var.remove,6mma,8plot.mami,17predict,22print.mami,14,17summary.mami,16,17MASSstepAIC,12MuMIn,7dredge,12,20dc,16subset,16corpcor,10glmnetcv.glmnet,11alpha,11,13nlambda,11lme4,8quadprog,9analysismodel,6bootstrapping,10,13citation,2crossvalidation,10imputationmodel,5installation,2interpretation,14JackknifeModelAveraging,11LASSO,10modelaveraging,7criterionbased,7JMA,11LASSO,10MMA,8modelselection,12criterionbased,12LASSO,13multipleimputation,3ncores,21parallelization,21superlearner,22TMLE,2324 ReferencesBarton,K.(2017).MuMIn:Multi-ModelInference.Rpackageversion1.16.6/r405.Buckland,S.T.,K.P.Burnham,andN.H.Augustin(1997).Modelselection:anintegralpartofinference.Biometrics53,603{618.Burnham,K.andD.Anderson(2002).Modelselectionandmultimodelinference.Apracticalinformation-theoreticapproach.Springer,NewYork.Chat eld,C.(1995).Modeluncertainty,dataminingandstatisticalinference.JournaloftheRoyalStatisticalSocietyA158,419{466.Draper,D.(1995).Assessmentandpropagationofmodeluncertainty.JournaloftheRoyalStatisticalSocietyB57,45{97.Gruber,S.andM.J.vanderLaan(2012).tmle:AnRpackagefortargetedmaximumli

25 kelihoodestimation.JournalofStatisticalS
kelihoodestimation.JournalofStatisticalSoftware51(13),1{35.Hansen,B.E.(2007).Leastsquaresmodelaveraging.Econometrica75,1175{1189.Hansen,B.E.andJ.Racine(2012).Jackknifemodelaveraging.JournalofEconometrics,167,38{46.Hjort,L.andG.Claeskens(2003).Frequentistmodelaverageestimators.JournaloftheAmericanStatisticalAssociation98,879{945.Hoeting,J.A.,D.Madigan,A.E.Raftery,andC.T.Volinsky(1999).Bayesianmodelaveraging:atutorial.StatisticalScience14,382{417.Honaker,J.andG.King(2010).Whattodoaboutmissingvaluesintimeseriescross-sectiondata.AmericanJournalofPoliticalScience54,561{581.Honaker,J.,G.King,andM.Blackwell(2011).AmeliaII:Aprogramformissingdata.JournalofStatisticalSoftware45(7),1{47.Karamchand,S.,R.Leisegang,M.Schomaker,G.Maartens,L.Walters,M.Hislop,J.A.Dave,N.S.Levitt,andK.Cohen(2016).Riskfactorsforincidentdiabetesinacohorttaking rst-linenonnucleosidereversetranscriptaseinhibitor-basedantiretroviraltherapy.Medicine(Baltimore)95(9),e2844.Leeb,H.andB.M.Potscher(2005).Modelselectionandinference:factsand ction.EconometricTheory21,21{59.Leeb,H.andB.M.Potscher(2006).Canoneestimatetheconditionaldistributionofpost-model-selectionestimators?AnnalsofStatistics34,2554{2591.Leeb,H.andB.M.Potscher(2008).Canoneestimatetheunconditionaldistributionofpost-model-selectionestimators?EconometricTheory24,338{376.Lendle,S.D.,J.Schwab,M.L.Petersen,andM.J.vanderLaan(2017).ltmle:AnRpackageimplementingtargetedminimumloss-basedestimationforlongitudinaldata.JournalofStatisticalSoftware81(1),1{21.Lipsitz,S.,M.Parzen,andL.Zhao(2002).Adegrees-of-freedomapproximationinmultipleimputation.JournalofStatisticalComputationandSimulation72,309{318.Polley,E.,E.LeDell,C.Kennedy,andM.vanderLaan(2017).SuperLearner:SuperLearnerPrediction.Rpackageversion2.0-22.Porter,M.,M.A.Davies,M.K.Mapani,H.Rabie,S.Phiri,J.Nuttall,L.Fairlie,K.G.Technau,K.Stin-son,R.Wood,M.Wellington,A.D.Haas,J.Giddy,F.Tanser,andB.Eley(2015).Outcomesofinfantsstartingantiretroviraltherapyinsouthernafrica,2004-2012.JournalofAcquiredImmuneDe ciencySyn-dromes69(5),593{601.Potscher,B.(2006).Thedistributionofmodelaveragingestimatorsandanimpossibilityresultregardingitsestimation.InH.Ho,C.Ing,andT.Lai(Eds.),IMSLectureNotes:Timeseriesandrelat

26 edtopics,Volume52,pp.113{129.Raftery,A.,
edtopics,Volume52,pp.113{129.Raftery,A.,J.Hoeting,C.Volinsky,I.Painter,andK.Y.Yeung(2017).BMA:BayesianModelAveraging.Rpackageversion3.18.7.25 Rubin,D.andN.Schenker(1986).Multipleimputationforintervalestimationfromsimplerandomsampleswithignorablenonresponse.JournaloftheAmericanStatisticalAssociation81,366{374.Rubin,D.B.(1996).Multipleimputationafter18+years.JournaloftheAmericanStatisticalAssocia-tion91(434),473{489.Schomaker,M.(2012).Shrinkageaveragingestimation.StatisticalPapers53,1015{1034.Schomaker,M.andC.Heumann(2014).Modelselectionandmodelaveragingaftermultipleimputation.ComputationalStatisticsandDataAnalysis71,758{770.Schomaker,M.andC.Heumann(2018).Bootstrapinferencewhenusingmultipleimputation.StatisticsinMedicine37(14),2252{2266.Schomaker,M.andC.Heumann(2019).Whenandwhennottouseoptimalmodelaveraging.StatisticalPapersinpress.Tibshirani,R.(1996).Regressionshrinkageandselectionviathelasso.JournaloftheRoyalStatisticalSocietyB58,267{288.vanBuuren,S.andK.Groothuis-Oudshoorn(2011).mice:Multivariateimputationbychainedequationsinr.JournalofStatisticalSoftware45(3),1{67.VanderLaan,M.andM.Petersen(2007).Statisticallearningoforigin-speci cstatisticallyoptimalindividu-alizedtreatmentrules.InternationalJournalofBiostatistics3,Article3.VanderLaan,M.,E.Polley,andA.Hubbard(2008).Superlearner.StatisticalApplicationsinGeneticsandMolecularBiology6,Article25.VanderLaan,M.andS.Rose(2011).TargetedLearning.Springer.Visser,M.,M.Stead,G.Walzl,R.Warren,M.Schomaker,H.Grewal,E.Swart,andG.Maartens(2012).Baselinepredictorsofsputumconversioninpulmonarytuberculosis:importanceofcavities,smoking,timetodetectionandW-Beijinggenotype.PLoSONE7,e29588.Volinsky,C.T.,D.Madigan,A.E.Raftery,andR.A.Kronmal(1997).Bayesianmodelaveraginginproportionalhazardmodels.assessingtheriskofastroke.JournaloftheRoyalStatisticalSocietySeriesC-AppliedStatistics46(4),433{448.Wan,A.T.K.,X.Zhang,andG.H.Zou(2010).Leastsquaresmodelaveragingbymallowscriterion.JournalofEconometrics156,277{283.White,I.,P.Royston,andA.Wood(2011).Multipleimputationusingchainedequations.StatisticsinMedicine30,377{399.Wood,A.,I.White,andP.Royston(2008).Howshouldvariableselectionbeperformedwithmultiplyimputeddata?StatisticsinMedicine27,3227{3246.2

Related Contents


Next Show more