Computational methods for mixed models

Douglas Bates
Department of Statistics
University of Wisconsin-Madison

October 13, 2020

Abstract

The lme4 package provides R functions to fit and analyze several different types of mixed-effects models, including linear mixed models, generalized linear mixed models and nonlinear mixed models. In this vignette we describe the formulation of these models and the computational approach used to evaluate or approximate the log-likelihood of a model/data/parameter value combination.

1 Introduction

The lme4 package provides functions to fit and analyze linear mixed models, generalized linear mixed models and nonlinear mixed models. These models are called mixed-effects models or, more simply, mixed models because they incorporate both fixed-effects parameters, which apply to an entire population or to certain well-defined and repeatable subsets of a population, and random effects, which apply to the particular experimental units or observational units in the study. Such models are also called multilevel models because the random effects represent levels of variation in addition to the per-observation noise term that is incorporated in common statistical models such as linear regression models, generalized linear models and nonlinear regression models.

We begin by describing common properties of these mixed models and the general computational approach used in the lme4 package. The estimates of the parameters in a mixed model are determined as the values that optimize an objective function: either the likelihood of the parameters given the observed data, for maximum likelihood (ML) estimates, or a related objective function called the REML criterion. Because this objective function must be evaluated at many different values of the model parameters during the optimization process, we focus on the evaluation of the objective function and a critical computation in this evaluation, determining the solution to a penalized, weighted least squares (PWLS) problem.

The dimension of the solution of the PWLS problem can be very large, perhaps in the millions. Furthermore, such problems must be solved repeatedly during the optimization process to determine parameter estimates. The whole approach would be infeasible were it not for the fact that the matrices determining the PWLS problem are sparse and we can use sparse matrix storage formats and sparse matrix computations (Davis, 2006). In particular, the whole computational approach hinges on the extraordinarily efficient methods for determining the Cholesky decomposition of sparse, symmetric, positive-definite matrices embodied in the CHOLMOD library of C functions (Davis, 2005).

In the next section we describe the general form of the mixed models that can be represented in the lme4 package and the computational approach embodied in the package. In the following section we describe a particular form of mixed model, called a linear mixed model, and the computational details for those models. In the fourth section we describe computational methods for generalized linear mixed models, nonlinear mixed models and generalized nonlinear mixed models.

2 Formulation of mixed models

A mixed-effects model incorporates two vector-valued random variables: the n-dimensional response vector, Y, and the q-dimensional random-effects vector, B. We observe the value, y, of Y. We do not observe the value of B.

The random variable Y may be continuous or discrete. That is, the observed data, y, may be on a continuous scale or they may be on a discrete scale, such as binary responses or responses representing a count. In our formulation, the random variable B is always continuous.

We specify a mixed model by describing the unconditional distribution of B and the conditional distribution (Y|B = b).
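For orientation, the three classes of models named in the introduction are fit with lmer, glmer and nlmer respectively. The following is a minimal sketch of the first two, assuming lme4's built-in sleepstudy and cbpp example data sets are available:

    library(lme4)

    ## Linear mixed model: fixed effect of Days, random intercept and slope by Subject.
    fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

    ## Generalized linear mixed model: binomial response, random intercept by herd.
    gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                 data = cbpp, family = binomial)

    summary(fm1)
    summary(gm1)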
2.1 The unconditional distribution of B

In our formulation, the unconditional distribution of B is always a q-dimensional multivariate Gaussian (or "normal") distribution with mean 0 and with a parameterized covariance matrix,
\[
  B \sim N\!\left(0,\; \sigma^2\Lambda(\theta)\Lambda(\theta)'\right). \tag{1}
\]

The scalar, σ, in (1) is called the common scale parameter. As we will see later, not all types of mixed models incorporate this parameter. We will include σ² in the general form of the unconditional distribution of B with the understanding that, in some models, σ ≡ 1.

The q × q matrix Λ(θ), which is a left factor of the covariance matrix (when σ = 1) or the relative covariance matrix (when σ ≠ 1), depends on an m-dimensional parameter θ. Typically m ≪ q; in the examples we show below it is always the case that m < 5, even when q is in the thousands. The fact that m is very small is important because, as we shall see, determining the parameter estimates in a mixed model can be expressed as an optimization problem with respect to θ only.

The parameter θ may be, and typically is, subject to constraints. For ease of computation, we require that the constraints be expressed as "box" constraints of the form θ_{iL} ≤ θ_i ≤ θ_{iU}, i = 1, ..., m, for constants θ_{iL} and θ_{iU}, i = 1, ..., m. We shall write the set of such constraints as θ_L ≤ θ ≤ θ_U. The matrix Λ(θ) is required to be non-singular (i.e. invertible) when θ is not on the boundary.

2.2 The conditional distribution, (Y|B = b)

The conditional distribution, (Y|B = b), must satisfy:

1. The conditional mean, μ_{Y|B}(b) = E[Y|B = b], depends on b only through the value of the linear predictor, Zb + Xβ, where β is the p-dimensional fixed-effects parameter vector and the model matrices, Z and X, are fixed matrices of the appropriate dimension. That is, the two model matrices must have the same number of rows and must have q and p columns, respectively. The number of rows in Z and X is a multiple of n, the dimension of y.

2. The scalar distributions, (Y_i|B = b), i = 1, ..., n, all have the same form and are completely determined by the conditional mean, μ_{Y|B}(b), and, at most, one additional parameter, σ, which is the common scale parameter.

3. The scalar distributions, (Y_i|B = b), i = 1, ..., n, are independent. That is, the components of Y are conditionally independent given B.

An important special case of the conditional distribution is the multivariate Gaussian distribution of the form
\[
  (Y|B = b) \sim N\!\left(Zb + X\beta,\; \sigma^2 I_n\right) \tag{2}
\]
where I_n denotes the identity matrix of size n. In this case the conditional mean, μ_{Y|B}(b), is exactly the linear predictor, Zb + Xβ, a situation we will later describe as being an "identity link" between the conditional mean and the linear predictor. Models with conditional distribution (2) are called linear mixed models.

2.3 A change of variable to "spherical" random effects

Because the conditional distribution (Y|B = b) depends on b only through the linear predictor, it is easy to express the model in terms of a linear transformation of B. We define the linear transformation from a q-dimensional "spherical" Gaussian random variable, U, to B as
\[
  B = \Lambda(\theta)\,U, \qquad U \sim N(0,\; \sigma^2 I_q). \tag{3}
\]
(The term "spherical" refers to the fact that contours of constant probability density for U are spheres centered at the mean, in this case 0.)

When θ is not on the boundary this is an invertible transformation. When θ is on the boundary the transformation can fail to be invertible. However, we will only need to be able to express B in terms of U and that transformation is well-defined, even when θ is on the boundary.

The linear predictor, as a function of u, is
\[
  \gamma(u) = Z\Lambda(\theta)u + X\beta. \tag{4}
\]
When we wish to emphasize the role of the model parameters, θ and β, in the formulation of γ, we will write the linear predictor as γ(u, θ, β).
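A concrete case of (3): for the simplest covariance structure, a single grouping factor with q scalar random effects, Λ(θ) is just θ times the identity, and B can be generated from the spherical variable U as in the small simulation sketch below (the specific values are arbitrary and only illustrative):

    library(Matrix)

    q <- 5; sigma <- 2; theta <- 0.7
    Lambda <- Diagonal(q, theta)       # Lambda(theta) = theta * I_q in this simple case

    set.seed(3)
    u <- rnorm(q, sd = sigma)          # U ~ N(0, sigma^2 I_q), the "spherical" variable
    b <- as.vector(Lambda %*% u)       # B = Lambda(theta) U, so Var(B) = sigma^2 theta^2 I_q
    b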
2.4 The conditional density (U|Y = y)

Because we observe y and do not observe b or u, the conditional distribution of interest, for the purposes of statistical inference, is (U|Y = y) (or, equivalently, (B|Y = y)). This conditional distribution is always a continuous distribution with conditional probability density f_{U|Y}(u|y).

We can evaluate f_{U|Y}(u|y), up to a constant, as the product of the unconditional density, f_U(u), and the conditional density (or the probability mass function, whichever is appropriate), f_{Y|U}(y|u). We write this unnormalized conditional density as
\[
  h(u|y, \theta, \beta, \sigma) = f_{Y|U}(y|u, \theta, \beta, \sigma)\, f_{U}(u|\sigma). \tag{5}
\]
We say that h is the "unnormalized" conditional density because all we know is that the conditional density is proportional to h(u|y, θ, β, σ). To obtain the conditional density we must normalize h by dividing by the value of the integral
\[
  L(\theta, \beta, \sigma | y) = \int_{\mathbb{R}^q} h(u|y, \theta, \beta, \sigma)\, du. \tag{6}
\]
We write the value of the integral (6) as L(θ, β, σ | y) because it is exactly the likelihood of the parameters θ, β and σ, given the observed data y. The maximum likelihood (ML) estimates of these parameters are the values that maximize L.

2.5 Determining the ML estimates

The general problem of maximizing L(θ, β, σ | y) with respect to θ, β and σ can be formidable because each evaluation of this function involves a potentially high-dimensional integral and because the dimension of β can be large. However, this general optimization problem can be split into manageable subproblems. Given a value of θ we can determine the conditional mode, ũ(θ), of u and the conditional estimate, β̃(θ), simultaneously using penalized, iteratively reweighted least squares (PIRLS). The conditional mode and the conditional estimate are defined as
\[
  \begin{bmatrix} \tilde{u}(\theta) \\ \tilde{\beta}(\theta) \end{bmatrix}
  = \arg\max_{u, \beta}\; h(u|y, \theta, \beta, \sigma). \tag{7}
\]
(It may look as if we have missed the dependence on σ on the left-hand side but it turns out that the scale parameter does not affect the location of the optimal values of quantities in the linear predictor.)

As is common in such optimization problems, we re-express the conditional density on the deviance scale, which is negative twice the logarithm of the density, where the optimization becomes
\[
  \begin{bmatrix} \tilde{u}(\theta) \\ \tilde{\beta}(\theta) \end{bmatrix}
  = \arg\min_{u, \beta}\; -2\log\bigl(h(u|y, \theta, \beta, \sigma)\bigr). \tag{8}
\]
It is this optimization problem that can be solved quite efficiently using PIRLS. In fact, for linear mixed models, which are described in the next section, ũ(θ) and β̃(θ) can be directly evaluated. The second-order Taylor series expansion of -2 log h at ũ(θ) and β̃(θ) provides the Laplace approximation to the profiled deviance. Optimizing this function with respect to θ provides the ML estimates of θ, from which the ML estimates of β and σ (if used) are derived.

3 Methods for linear mixed models

As indicated in the introduction, a critical step in our methods for determining the maximum likelihood estimates of the parameters in a mixed model is solving a penalized, weighted least squares (PWLS) problem. We will motivate the general form of the PWLS problem by first considering computational methods for linear mixed models that result in a penalized least squares (PLS) problem.

Recall from §2.2 that, in a linear mixed model, both the conditional distribution, (Y|U = u), and the unconditional distribution, U, are spherical Gaussian distributions and that the conditional mean, μ_{Y|U}(u), is the linear predictor, γ(u). Because all the distributions determining the model are continuous distributions, we consider their densities. On the deviance scale these are
\[
\begin{aligned}
  -2\log\bigl(f_{U}(u)\bigr) &= q\log(2\pi\sigma^2) + \frac{\|u\|^2}{\sigma^2} \\
  -2\log\bigl(f_{Y|U}(y|u)\bigr) &= n\log(2\pi\sigma^2) + \frac{\|y - Z\Lambda(\theta)u - X\beta\|^2}{\sigma^2} \\
  -2\log\bigl(h(u|y, \theta, \beta, \sigma)\bigr) &= (n+q)\log(2\pi\sigma^2) + \frac{\|y - \gamma(u, \theta, \beta)\|^2 + \|u\|^2}{\sigma^2} \\
  &= (n+q)\log(2\pi\sigma^2) + \frac{d(u|y, \theta, \beta)}{\sigma^2}
\end{aligned} \tag{9}
\]
In (9) the discrepancy function,
\[
  d(u|y, \theta, \beta) = \|y - \gamma(u, \theta, \beta)\|^2 + \|u\|^2 \tag{10}
\]
has the form of a penalized residual sum of squares in that the first term, ‖y − γ(u, θ, β)‖², is the residual sum of squares for y, u, θ and β and the second term, ‖u‖², is a penalty on the size of u. Notice that the discrepancy does not depend on the common scale parameter, σ.

3.1 The canonical form of the discrepancy

Using a so-called "pseudo data" representation, we can write the discrepancy as a residual sum of squares for a regression model that is linear in both u and β
\[
  d(u|y, \theta, \beta) =
  \left\|
    \begin{bmatrix} y \\ 0 \end{bmatrix} -
    \begin{bmatrix} Z\Lambda(\theta) & X \\ I_q & 0 \end{bmatrix}
    \begin{bmatrix} u \\ \beta \end{bmatrix}
  \right\|^2. \tag{11}
\]
The term "pseudo data" reflects the fact that we have added q "pseudo observations" to the observed response, y, and to the linear predictor, γ(u, θ, β) = ZΛ(θ)u + Xβ, in such a way that their contribution to the overall residual sum of squares is exactly the penalty term in the discrepancy.

In the form (11) we can see that the discrepancy is a quadratic form in both u and β. Furthermore, because we require that X has full column rank, the discrepancy is a positive-definite quadratic form in u and β that is minimized at ũ(θ) and β̃(θ) satisfying
\[
  \begin{bmatrix}
    \Lambda(\theta)'Z'Z\Lambda(\theta) + I_q & \Lambda(\theta)'Z'X \\
    X'Z\Lambda(\theta) & X'X
  \end{bmatrix}
  \begin{bmatrix} \tilde{u}(\theta) \\ \tilde{\beta}(\theta) \end{bmatrix}
  =
  \begin{bmatrix} \Lambda(\theta)'Z'y \\ X'y \end{bmatrix} \tag{12}
\]
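A small numerical sketch of system (12) may be helpful before turning to the factorization used in practice. It assumes a toy simulated data set with one grouping factor and scalar random effects (so that Λ(θ) = θ I_q) and keeps X sparse, so that a single sparse Cholesky factorization from R's Matrix package, the approach described next, handles the whole blocked system; all object names are invented for illustration.

    library(Matrix)

    set.seed(1)
    n <- 60; q <- 6                      # observations and random-effect levels
    g <- gl(q, n / q)                    # grouping factor
    X <- Matrix(cbind(1, rnorm(n)), sparse = TRUE)          # fixed-effects model matrix (p = 2)
    Z <- sparseMatrix(i = seq_len(n), j = as.integer(g),    # sparse indicator matrix
                      x = 1, dims = c(n, q))
    y <- as.vector(1 + 2 * X[, 2] + rnorm(q)[g] + rnorm(n)) # simulated response

    pls_solve <- function(theta) {
      Lambda <- Diagonal(q, theta)       # Lambda(theta) for this simple model
      ZL <- Z %*% Lambda
      A  <- rbind(cbind(crossprod(ZL) + Diagonal(q), crossprod(ZL, X)),
                  cbind(crossprod(X, ZL),            crossprod(X)))
      cb <- c(as.vector(crossprod(ZL, y)), as.vector(crossprod(X, y)))
      ## sparse Cholesky solve of the positive-definite blocked system (12)
      as.vector(solve(Cholesky(forceSymmetric(A)), cb, system = "A"))
    }
    pls_solve(theta = 0.8)               # first q entries: u-tilde; last 2: beta-tilde

In lme4 itself the same system is handled through the blocked factorization (15) described next, re-using the symbolic analysis because the pattern of nonzeros does not change with θ.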
An effective way of determining the solution to a sparse, symmetric, positive-definite system of equations such as (12) is the sparse Cholesky decomposition (Davis, 2006). If A is a sparse, symmetric positive-definite matrix then the sparse Cholesky factor with fill-reducing permutation P is the lower-triangular matrix L such that
\[
  LL' = PAP'. \tag{13}
\]
(Technically, the factor L is only determined up to changes in the sign of the diagonal elements. By convention we require the diagonal elements to be positive.)

The fill-reducing permutation represented by the permutation matrix P, which is determined from the pattern of nonzeros in A but does not depend on particular values of those nonzeros, can have a profound impact on the number of nonzeros in L and hence on the speed with which L can be calculated from A.

In most applications of linear mixed models the matrix ZΛ(θ) is sparse while X is dense or close to it so the permutation matrix P can be restricted to the form
\[
  P = \begin{bmatrix} P_Z & 0 \\ 0 & P_X \end{bmatrix} \tag{14}
\]
without loss of efficiency. In fact, in most cases we can set P_X = I_p without loss of efficiency.

Let us assume that the permutation matrix is required to be of the form (14) so that we can write the Cholesky factorization for the positive-definite system (12) as
\[
  \begin{bmatrix} L_Z & 0 \\ L_{XZ} & L_X \end{bmatrix}
  \begin{bmatrix} L_Z & 0 \\ L_{XZ} & L_X \end{bmatrix}'
  =
  \begin{bmatrix} P_Z & 0 \\ 0 & P_X \end{bmatrix}
  \begin{bmatrix}
    \Lambda(\theta)'Z'Z\Lambda(\theta) + I_q & \Lambda(\theta)'Z'X \\
    X'Z\Lambda(\theta) & X'X
  \end{bmatrix}
  \begin{bmatrix} P_Z & 0 \\ 0 & P_X \end{bmatrix}'. \tag{15}
\]
The discrepancy can now be written in the canonical form
\[
  d(u|y, \theta, \beta) = \tilde{d}(y, \theta) +
  \left\|
    \begin{bmatrix} L_Z' & L_{XZ}' \\ 0 & L_X' \end{bmatrix}
    \begin{bmatrix} P_Z(u - \tilde{u}) \\ P_X(\beta - \tilde{\beta}) \end{bmatrix}
  \right\|^2 \tag{16}
\]
where
\[
  \tilde{d}(y, \theta) = d(\tilde{u}(\theta)|y, \theta, \tilde{\beta}(\theta)) \tag{17}
\]
is the minimum discrepancy, given θ.

3.2 The profiled likelihood for linear mixed models

Substituting (16) into (9) provides the unnormalized conditional density h(u|y, θ, β, σ) on the deviance scale as
\[
  -2\log\bigl(h(u|y, \theta, \beta, \sigma)\bigr) = (n+q)\log(2\pi\sigma^2) +
  \frac{\tilde{d}(y, \theta) +
  \left\|
    \begin{bmatrix} L_Z' & L_{XZ}' \\ 0 & L_X' \end{bmatrix}
    \begin{bmatrix} P_Z(u - \tilde{u}) \\ P_X(\beta - \tilde{\beta}) \end{bmatrix}
  \right\|^2}{\sigma^2}. \tag{18}
\]
As shown in Appendix B, the integral of a quadratic form on the deviance scale, such as (18), is easily evaluated, providing the log-likelihood, ℓ(θ, β, σ|y), as
\[
  -2\ell(\theta, \beta, \sigma|y) = -2\log\bigl(L(\theta, \beta, \sigma|y)\bigr)
  = n\log(2\pi\sigma^2) + \log\bigl(|L_Z|^2\bigr) +
  \frac{\tilde{d}(y, \theta) + \left\| L_X' P_X(\beta - \tilde{\beta}) \right\|^2}{\sigma^2}, \tag{19}
\]
from which we can see that the conditional estimate of β, given θ, is β̃(θ) and the conditional estimate of σ², given θ, is
\[
  \tilde{\sigma}^2(\theta) = \frac{\tilde{d}(y, \theta)}{n}. \tag{20}
\]
Substituting these conditional estimates into (19) produces the profiled likelihood, L̃(θ|y), as
\[
  -2\tilde{\ell}(\theta|y) = \log\bigl(|L_Z(\theta)|^2\bigr) + n\left(1 + \log\left(\frac{2\pi\tilde{d}(y, \theta)}{n}\right)\right). \tag{21}
\]
The maximum likelihood estimate of θ can then be expressed as
\[
  \hat{\theta}_L = \arg\min_{\theta}\; -2\tilde{\ell}(\theta|y), \tag{22}
\]
from which the ML estimates of σ² and β are evaluated as
\[
  \hat{\sigma}^2_L = \frac{\tilde{d}(y, \hat{\theta}_L)}{n} \tag{23}
\]
\[
  \hat{\beta}_L = \tilde{\beta}(\hat{\theta}_L). \tag{24}
\]
The important thing to note about optimizing the profiled likelihood, (21), is that it is an m-dimensional optimization problem and typically m is very small.
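In lme4 this θ-only optimization can be examined directly, because lmer can return the profiled deviance (21) as a function of θ before optimizing it. A minimal sketch, assuming the package's built-in sleepstudy data; the particular θ values evaluated below are arbitrary:

    library(lme4)

    ## Profiled deviance -2*l~(theta|y) as a function of theta only
    ## (a single scalar random effect here, so theta is 1-dimensional).
    devfun <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy,
                   REML = FALSE, devFunOnly = TRUE)

    devfun(1)      # evaluate the profiled deviance at theta = 1
    devfun(0.5)    # ... and at theta = 0.5

    ## The ML estimate of theta minimizes this one-dimensional function.
    opt <- optimize(devfun, interval = c(0, 5))
    opt$minimum    # close to getME(lmer(Reaction ~ Days + (1 | Subject),
                   #                     sleepstudy, REML = FALSE), "theta")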
3.3 The REML criterion

In practice the so-called REML estimates of variance components are often preferred to the maximum likelihood estimates. ("REML" can be considered to be an acronym for "restricted" or "residual" maximum likelihood, although neither term is completely accurate because these estimates do not maximize a likelihood.)

We can motivate the use of the REML criterion by considering a linear regression model,
\[
  Y \sim N(X\beta, \sigma^2 I_n), \tag{25}
\]
in which we typically estimate σ² by
\[
  \hat{\sigma}^2_R = \frac{\|y - X\hat{\beta}\|^2}{n - p} \tag{26}
\]
even though the maximum likelihood estimate of σ² is
\[
  \hat{\sigma}^2_L = \frac{\|y - X\hat{\beta}\|^2}{n}. \tag{27}
\]
The argument for preferring σ̂²_R to σ̂²_L as an estimate of σ² is that the numerator in both estimates is the sum of squared residuals at β̂ and, although the residual vector y − Xβ is an n-dimensional vector, the residual at β̂ satisfies p linearly independent constraints, X'(y − Xβ̂) = 0. That is, the residual at β̂ is the projection of the observed response vector, y, into an (n − p)-dimensional linear subspace of the n-dimensional response space. The estimate σ̂²_R takes into account the fact that σ² is estimated from residuals that have only n − p degrees of freedom.

The REML criterion for determining parameter estimates θ̂_R and σ̂²_R in a linear mixed model has the property that these estimates would specialize to σ̂²_R from (26) for a linear regression model. Although not usually derived in this way, the REML criterion can be expressed as
\[
  c_R(\theta, \sigma|y) = -2\log \int_{\mathbb{R}^p} L(\theta, \beta, \sigma|y)\, d\beta \tag{28}
\]
on the deviance scale. The REML estimates θ̂_R and σ̂²_R minimize c_R(θ, σ|y).

The profiled REML criterion, a function of θ only, is
\[
  \tilde{c}_R(\theta|y) = \log\bigl(|L_Z(\theta)|^2\, |L_X(\theta)|^2\bigr) + (n-p)\left(1 + \log\left(\frac{2\pi\tilde{d}(y, \theta)}{n-p}\right)\right) \tag{29}
\]
and the REML estimate of θ is
\[
  \hat{\theta}_R = \arg\min_{\theta}\; \tilde{c}_R(\theta|y). \tag{30}
\]
The REML estimate of σ² is σ̂²_R = d̃(y, θ̂_R)/(n − p). It is not entirely clear how one would define a "REML estimate" of β because the REML criterion, c_R(θ, σ|y), defined in (28), does not depend on β. However, it is customary (and not unreasonable) to use β̂_R = β̃(θ̂_R) as the REML estimate of β.

Note that the profiled REML criterion can be evaluated from a sparse Cholesky decomposition like that in (15) but without the requirement that the permutation can be applied to the columns of ZΛ(θ) separately from the columns of X. That is, we can use a general fill-reducing permutation rather than the specific form (14) with separate permutations represented by P_Z and P_X. This can be useful in cases where both Z and X are large and sparse.
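In lmer the choice between minimizing the profiled REML criterion (29) and the profiled deviance (21) is controlled by the REML argument. A minimal sketch, again assuming the built-in sleepstudy data:

    library(lme4)

    fm_reml <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy)               # REML (default)
    fm_ml   <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy, REML = FALSE) # maximum likelihood

    ## The two criteria give slightly different variance-component estimates;
    ## the fixed-effects estimates reported are the conditional estimates beta~(theta-hat).
    VarCorr(fm_reml)
    VarCorr(fm_ml)
    fixef(fm_reml)
    fixef(fm_ml)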
3.4 Summary for linear mixed models

A linear mixed model is characterized by the conditional distribution
\[
  (Y|U = u) \sim N\bigl(\gamma(u, \theta, \beta),\; \sigma^2 I_n\bigr)
  \quad\text{where}\quad \gamma(u, \theta, \beta) = Z\Lambda(\theta)u + X\beta \tag{31}
\]
and the unconditional distribution U ~ N(0, σ²I_q). The discrepancy function,
\[
  d(u|y, \theta, \beta) = \|y - \gamma(u, \theta, \beta)\|^2 + \|u\|^2,
\]
is minimized at the conditional mode, ũ(θ), and the conditional estimate, β̃(θ), which are the solutions to the sparse, positive-definite linear system
\[
  \begin{bmatrix}
    \Lambda(\theta)'Z'Z\Lambda(\theta) + I_q & \Lambda(\theta)'Z'X \\
    X'Z\Lambda(\theta) & X'X
  \end{bmatrix}
  \begin{bmatrix} \tilde{u}(\theta) \\ \tilde{\beta}(\theta) \end{bmatrix}
  =
  \begin{bmatrix} \Lambda(\theta)'Z'y \\ X'y \end{bmatrix}.
\]
In the process of solving this system we create the sparse left Cholesky factor, L_Z(θ), which is a lower triangular sparse matrix satisfying
\[
  L_Z(\theta) L_Z(\theta)' = P_Z\bigl(\Lambda(\theta)'Z'Z\Lambda(\theta) + I_q\bigr) P_Z'
\]
where P_Z is a permutation matrix representing a fill-reducing permutation formed from the pattern of nonzeros in ZΛ(θ) for any θ not on the boundary of the parameter region. (The values of the nonzeros depend on θ but the pattern doesn't.)

The profiled log-likelihood, ℓ̃(θ|y), is
\[
  -2\tilde{\ell}(\theta|y) = \log\bigl(|L_Z(\theta)|^2\bigr) + n\left(1 + \log\left(\frac{2\pi\tilde{d}(y, \theta)}{n}\right)\right)
\]
where d̃(y, θ) = d(ũ(θ)|y, θ, β̃(θ)).

4 Generalizing the discrepancy function

Because one of the factors influencing the choice of implementation for linear mixed models is the extent to which the methods can also be applied to other mixed models, we describe several other classes of mixed models before discussing the implementation details for linear mixed models. At the core of our methods for determining the maximum likelihood estimates (MLEs) of the parameters in the mixed model are methods for minimizing the discrepancy function with respect to the coefficients u and β in the linear predictor γ(u, θ, β).

In this section we describe the general form of the discrepancy function that we will use and a penalized iteratively reweighted least squares (PIRLS) algorithm for determining the conditional modes ũ(θ) and β̃(θ). We then describe several types of mixed models and the form of the discrepancy function for each.

4.1 A weighted residual sum of squares

As shown in §3.1, the discrepancy function for a linear mixed model has the form of a penalized residual sum of squares from a linear model (11). In this section we generalize that definition to
\[
  d(u|y, \theta, \beta) = \left\| W^{1/2}\bigl(y - \mu_{Y|U}(u, \theta, \beta)\bigr)\right\|^2 + \|u\|^2, \tag{32}
\]
where W is an n × n diagonal matrix, called the weights matrix, with positive diagonal elements and W^{1/2} is the diagonal matrix with the square roots of the weights on the diagonal. The i-th weight is inversely proportional to the conditional variance of (Y_i|U = u) and may depend on the conditional mean, μ_{Y|U}.

We allow the conditional mean to be a nonlinear function of the linear predictor, but with certain restrictions. We require that the mapping from γ to μ_{Y|U = u} be expressed as
\[
  \gamma \;\to\; \eta \;\to\; \mu \tag{33}
\]
where γ = ZΛ(θ)u + Xβ is an ns-dimensional vector (s > 0) while η and μ are n-dimensional vectors. The map η → μ has the property that μ_i depends only on η_i, i = 1, ..., n. The map γ → η has a similar property in that, if we write γ as an n × s matrix Γ such that
\[
  \gamma = \operatorname{vec}\Gamma \tag{34}
\]
(i.e. concatenating the columns of Γ produces γ) then η_i depends only on the i-th row of Γ, i = 1, ..., n. Thus the Jacobian matrix dμ/dη' is an n × n diagonal matrix and the Jacobian matrix dη/dγ' is the horizontal concatenation of s diagonal n × n matrices.

For historical reasons, the function that maps η_i to μ_i is called the inverse link function and is written μ = g⁻¹(η). The link function, naturally, is η = g(μ). When applied component-wise to vectors μ or η we write these as η = g(μ) and μ = g⁻¹(η).

Recall that the conditional distribution, (Y_i|U = u), is required to be independent of (Y_j|U = u) for i, j = 1, ..., n, i ≠ j, and that all the component conditional distributions must be of the same form and differ only according to the value of the conditional mean.

Depending on the family of the conditional distributions, the allowable values of the μ_i may be in a restricted range. For example, if the conditional distributions are Bernoulli then 0 ≤ μ_i ≤ 1, i = 1, ..., n. If the conditional distributions are Poisson then 0 < μ_i, i = 1, ..., n. A characteristic of the link function, g, is that it must map the restricted range to an unrestricted range. That is, a link function for the Bernoulli distribution must map [0, 1] to [−∞, ∞] and must be invertible within the range.

The mapping from γ to η is defined by a function m: ℝ^s → ℝ, called the nonlinear model function, such that η_i = m(γ_i), i = 1, ..., n, where γ_i is the i-th row of Γ. The vector-valued function is η = m(γ).

Determining the conditional modes, ũ(y|θ), and β̃(y|θ), that jointly minimize the discrepancy,
\[
  \begin{bmatrix} \tilde{u}(y|\theta) \\ \tilde{\beta}(y|\theta) \end{bmatrix}
  = \arg\min_{u, \beta}\; \bigl\{ (y - \mu)'W(y - \mu) + \|u\|^2 \bigr\} \tag{35}
\]
becomes a weighted, nonlinear least squares problem except that the weights, W, can depend on μ and, hence, on u and β.

In describing an algorithm for linear mixed models we called β̃(θ) the conditional estimate. That name reflects the fact that this is the maximum likelihood estimate of β for that particular value of θ. Once we have determined the MLE, θ̂_L, of θ, we have a "plug-in" estimator, β̂_L = β̃(θ̂_L), for β. This property does not carry over exactly to other forms of mixed models. The values ũ(θ) and β̃(θ) are conditional modes in the sense that they are the coefficients in γ that jointly maximize the unscaled conditional density h(u|y, θ, β, σ). Here we are using the adjective "conditional" more in the sense of conditioning on Y = y than in the sense of conditioning on θ, although these values are determined for a fixed value of θ.

4.2 The PIRLS algorithm for ũ and β̃

The penalized, iteratively reweighted, least squares (PIRLS) algorithm to determine ũ(θ) and β̃(θ) is a form of the Fisher scoring algorithm. We fix the weights matrix, W, and use penalized, weighted, nonlinear least squares to minimize the penalized, weighted residual sum of squares conditional on these weights. Then we update the weights to those determined by the current value of μ and iterate.

To describe this algorithm in more detail we will use parenthesized superscripts to denote the iteration number. Thus u⁽⁰⁾ and β⁽⁰⁾ are the initial values of these parameters, while u⁽ⁱ⁾ and β⁽ⁱ⁾ are the values at the i-th iteration. Similarly γ⁽ⁱ⁾ = ZΛ(θ)u⁽ⁱ⁾ + Xβ⁽ⁱ⁾, η⁽ⁱ⁾ = m(γ⁽ⁱ⁾) and μ⁽ⁱ⁾ = g⁻¹(η⁽ⁱ⁾).

We use a penalized version of the Gauss-Newton algorithm (Bates and Watts, 1988, ch. 2) for which we define the weighted Jacobian matrices
\[
  U^{(i)} = W^{1/2}\left.\frac{d\mu}{du'}\right|_{u=u^{(i)},\, \beta=\beta^{(i)}}
          = W^{1/2}\left.\frac{d\mu}{d\eta'}\right|_{\eta^{(i)}}
            \left.\frac{d\eta}{d\gamma'}\right|_{\gamma^{(i)}} Z\Lambda(\theta) \tag{36}
\]
\[
  V^{(i)} = W^{1/2}\left.\frac{d\mu}{d\beta'}\right|_{u=u^{(i)},\, \beta=\beta^{(i)}}
          = W^{1/2}\left.\frac{d\mu}{d\eta'}\right|_{\eta^{(i)}}
            \left.\frac{d\eta}{d\gamma'}\right|_{\gamma^{(i)}} X \tag{37}
\]
of dimension n × q and n × p, respectively. The increments at the i-th iteration, δ_u⁽ⁱ⁾ and δ_β⁽ⁱ⁾, are the solutions to
\[
  \begin{bmatrix}
    U^{(i)\prime}U^{(i)} + I_q & U^{(i)\prime}V^{(i)} \\
    V^{(i)\prime}U^{(i)} & V^{(i)\prime}V^{(i)}
  \end{bmatrix}
  \begin{bmatrix} \delta_u^{(i)} \\ \delta_\beta^{(i)} \end{bmatrix}
  =
  \begin{bmatrix}
    U^{(i)\prime}W^{1/2}\bigl(y - \mu^{(i)}\bigr) - u^{(i)} \\
    V^{(i)\prime}W^{1/2}\bigl(y - \mu^{(i)}\bigr)
  \end{bmatrix} \tag{38}
\]
providing the updated parameter values
\[
  \begin{bmatrix} u^{(i+1)} \\ \beta^{(i+1)} \end{bmatrix}
  =
  \begin{bmatrix} u^{(i)} \\ \beta^{(i)} \end{bmatrix}
  + \lambda \begin{bmatrix} \delta_u^{(i)} \\ \delta_\beta^{(i)} \end{bmatrix} \tag{39}
\]
where λ > 0 is a step factor chosen to ensure that
\[
  \bigl(y - \mu^{(i+1)}\bigr)'W\bigl(y - \mu^{(i+1)}\bigr) + \|u^{(i+1)}\|^2
  \;\le\;
  \bigl(y - \mu^{(i)}\bigr)'W\bigl(y - \mu^{(i)}\bigr) + \|u^{(i)}\|^2. \tag{40}
\]
In the process of solving for the increments we form the sparse, lower triangular, Cholesky factor, L⁽ⁱ⁾, satisfying
\[
  L^{(i)}L^{(i)\prime} = P_Z\bigl(U^{(i)\prime}U^{(i)} + I_q\bigr)P_Z'. \tag{41}
\]
After each successful iteration, determining new values of the coefficients, u⁽ⁱ⁺¹⁾ and β⁽ⁱ⁺¹⁾, that reduce the penalized, weighted residual sum of squares, we update the weights matrix to W(μ⁽ⁱ⁺¹⁾) and the weighted Jacobians, U⁽ⁱ⁺¹⁾ and V⁽ⁱ⁺¹⁾, then iterate.

Convergence is determined according to the orthogonality convergence criterion (Bates and Watts, 1988, ch. 2), suitably adjusted for the weights matrix and the penalty.
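A toy illustration may help make (36) through (41) concrete. The sketch below assumes the simplest generalized case: a Bernoulli response with logit link, s = 1 and the identity nonlinear model (so η = γ and dμ/dη = μ(1 − μ)), scalar random effects with Λ(θ) = θI_q, dense algebra, a fixed step factor λ = 1 and simulated data. Every object name is invented for illustration, and lme4's internal PIRLS differs in many details.

    ## Toy PIRLS sketch for a Bernoulli/logit mixed model with scalar random effects.
    set.seed(2)
    n <- 200; q <- 10
    g <- gl(q, n / q)
    X <- cbind(1, rnorm(n)); Z <- model.matrix(~ g - 1)
    y <- rbinom(n, 1, plogis(drop(X %*% c(-0.5, 1)) + rnorm(q)[g]))

    pirls <- function(theta, u = rep(0, q), beta = rep(0, ncol(X)), maxit = 20) {
      ZL <- Z * theta                        # Z Lambda(theta) with Lambda = theta * I_q
      for (i in seq_len(maxit)) {
        eta <- drop(ZL %*% u + X %*% beta)   # identity nonlinear model: eta = gamma
        mu  <- plogis(eta)
        M   <- mu * (1 - mu)                 # dmu/deta; also the conditional variance here
        W   <- 1 / M                         # weights: inverse conditional variance
        sW  <- sqrt(W)
        U <- sW * M * ZL                     # weighted Jacobians (36) and (37)
        V <- sW * M * X
        A <- rbind(cbind(crossprod(U) + diag(q), crossprod(U, V)),
                   cbind(crossprod(V, U),        crossprod(V)))
        r <- sW * (y - mu)
        rhs   <- c(crossprod(U, r) - u, crossprod(V, r))
        delta <- solve(A, rhs)               # increments (38); a sparse Cholesky solve in lme4
        u    <- u    + delta[1:q]            # full Gauss-Newton step (lambda = 1)
        beta <- beta + delta[-(1:q)]
        if (sqrt(sum(delta^2)) < 1e-8) break # crude convergence check
      }
      list(u = u, beta = beta)
    }
    pirls(theta = 1)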
4.3 Weighted linear mixed models

One of the simplest generalizations of linear mixed models is a weighted linear mixed model where s = 1, the link function, g, and the nonlinear model function, m, are both the identity, the weights matrix, W, is constant and the conditional distribution family is Gaussian. That is, the conditional distribution can be written
\[
  (Y|U = u) \sim N\bigl(\gamma(u, \theta, \beta),\; \sigma^2 W^{-1}\bigr) \tag{42}
\]
with discrepancy function
\[
  d(u|y, \theta, \beta) = \left\| W^{1/2}\bigl(y - Z\Lambda(\theta)u - X\beta\bigr) \right\|^2 + \|u\|^2. \tag{43}
\]
The conditional mode, ũ(θ), and the conditional estimate, β̃(θ), are the solutions to
\[
  \begin{bmatrix}
    \Lambda(\theta)'Z'WZ\Lambda(\theta) + I_q & \Lambda(\theta)'Z'WX \\
    X'WZ\Lambda(\theta) & X'WX
  \end{bmatrix}
  \begin{bmatrix} \tilde{u}(\theta) \\ \tilde{\beta}(\theta) \end{bmatrix}
  =
  \begin{bmatrix} \Lambda(\theta)'Z'Wy \\ X'Wy \end{bmatrix}, \tag{44}
\]
which can be solved directly, and the Cholesky factor, L_Z(θ), satisfies
\[
  L_Z(\theta)L_Z(\theta)' = P_Z\bigl(\Lambda(\theta)'Z'WZ\Lambda(\theta) + I_q\bigr)P_Z'. \tag{45}
\]
The profiled log-likelihood, ℓ̃(θ|y), is
\[
  -2\tilde{\ell}(\theta|y) = \log\left(\frac{|L_Z(\theta)|^2}{|W|}\right) + n\left(1 + \log\left(\frac{2\pi\tilde{d}(y, \theta)}{n}\right)\right). \tag{46}
\]
If the matrix W is fixed then we can ignore the term |W| in (46) when determining the MLE, θ̂_L. However, in some models, we use a parameterized weight matrix, W(φ), and wish to determine the MLEs, θ̂_L and φ̂_L, simultaneously. In these cases we must include the term involving |W(φ)| when evaluating the profiled log-likelihood.

Note that we must define the parameterization of W(φ) such that σ² and φ are not a redundant parameterization of σ²W(φ). For example, we could require that the first diagonal element of W be unity.

4.4 Nonlinear mixed models

In an unweighted, nonlinear mixed model the conditional distribution is Gaussian, the link, g, is the identity and the weights matrix, W = I_n. That is,
\[
  (Y|U = u) \sim N\bigl(m(\gamma),\; \sigma^2 I_n\bigr) \tag{47}
\]
with discrepancy function
\[
  d(u|y, \theta, \beta) = \|y - \mu\|^2 + \|u\|^2. \tag{48}
\]
For a given value of θ we determine the conditional modes, ũ(θ) and β̃(θ), as the solution to the penalized nonlinear least squares problem
\[
  \begin{bmatrix} \tilde{u}(\theta) \\ \tilde{\beta}(\theta) \end{bmatrix}
  = \arg\min_{u, \beta}\; d(u|y, \theta, \beta) \tag{49}
\]
and we write the minimum discrepancy, given y and θ, as
\[
  \tilde{d}(y, \theta) = d(\tilde{u}(\theta)|y, \theta, \tilde{\beta}(\theta)). \tag{50}
\]
Let L̃_Z(θ) and L̃_X(θ) be the Cholesky factors at θ, β̃(θ) and ũ(θ). Then the Laplace approximation to the log-likelihood is
\[
  -2\ell_P(\theta, \beta, \sigma|y) \approx n\log(2\pi\sigma^2) + \log\bigl(|\tilde{L}_Z|^2\bigr) +
  \frac{\tilde{d}(y, \theta) + \left\|\tilde{L}_X'(\beta - \tilde{\beta})\right\|^2}{\sigma^2}, \tag{51}
\]
producing the approximate profiled log-likelihood, ℓ̃_P(θ|y),
\[
  -2\tilde{\ell}_P(\theta|y) \approx \log\bigl(|\tilde{L}_Z|^2\bigr) + n\left(1 + \log\bigl(2\pi\tilde{d}(y, \theta)/n\bigr)\right). \tag{52}
\]

4.4.1 Nonlinear mixed model summary

In a nonlinear mixed model we determine the parameter estimate, θ̂_P, from the Laplace approximation to the log-likelihood as
\[
  \hat{\theta}_P = \arg\max_{\theta}\; \tilde{\ell}_P(\theta|y)
  = \arg\min_{\theta}\; \left\{ \log\bigl(|\tilde{L}_Z|^2\bigr) + n\left(1 + \log\bigl(2\pi\tilde{d}(y, \theta)/n\bigr)\right) \right\}. \tag{53}
\]
Each evaluation of ℓ̃_P(θ|y) requires solving the penalized nonlinear least squares problem (49) simultaneously with respect to both sets of coefficients, u and β, in the linear predictor, γ.

For a weighted nonlinear mixed model with fixed weights, W, we replace the unweighted discrepancy function d(u|y, θ, β) with the weighted discrepancy function (32).
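In lme4 nonlinear mixed models of this kind are fit with nlmer, which takes a three-part formula (response ~ nonlinear model ~ random-effects terms) together with starting values for the nonlinear parameters. A minimal sketch, assuming the Orange growth data and the self-starting logistic model SSlogis from the stats package; the starting values below are rough guesses for illustration only:

    library(lme4)

    ## Laplace-approximate fit of a logistic growth curve with a random asymptote
    ## per tree; the three-part formula is response ~ nonlinear model ~ random effects.
    fm_nlmm <- nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym | Tree,
                     data = Orange,
                     start = c(Asym = 200, xmid = 700, scal = 350))
    summary(fm_nlmm)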
5 Details of the implementation

5.1 Implementation details for linear mixed models

The crucial step in implementing algorithms for determining ML or REML estimates of the parameters in a linear mixed model is evaluating the factorization (15) for any θ satisfying θ_L ≤ θ ≤ θ_U. We will assume that Z is sparse as is ZΛ(θ). When X is not sparse we will use the factorization (15) setting P_X = I_p and storing L_{XZ} and L_X as dense matrices. The permutation matrix P_Z is determined from the pattern of non-zeros in ZΛ(θ), which does not depend on θ, as long as θ is not on the boundary. In fact, in most cases the pattern of non-zeros in ZΛ(θ) is the same as the pattern of non-zeros in Z. For many models, in particular models with scalar random effects (described later), the matrix Λ(θ) is diagonal.

Given a value of θ we determine the Cholesky factor L_Z satisfying
\[
  L_Z L_Z' = P_Z\bigl(\Lambda(\theta)'Z'Z\Lambda(\theta) + I_q\bigr)P_Z'. \tag{54}
\]
The CHOLMOD package allows for L_Z to be calculated directly from Λ(θ)'Z' or from Λ(θ)'Z'ZΛ(θ). The choice in implementation is whether to store Z' and update it to Λ(θ)'Z' or to store Z'Z and use it to form Λ(θ)'Z'ZΛ(θ) at each evaluation. In the lme4 package we store Z' and use it to form Λ(θ)'Z' from which L_Z is evaluated.

There are two reasons for this choice. First, the calculations for the more general forms of mixed models cannot be reduced to calculations involving Z'Z and by expressing these calculations in terms of Λ(θ)'Z' for linear mixed models we can reuse the code for the more general models. Second, the calculation of Λ(θ)'(Z'Z)Λ(θ) from Z'Z is complicated compared to the calculation of Λ(θ)'Z' from Z'. This choice is disadvantageous when n ≫ q because Z' is much larger than Z'Z, even when they are stored as sparse matrices. Evaluation of L_Z directly from Z' requires more storage and more calculation than evaluating L_Z from Z'Z.

Next we evaluate L_{XZ}' as the solution to
\[
  L_Z L_{XZ}' = P_Z \Lambda(\theta)'Z'X. \tag{55}
\]
Again we have the choice of calculating and storing Z'X or storing X and using it to re-evaluate Z'X. In the lme4 package we store X, because the calculations for the more general models cannot be expressed in terms of Z'X. Finally L_X is evaluated as the (dense) solution to
\[
  L_X L_X' = X'X - L_{XZ}L_{XZ}', \tag{56}
\]
from which β̃ can be determined as the solution to the dense system
\[
  L_X L_X' \tilde{\beta} = X'y \tag{57}
\]
and ũ as the solution to the sparse system
\[
  L_Z L_Z' \tilde{u} = \Lambda(\theta)'Z'y. \tag{58}
\]
For many models, in particular models with scalar random effects, which are described later, the matrix Λ(θ) is diagonal. For such a model, if both Z and X are sparse and we plan to use the REML criterion then we create and store
\[
  A = \begin{bmatrix} Z'Z & Z'X \\ X'Z & X'X \end{bmatrix}
  \quad\text{and}\quad
  c = \begin{bmatrix} Z'y \\ X'y \end{bmatrix} \tag{59}
\]
and determine a fill-reducing permutation, P, for A. Given a value of θ we create the factorization
\[
  L(\theta)L(\theta)' = P\left(
    \begin{bmatrix} \Lambda(\theta)' & 0 \\ 0 & I_p \end{bmatrix}
    A
    \begin{bmatrix} \Lambda(\theta) & 0 \\ 0 & I_p \end{bmatrix}
    +
    \begin{bmatrix} I_q & 0 \\ 0 & 0 \end{bmatrix}
  \right)P', \tag{60}
\]
solve for ũ(θ) and β̃(θ) in
\[
  LL'P\begin{bmatrix} \tilde{u}(\theta) \\ \tilde{\beta}(\theta) \end{bmatrix}
  = P\begin{bmatrix} \Lambda(\theta)' & 0 \\ 0 & I_p \end{bmatrix} c, \tag{61}
\]
then evaluate d̃(y, θ) and the profiled REML criterion as
\[
  \tilde{c}_R(\theta|y) = \log\bigl(|L(\theta)|^2\bigr) + (n-p)\left(1 + \log\left(\frac{2\pi\tilde{d}(y, \theta)}{n-p}\right)\right). \tag{62}
\]

References

Douglas M. Bates and Donald G. Watts. Nonlinear Regression Analysis and Its Applications. Wiley, 1988.

Tim Davis. CHOLMOD: sparse supernodal Cholesky factorization and update/downdate. http://www.cise.ufl.edu/research/sparse/cholmod, 2005.

Timothy A. Davis. Direct Methods for Sparse Linear Systems. Fundamentals of Algorithms. SIAM, 2006.

A Notation

A.1 Random variables in the model

B: Random-effects vector of dimension q, B ~ N(0, σ²Λ(θ)Λ(θ)').

U: "Spherical" random-effects vector of dimension q, U ~ N(0, σ²I_q), B = Λ(θ)U.

Y: Response vector of dimension n.

A.2 Parameters of the model

β: Fixed-effects parameters (dimension p).

θ: Parameters determining the left factor, Λ(θ), of the relative covariance matrix of B (dimension m).

σ: The common scale parameter; not used in some generalized linear mixed models and generalized nonlinear mixed models.

A.3 Dimensions

m: Dimension of the parameter θ.

n: Dimension of the response vector, y, and the random variable, Y.

p: Dimension of the fixed-effects parameter, β.

q: Dimension of the random effects, B or U.

s: Dimension of the per-observation parameter vector passed to the nonlinear model function.

A.4 Matrices

L: Left Cholesky factor of a positive-definite symmetric matrix. L_Z is q × q; L_X is p × p.

P: Fill-reducing permutation for the random-effects model matrix. (Size q × q.)

Λ: Left factor of the relative covariance matrix of the random effects. (Size q × q.)

X: Model matrix for the fixed-effects parameters, β. (Size (ns) × p.)

Z: Model matrix for the random effects. (Size (ns) × q.)

B Integrating a quadratic deviance expression

In (6) we defined the likelihood of the parameters given the observed data as
\[
  L(\theta, \beta, \sigma|y) = \int_{\mathbb{R}^q} h(u|y, \theta, \beta, \sigma)\, du,
\]
which is often alarmingly described as "an intractable integral". In point of fact, this integral can be evaluated exactly in the case of a linear mixed model and can be approximated quite accurately for other forms of mixed models.
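A sketch of the evaluation in the linear mixed model case, in the notation of §3.2: the only u-dependent part of the quadratic form in (18) is ‖L_Z'P_Z(u − ũ) + L_{XZ}'P_X(β − β̃)‖², and the constant shift L_{XZ}'P_X(β − β̃) does not change the value of the integral over u, so the evaluation rests on the standard multivariate Gaussian integral
\[
  \int_{\mathbb{R}^q} \exp\!\left(-\frac{\|L_Z'P_Z(u - \tilde{u})\|^2}{2\sigma^2}\right) du
  = \frac{(2\pi\sigma^2)^{q/2}}{|L_Z|},
\]
which follows from the change of variable v = L_Z'P_Z(u − ũ)/σ, whose Jacobian determinant is |L_Z|/σ^q (|P_Z| = 1 for a permutation matrix). Applying this to h in (18) removes q of the n + q factors of 2πσ² and contributes the log(|L_Z|²) term, leaving exactly the log-likelihood shown in (19).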