In this vignette we describe the formulation of these models and the compu tational approach used to evaluate or approximate the loglikelihood of a modeldataparameter value combination 1 Introduction The lme4 package provides functions to 64257t and ID: 23065
Download Pdf The PPT/PDF document "Computational methods for mixed models D..." is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
ComputationalmethodsformixedmodelsDouglasBatesDepartmentofStatisticsUniversityofWisconsin{MadisonOctober13,2020AbstractThelme4packageprovidesRfunctionstotandanalyzeseveraldierenttypesofmixed-eectsmodels,includinglinearmixedmodels,generalizedlinearmixedmodelsandnonlinearmixedmodels.Inthisvignettewedescribetheformulationofthesemodelsandthecompu-tationalapproachusedtoevaluateorapproximatethelog-likelihoodofamodel/data/parametervaluecombination.1IntroductionThelme4packageprovidesfunctionstotandanalyzelinearmixedmodels,generalizedlinearmixedmodelsandnonlinearmixedmodels.Thesemodelsarecalledmixed-eectsmodelsor,moresimply,mixedmodelsbecausetheyincorporatebothxed-eectsparameters,whichapplytoanentirepopula-tionortocertainwell-denedandrepeatablesubsetsofapopulation,andrandomeects,whichapplytotheparticularexperimentalunitsorobser-vationalunitsinthestudy.Suchmodelsarealsocalledmodelsbecausetherandomeectsrepresentlevelsofvariationinadditiontotheper-observationnoisetermthatisincorporatedincommonstatisticalmod-elssuchaslinearregressionmodels,generalizedlinearmodelsandnonlinearregressionmodels.Webeginbydescribingcommonpropertiesofthesemixedmodelsandthegeneralcomputationalapproachusedinthelme4package.Theestimatesoftheparametersinamixedmodelaredeterminedasthevaluesthatoptimize1 anobjectivefunction|eitherthelikelihoodoftheparametersgiventheobserveddata,formaximumlikelihood(ML)estimates,orarelatedobjectivefunctioncalledtheREMLcriterion.Becausethisobjectivefunctionmustbeevaluatedatmanydierentvaluesofthemodelparametersduringtheoptimizationprocess,wefocusontheevaluationoftheobjectivefunctionandacriticalcomputationinthisevalution|determiningthesolutiontoapenalized,weightedleastsquares(PWLS)problem.ThedimensionofthesolutionofthePWLSproblemcanbeverylarge,perhapsinthemillions.Furthermore,suchproblemsmustbesolvedrepeat-edlyduringtheoptimizationprocesstodetermineparameterestimates.ThewholeapproachwouldbeinfeasiblewereitnotforthefactthatthematricesdeterminingthePWLSproblemaresparseandwecanusesparsematrixstorageformatsandsparsematrixcomputations(Davis,2006).Inparticu-lar,thewholecomputationalapproachhingesontheextraordinarilyecientmethodsfordeterminingtheCholeskydecompositionofsparse,symmetric,positive-denitematricesembodiedintheCHOLMODlibraryofCfunctions(Davis,2005).Inthenextsectionwedescribethegeneralformofthemixedmodelsthatcanberepresentedinthelme4packageandthecomputationalapproachembodiedinthepackage.Inthefollowingsectionwedescribeaparticularformofmixedmodel,calledalinearmixedmodel,andthecomputationaldetailsforthosemodels.Inthefourthsectionwedescribecomputationalmethodsforgeneralizedlinearmixedmodels,nonlinearmixedmodelsandgeneralizednonlinearmixedmodels.2FormulationofmixedmodelsAmixed-eectsmodelincorporatestwovector-valuedrandomvariables:then-dimensionalresponsevector,Y,andtheq-dimensionalrandomeectsvec-tor,B.Weobservethevalue,y,ofY.WedonotobservethevalueofB.TherandomvariableYmaybecontinuousordiscrete.Thatis,theobserveddata,y,maybeonacontinuousscaleortheymaybeonadiscretescale,suchasbinaryresponsesorresponsesrepresentingacount.Inourformulation,therandomvariableBisalwayscontinous.WespecifyamixedmodelbydescribingtheunconditionaldistributionofBandtheconditionaldistribution(YjB=b).2 2.1TheunconditionaldistributionofBInourformulation,theunconditionaldistributionofBisalwaysaq-dimensionalmultivariateGaussian(or\normal")distributionwithmean0andwithapa-rameterizedcovariancematrix,BN 0;2()0():(1)Thescalar,,in(1),iscalledthecommonscaleparameter.Aswewillseelater,notalltypesofmixedmodelsincorporatethisparameter.Wewillinclude2inthegeneralformoftheunconditionaldistributionofBwiththeunderstandingthat,insomemodels,1.Theqqmatrix(),whichisaleftfactorofthecovariancematrix(when=1)ortherelativecovariancematrix(when=1),dependsonanm-dimensionalparameter.Typicallymq;intheexamplesweshowbelowitisalwaysthecasethatm5,evenwhenqisinthethousands.Thefactthatmisverysmallisimportantbecause,asweshallsee,deter-miningtheparameterestimatesinamixedmodelcanbeexpressedasanoptimizationproblemwithrespecttoonly.Theparametermaybe,andtypicallyis,subjecttoconstraints.Foreaseofcomputation,werequirethattheconstraintsbeexpressedas\box"constraintsoftheformiLiiU;i=1;:::;mforconstantsiLandiU;i=1;:::;m.WeshallwritethesetofsuchconstraintsasLR.Thematrix()isrequiredtobenon-singular(i.e.invertible)whenisnotontheboundary.2.2Theconditionaldistribution,(YjB=b)Theconditionaldistribution,(YjB=b),mustsatisfy:1.Theconditionalmean,YjB(b)=E[YjB=b],dependsonbonlythroughthevalueofthelinearpredictor,Zb+X,whereisthep-dimensionalxed-eectsparametervectorandthemodelmatrices,ZandX,arexedmatricesoftheappropriatedimension.Thatis,thetwomodelmatricesmusthavethesamenumberofrowsandmusthaveqandpcolumns,respectively.ThenumberofrowsinZandXisamultipleofn,thedimensionofy.2.Thescalardistributions,(YijB=b);i=1;:::;n,allhavethesameformandarecompletelydeterminedbytheconditionalmean,YjB(b)3 and,atmost,oneadditionalparameter,,whichisthecommonscaleparameter.3.Thescalardistributions,(YijB=b);i=1;:::;n,areindependent.Thatis,thecomponentsofYareconditionallyindependentgivenB.Animportantspecialcaseoftheconditionaldistributionisthemultivari-ateGaussiandistributionoftheform(YjB=b)N(Zb+X;2In)(2)whereIndenotestheidentitymatrixofsizen.Inthiscasetheconditionalmean,YjB(b),isexactlythelinearpredictor,Zb+X,asituationwewilllaterdescribeasbeingan\identitylink"betweentheconditionalmeanandthelinearpredictor.Modelswithconditionaldistribution(2)arecalledlinearmixedmodels.2.3Achangeofvariableto\spherical"randomeectsBecausetheconditionaldistribution(YjB=b)dependsonbonlythroughthelinearpredictor,itiseasytoexpressthemodelintermsofalineartrans-formationofB.Wedenethelineartransformationfromaq-dimensional\spherical"Gaussianrandomvariable,U,toBasB=()U;UN(0;2Iq):(3)(Theterm\spherical"referstothefactthatcontoursofconstantprobabilitydensityforUarespherescenteredatthemean|inthiscase,0.)Whenisnotontheboundarythisisaninvertibletransformation.Whenisontheboundarythetransformationcanfailtobeinvertible.However,wewillonlyneedtobeabletoexpressBintermsofUandthattransformationiswell-dened,evenwhenisontheboundary.Thelinearpredictor,asafunctionof,is()=Z()+X:(4)Whenwewishtoemphasizetheroleofthemodelparameters,and,intheformulationof,wewillwritethelinearpredictoras(;;).4 2.4Theconditionaldensity(UjY=y)Becauseweobserveyanddonotobservebor,theconditionaldistributionofinterest,forthepurposesofstatisticalinference,is(UjY=y)(or,equiv-alently,(BjY=y)).ThisconditionaldistributionisalwaysacontinuousdistributionwithconditionalprobabilitydensityfUjY(jy).WecanevaluatefUjY(jy),uptoaconstant,astheproductoftheuncon-ditionaldensity,fU(),andtheconditionaldensity(ortheprobabilitymassfunction,whicheverisappropriate),fYjU(yj).Wewritethisunnormalizedconditionaldensityash(jy;;;)=fYjU(yj;;;)fU(j):(5)Wesaythathisthe\unnormalized"conditionaldensitybecauseallweknowisthattheconditionaldensityisproportionaltoh(jy;;;).ToobtaintheconditionaldensitywemustnormalizehbydividingbythevalueoftheintegralL(;;jy)=ZRqh(jy;;;)d:(6)Wewritethevalueoftheintegral(6)asL(;;jy)becauseitisexactlythelikelihoodoftheparameters,and,giventheobserveddatay.Themaximumlikelihood(ML)estimatesoftheseparametersarethevaluesthatmaximizeL.2.5DeterminingtheMLestimatesThegeneralproblemofmaximizingL(;;jy)withrespectto,andcanbeformidablebecauseeachevaluationofthisfunctioninvolvesapo-tentiallyhigh-dimensionalintegralandbecausethedimensionofcanbelarge.However,thisgeneraloptimizationproblemcanbesplitintomanage-ablesubproblems.Givenavalueofwecandeterminetheconditionalmode,~(),ofandtheconditionalestimate,~()simultaneouslyusingpenalized,iterativelyre-weightedleastsquares(PIRLS).Theconditionalmodeandtheconditionalestimatearedenedas~()~()=argmaxu;h(jy;;;):(7)(Itmaylookasifwehavemissedthedependenceonontheleft-handsidebutitturnsoutthatthescaleparameterdoesnotaectthelocationoftheoptimalvaluesofquantitiesinthelinearpredictor.)5 Asiscommoninsuchoptimizationproblems,were-expressthecondi-tionaldensityonthedeviancescale,whichisnegativetwicethelogarithmofthedensity,wheretheoptimizationbecomes~()~()=argminu; 2log(h(jy;;;)):(8)ItisthisoptimizationproblemthatcanbesolvedquiteecientlyusingPIRLS.Infact,forlinearmixedmodels,whicharedescribedinthenextsection,~()and~()canbedirectlyevaluated.Thesecond-orderTaylorseriesexpansionof 2loghat~()and~()providestheLaplaceapproximationtotheproleddeviance.OptimizingthisfunctionwithrespecttoprovidestheMLestimatesof,fromwhichtheMLestimatesofand(ifused)arederived.3MethodsforlinearmixedmodelsAsindicatedintheintroduction,acriticalstepinourmethodsfordetermin-ingthemaximumlikelihoodestimatesoftheparametersinamixedmodelissolvingapenalized,weightedleastsquares(PWLS)problem.WewillmotivatethegeneralformofthePWLSproblembyrstconsideringcom-putationalmethodsforlinearmixedmodelsthatresultinapenalizedleastsquares(PLS)problem.Recallfromx2.2that,inalinearmixedmodel,boththeconditionaldistribution,(YjU=),andtheunconditionaldistribution,U,aresphericalGaussiandistributionsandthattheconditionalmean,YjU(),isthelinearpredictor,().Becauseallthedistributionsdeterminingthemodelarecontinuousdistributions,weconsidertheirdensities.Onthedeviancescaletheseare 2log(fU())=qlog(22)+2 2 2log(fYjU(yj))=nlog(22)+y Z() X2 2 2log(h(jy;;;))=(n+q)log(22)+y (;;)2+2 2=(n+q)log(22)+d(jy;;) 2(9)6 In(9)thediscrepancyfunction,d(jy;;)=y (;;)2+2(10)hastheformofapenalizedresidualsumofsquaresinthattherstterm,y (;;)2istheresidualsumofsquaresfory,,andandthesecondterm,2,isapenaltyonthesizeof.Noticethatthediscrepancydoesnotdependonthecommonscaleparameter,.3.1ThecanonicalformofthediscrepancyUsingaso-called\pseudodata"representation,wecanwritethediscrepancyasaresidualsumofsquaresforaregressionmodelthatislinearinbothandd(jy;;)=\r\r\r\ry0 Z()XIq0\r\r\r\r2:(11)Theterm\pseudodata"re\rectsthefactthatwehaveaddedq\pseudoobser-vations"totheobservedresponse,y,andtothelinearpredictor,(;;)=Z()+X,insuchawaythattheircontributiontotheoverallresidualsumofsquaresisexactlythepenaltyterminthediscrepancy.Intheform(11)wecanseethatthediscrepancyisaquadraticforminbothand.Furthermore,becausewerequirethatXhasfullcolumnrank,thediscrepancyisapositive-denitequadraticforminandthatisminimizedat~()and~()satisfying0()Z0Z()+Iq0()Z0XX0Z()X0X~()~()=0()Z0yX0y(12)Aneectivewayofdeterminingthesolutiontoasparse,symmetric,pos-itivedenitesystemofequationssuchas(12)isthesparseCholeskyde-composition(Davis,2006).IfAisasparse,symmetricpositivedenitematrixthenthesparseCholeskyfactorwithll-reducingpermutationPisthelower-triangularmatrixLsuchthatLL0=PAP0:(13)(Technically,thefactorLisonlydetermineduptochangesinthesignofthediagonalelements.Byconventionwerequirethediagonalelementstobepositive.)7 Thell-reducingpermutationrepresentedbythepermutationmatrixP,whichisdeterminedfromthepatternofnonzerosinAbutdoesnotdependonparticularvaluesofthosenonzeros,canhaveaprofoundimpactonthenumberofnonzerosinLandhenceonthespeedwithwhichLcanbecalculatedfromA.InmostapplicationsoflinearmixedmodelsthematrixZ()issparsewhileXisdenseorclosetoitsothepermutationmatrixPcanberestrictedtotheformP=PZ00PX(14)withoutlossofeciency.Infact,inmostcaseswecansetPX=Ipwithoutlossofeciency.Letusassumethatthepermutationmatrixisrequiredtobeoftheform(14)sothatwecanwritetheCholeskyfactorizationforthepositivedenitesystem(12)asLZ0LXZLXLZ0LXZLX0=PZ00PX0()Z0Z()+Iq0()Z0XX0Z()X0XPZ00PX0:(15)Thediscrepancycannowbewritteninthecanonicalformd(jy;;)=~d(y;)+\r\r\r\rL0ZL0XZ0L0XPZ( ~)PX( ~)\r\r\r\r2(16)where~d(y;)=d(~()jy;;~())(17)istheminimumdiscrepancy,given.3.2TheproledlikelihoodforlinearmixedmodelsSubstituting(16)into(9)providestheunnormalizedconditionaldensityh(jy;;;)onthedeviancescaleas 2log(h(jy;;;))=(n+q)log(22)+~d(y;)+\r\r\r\rL0ZL0XZ0L0XPZ( ~)PX( ~)\r\r\r\r2 2:(18)8 AsshowninAppendixB,theintegralofaquadraticformonthedeviancescale,suchas(18),iseasilyevaluated,providingthelog-likelihood,`(;;jy),as 2`(;;jy)= 2log(L(;;jy))=nlog(22)+log(jLZj2)+~d(y;)+\r\r\rL0XPX( ~)\r\r\r2 2;(19)fromwhichwecanseethattheconditionalestimateof,given,is~()andtheconditionalestimateof,given,is~2()=~d(jy) n:(20)Substitutingtheseconditionalestimatesinto(19)producestheproledlike-lihood,~L(jy),as 2~`(jy))=log(jLZ()j2)+n 1+log 2~d(y;) n!!:(21)ThemaximumlikelihoodestimateofcanthenbeexpressedasbL=argmin 2~`(jy):(22)fromwhichtheMLestimatesof2andareevaluatedas2L=~d(bL;y) n(23)bL=~(bL):(24)Theimportantthingtonoteaboutoptimizingtheproledlikelihood,(21),isthatitisam-dimensionaloptimizationproblemandtypicallymisverysmall.3.3TheREMLcriterionInpracticetheso-calledREMLestimatesofvariancecomponentsareoftenpreferredtothemaximumlikelihoodestimates.(\REML"canbeconsid-eredtobeanacronymfor\restricted"or\residual"maximumlikelihood,9 althoughneithertermiscompletelyaccuratebecausetheseestimatesdonotmaximizealikelihood.)WecanmotivatetheuseoftheREMLcriterionbyconsideringalinearregressionmodel,YN(X;2In);(25)inwhichwetypicallyestimate2by2R=y Xb2 n p(26)eventhoughthemaximumlikelihoodestimateof2is2L=y Xb2 n:(27)Theargumentforpreferring2Rto2Lasanestimateof2isthatthenumeratorinbothestimatesisthesumofsquaredresidualsatband,al-thoughtheresidualvectory Xisann-dimensionalvector,theresidualatbsatisesplinearlyindependentconstraints,X0(y Xb)=0.Thatis,theresidualatbistheprojectionoftheobservedresponsevector,y,intoan(n p)-dimensionallinearsubspaceofthen-dimensionalresponsespace.Theestimate2Rtakesintoaccountthefactthat2isestimatedfromresidualsthathaveonlyn pdegreesoffreedom.TheREMLcriterionfordeterminingparameterestimatesbRand2Rinalinearmixedmodelhasthepropertythattheseestimateswouldspecializeto2Rfrom(26)foralinearregressionmodel.Althoughnotusuallyderivedinthisway,theREMLcriterioncanbeexpressedascR(;jy)= 2logZRpL(jy;;;)d(28)onthedeviancescale.TheREMLestimatesbRand2RminimizecR(;jy).TheproledREMLcriterion,afunctionofonly,is~cR(jy)=log(jLZ()j2jLX()j2)+(n p) 1+log 2~d(jy) n p!!(29)andtheREMLestimateofisbR=argmin~cR(;y):(30)10 TheREMLestimateof2is2R=~d(bRjy)=(n p).Itisnotentirelyclearhowonewoulddenea\REMLestimate"ofbecausetheREMLcriterion,cR(;jy),denedin(28),doesnotdependon.However,itiscustomary(andnotunreasonable)tousebR=~(bR)astheREMLestimateof.NotethattheproledREMLcriterioncanbeevaluatedfromasparseCholeskydecompositionlikethatin(15)butwithouttherequirementthatthepermutationcanbeappliedtothecolumnsofZ()separatelyfromthecolumnnsofX.Thatis,wecanuseageneralll-reducingpermutationratherthanthespecicform(14)withseparatepermutationsrepresentedbyPZandPX.ThiscanbeusefulincaseswherebothZandXarelargeandsparse.3.4SummaryforlinearmixedmodelsAlinearmixedmodelischaracterizedbytheconditionaldistribution(YjU=)N((;;);2In)where(;;)=Z()+X(31)andtheunconditionaldistributionUN(0;2Iq).Thediscrepancyfunc-tion,d(jy;;)=y (;;)2+2;isminimizedattheconditionalmode,~(),andtheconditionalestimate,~(),whicharethesolutionstothesparse,positive-denitelinearsystem0()Z0Z()+Iq0()Z0XX0Z()X0X~()~()=0()Z0yX0y:IntheprocessofsolvingthissystemwecreatethesparseleftCholeskyfactor,LZ(),whichisalowertriangularsparsematrixsatisfyingLZ()LZ()0=PZ(0()Z0Z()+Iq)P0ZwherePZisapermutationmatrixrepresentingall-reducingpermutationformedfromthepatternofnonzerosinZ()foranynotontheboundaryoftheparameterregion.(Thevaluesofthenonzerosdependonbutthepatterndoesn't.)11 Theproledlog-likelihood,~`(jy),is 2~`(jy)=log(jLZ()j2)+n 1+log 2~d(y;) n!!where~d(y;)=d(~()jy;~();).4GeneralizingthediscrepancyfunctionBecauseoneofthefactorsin\ruencingthechoiceofimplementationforlin-earmixedmodelsistheextenttowhichthemethodscanalsobeappliedtoothermixedmodels,wedescribeseveralotherclassesofmixedmodelsbeforediscussingtheimplementationdetailsforlinearmixedmodels.Atthecoreofourmethodsfordeterminingthemaximumlikelihoodestimates(MLEs)oftheparametersinthemixedmodelaremethodsforminimizingthediscrep-ancyfunctionwithrespecttothecoecientsandinthelinearpredictor(;;).Inthissectionwedescribethegeneralformofthediscrepancyfunctionthatwewilluseandapenalizediterativelyreweightedleastsquares(PIRLS)algorithmfordeterminingtheconditionalmodes~()and~().Wethendescribeseveraltypesofmixedmodelsandtheformofthediscrepancyfunc-tionforeach.4.1AweightedresidualsumofsquaresAsshowninx3.1,thediscrepancyfunctionforalinearmixedmodelhastheformofapenalizedresidualsumofsquaresfromalinearmodel(11).Inthissectionwegeneralizethatdenitiontod(jy;;)=\r\r1=2()y YjU(;;)\r\r2+0 2:(32)whereisannndiagonalmatrix,calledtheweightsmatrix,withpositivediagonalelementsand1=2isthediagonalmatrixwiththesquarerootsoftheweightsonthediagonal.Thethweightisinverselyproportionaltotheconditionalvariancesof(YjU=)andmaydependontheconditionalmean,YjU.12 Weallowtheconditionalmeantobeanonlinearfunctionofthelinearpredictor,butwithcertainrestrictions.WerequirethatthemappingfromtoYjU=ubeexpressedas(33)where=Z()+Xisanns-dimensionalvector(s0)whileandaren-dimensionalvectors.Themaphasthepropertythatidependsonlyoni,=1;:::;n.Themaphasasimilarpropertyinthat,ifwewriteasannsmatrix suchthat=vec (34)(i.e.concatenatingthecolumnsof produces)thenidependsonlyonthethrowof ,=1;:::;n.ThustheJacobianmatrixd d0isannndiagonalmatrixandtheJacobianmatrixd d\r0isthehorizontalconcatenationofsdiagonalnnmatrices.Forhistoricalreasons,thefunctionthatmapsitoiiscalledtheinversefunctionandiswritten=g 1().Thelinkfunction,naturally,is=g().Whenappliedcomponent-wisetovectorsorwewritetheseas=g()and=g 1().Recallthattheconditionaldistribution,(YijU=),isrequiredtobeindependentof(YjjU=)fori;j=1;:::;n;i=jandthatallthecom-ponentconditionaldistributionsmustbeofthesameformanddieronlyaccordingtothevalueoftheconditionalmean.Dependingonthefamilyoftheconditionaldistributions,theallowablevaluesoftheimaybeinarestrictedrange.Forexample,iftheconditionaldistributionsareBernoullithen0i1;i=1;:::;n.IftheconditionaldistributionsarePoissonthen0i;i=1;:::;n.Acharacteristicofthelinkfunction,g,isthatitmustmaptherestrictedrangetoanunrestrictedrange.Thatis,alinkfunctionfortheBernoullidistributionmustmap[0;1]to[ 1;]andmustbeinvertiblewithintherange.Themappingfromtoisdenedbyafunctionm:RsR,calledthenonlinearmodelfunction,suchthati=m(i);i=1;:::;nwhereiisthethrowof .Thevector-valuedfunctionis=m().Determiningtheconditionalmodes,~(yj),and~(yj),thatjointlymin-imizethediscrepancy,~(yj)~(yj)=argminu;(y )0(y )+2(35)13 becomesaweighted,nonlinearleastsquaresproblemexceptthattheweights,,candependonand,hence,onand.Indescribinganalgorithmforlinearmixedmodelswecalled~()theconditionalestimate.Thatnamere\rectsthatfactthatthisisthemaximumlikelihoodestimateofforthatparticularvalueof.Oncewehavedeter-minedtheMLE,b()Lof,wehavea\plug-in"estimator,bL=~()for.Thispropertydoesnotcarryoverexactlytootherformsofmixedmodels.Thevalues~()and~()areconditionalmodesinthesensethattheyarethecoecientsinthatjointlymaximizetheunscaledconditionaldensityh(jy;;;).Hereweareusingtheadjective\conditional"moreinthesenseofconditioningonY=ythaninthesenseofconditioningon,althoughthesevaluesaredeterminedforaxedvalueof.4.2ThePIRLSalgorithmfor~and~Thepenalized,iterativelyreweighted,leastsquares(PIRLS)algorithmtodetermine~()and~()isaformoftheFisherscoringalgorithm.Wextheweightsmatrix,,andusepenalized,weighted,nonlinearleastsquarestominimizethepenalized,weightedresidualsumofsquaresconditionalontheseweights.Thenweupdatetheweightstothosedeterminedbythecurrentvalueofanditerate.Todescribethisalgorithminmoredetailwewilluseparenthesizedsuper-scriptstodenotetheiterationnumber.Thus(0)and(0)aretheinitialval-uesoftheseparameters,while(i)and(i)arethevaluesatthethiteration.Similarly(i)=Z()(i)+X(i),(i)=m((i))and(i)=g 1((i)).WeuseapenalizedversionoftheGauss-Newtonalgorithm(BatesandWatts,1988,ch.2)forwhichwedenetheweightedJacobianmatricesU(i)=1=2d d0u=u(i);=(i)=1=2d d0(i)d d0\r(i)Z()(36)V(i)=1=2d d0u=u(i);=(i)=1=2d d0(i)d d0\r(i)X(37)ofdimensionnqandnp,respectively.Theincrementsatthethiteration,(i)uand(i),arethesolutionstoU(i)0U(i)+IqU(i)0V(i)V(i)0U(i)V(i)0V(i)"(i)u(i)#=U(i)01=2(y (i)) (i)V(i)01=2(y (i))(38)14 providingtheupdatedparametervalues(i+1)(i+1)=(i)(i)+"(i)u(i)#(39)where0isastepfactorchosentoensurethat(y (i+1))0(y (i+1))+(i+1)2(y (i))0(y (i))+(i)2:(40)Intheprocessofsolvingfortheincrementsweformthesparse,lowertriangular,Choleskyfactor,L(i),satisfyingL(i)L(i)0=PZU(i)0U(i)+InP0Z:(41)Aftereachsuccessfuliteration,determiningnewvaluesofthecoecients,(i+1)and(i+1),thatreducethepenalized,weightedresidualsumofsquqres,weupdatetheweightsmatrixto((i+1))andtheweightedJacobians,U(i+1)andV(i+1),theniterate.Convergenceisdeterminedaccordingtotheorthogonalityconvergencecriterion(BatesandWatts,1988,ch.2),suitablyadjustedfortheweightsmatrixandthepenalty.4.3WeightedlinearmixedmodelsOneofthesimplestgeneralizationsoflinearmixedmodelsisaweightedlinearmixedmodelwheres=1,thelinkfunction,g,andthenonlinearmodelfunction,m,areboththeidentity,theweightsmatrix,,isconstantandtheconditionaldistributionfamilyisGaussian.Thatis,theconditionaldistributioncanbewritten(YjU=)N((;;);2 1)(42)withdiscrepancyfunctiond(jy;;)=\r\r1=2(y Z() X)\r\r2+2:(43)Theconditionalmode,~(),andtheconditionalestimate,~(),arethesolutionsto0()Z0WZ()+Iq0()Z0WXX0WZ()X0WX~()~()=0()Z0WyX0Wy;(44)15 whichcanbesolveddirectly,andtheCholeskyfactor,LZ(),satisesLZ()LZ()0=PZ(0()Z0WZ()+Iq)P0Z:(45)Theproledlog-likelihood,~`(jy),is 2~`(jy)=logjLZ()j2 jj+n 1+log 2~d(y;) n!!:(46)Ifthematrixisxedthenwecanignorethetermjjin(46)whendeterminingtheMLE,bL.However,insomemodels,weuseaparameterizedweightmatrix,(),andwishtodeterminetheMLEs,bLandbLsimul-taneously.Inthesecaseswemustincludetheterminvolvingj()jwhenevaluatingtheproledlog-likelihood.Notethatwemustdenetheparameterizationof()suchthat2andarenotaredundantparameterizationof2().Forexample,wecouldrequirethattherstdiagonalelementofbeunity.4.4NonlinearmixedmodelsInanunweighted,nonlinearmixedmodeltheconditionaldistributionisGaussian,thelink,g,istheidentityandtheweightsmatrix,=In.Thatis,(YjU=)N(m();2In)(47)withdiscrepancyfunctiond(jy;;)=y 2+2:(48)Foragivenvalueofwedeterminetheconditionalmodes,~()and~(),asthesolutiontothepenalizednonlinearleastsquaresproblem~()~()=argminu;d(jy;;)(49)andwewritetheminimumdiscrepancy,givenyand,as~d(y;)=d(~()jy;;~()):(50)16 Let~LZ()and~LX()betheCholeskyfactorsat,~()and~().ThentheLaplaceapproximationtothelog-likelihoodis 2`P(;;jy)nlog(22)+log(j~LZj2)+~d(y;)+\r\r\r~L0X( ~)\r\r\r2 2;(51)producingtheapproximateproledlog-likelihood,~`P(jy), 2~`P(jy)log(j~LZj2)+n1+log(2~d(y;)=n):(52)4.4.1NonlinearmixedmodelsummaryInanonlinearmixedmodelwedeterminetheparameterestimate,bP,fromtheLaplaceapproximationtothelog-likelihoodasbP=argmax~`P(jy)=argminlog(j~LZj2)+n1+log(2~d(y;)=n):(53)Eachevaluationof~`P(jy)requiresasolvingthepenalizednonlinearleastsquaresproblem(49)simultaneouslywithrespecttobothsetsofcoecients,and,inthelinearpredictor,.Foraweightednonlinearmixedmodelwithxedweights,,wereplacetheunweighteddiscrepancyfunctiond(jy;;)withtheweighteddiscrep-ancyfunction,5Detailsoftheimplementation5.1ImplementationdetailsforlinearmixedmodelsThecrucialstepinimplementingalgorithmsfordeterminingMLorREMLestimatesoftheparametersinalinearmixedmodelisevaluatingthefac-torization(15)foranysatisfyingLU.WewillassumethatZissparseasisZ().WhenXisnotsparsewewillusethefactorization(15)settingPX=IpandstoringLXZandLXasdensematrices.ThepermutationmatrixPZisdeterminedfromthepatternofnon-zerosinZ()whichisdoesnotdependon,aslongasisnotontheboundary.Infact,inmostcasesthepatternofnon-zerosinZ()isthesameasthepatternofnon-zerosinZ.17 Formanymodels,inparticularmodelswithscalarrandomeects(describedlater),thematrix()isdiagonal.GivenavalueofwedeterminetheCholeskyfactorLZsatisfyingLZL0Z=PZ(0()Z0Z()+Iq)P0Z:(54)TheCHOLMODpackageallowsforLZtobecalculateddirectlyfrom0()Z0orfrom0()Z0Z().ThechoiceinimplementationiswhethertostoreZ0andupdateitto0()ZortostoreZ0Zanduseittoform0()Z0Z()ateachevaluation.Inthelme4packagewestoreZ0anduseittoform0()Z0fromwhichLZisevaluated.Therearetworeasonsforthischoice.First,thecalculationsforthemoregeneralformsofmixedmodelscannotbereducedtocalculationsinvolvingZ0Zandbyexpressingthesecalculationsintermsof()Z0forlinearmixedmodelswecanreusethecodeforthemoregeneralmodels.Sec-ond,thecalculationof()0(Z0Z)()fromZ0Ziscomplicatedcomparedtothecalculationof()0Z0fromZ0.ThischoiceisdisadvantageouswhennqbecauseZ0ismuchlargerthanZ0Z,evenwhentheyarestoredassparsematrices.EvaluationofLZdirectlyfromZ0requiresmorestorageandmorecalculationthatevaluatingLZfromZ0Z.NextweevaluateL0XZasthesolutiontoLZL0XZ=PZ0()Z0X:(55)AgainwehavethechoiceofcalculatingandstoringZ0XorstoringXandusingittoreevaluateZ0X.Inthelme4packagewestoreX,becausethecalculationsforthemoregeneralmodelscannotbeexpressedintermsofZ0X.FinallyLXisevaluatedasthe(dense)solutiontoLXL0X=X0X LXZLXZ:(56)fromwhich~canbedeterminedasthesolutiontodensesystemLXLX~=X0y(57)and~asthesolutiontothesparsesystemLZLZ~u=0Z0y(58)18 Formanymodels,inparticularmodelswithscalarrandomeects,whicharedescribedlater,thematrix()isdiagonal.Forsuchamodel,ifbothZandXaresparseandweplantousetheREMLcriterionthenwecreateandstoreA=Z0ZZ0XX0ZX0Xandc=Z0yX0y(59)anddetermineall-reducingpermutation,P,forA.GivenavalueofwecreatethefactorizationL()L()0=P()00IpA()00Ip+Iq000P0(60)solvefor~()and~()inLL0P~()~()=P()00Ipc(61)thenevaluate~d(yj)andtheproledREMLcriterionas~dR(jy)=log(jL()j2)+(n p) 1+log 2~d(yj) n p!!:(62)ReferencesDouglasM.BatesandDonaldG.Watts.NonlinearRegressionAnalysisandItsApplications.Wiley,1988.TimDavis.CHOLMOD:sparsesupernodalCholeskyfactorizationandup-date/downdate.http://www.cise.u\r.edu/research/sparse/cholmod,2005.TimothyA.Davis.DirectMethodsforSparseLinearSystems.FundamentalsofAlgorithms.SIAM,2006.ANotationA.1RandomvariablesinthemodelBRandom-eectsvectorofdimensionq,BN(0;2()()0).19 U\Spherical"random-eectsvectorofdimensionq,UN(0;2Iq),B=()U.YResponsevectorofdimensionn.A.2ParametersofthemodelFixed-eectsparameters(dimensionp).Parametersdeterminingtheleftfactor,()oftherelativecovariancematrixofB(dimensionm).thecommonscaleparameter-notusedinsomegeneralizedlinearmixedmodelsandgeneralizednonlinearmixedmodels.A.3Dimensionsmdimensionoftheparameter.ndimensionoftheresponsevector,y,andtherandomvariable,Y.pdimensionofthexed-eectsparameter,.qdimensionoftherandomeects,BorU.sdimensionoftheparametervector,,inthenonlinearmodelfunction.A.4MatricesLLeftCholeskyfactorofapositive-denitesymmetricmatrix.LZisqq;LXispp.PFill-reducingpermutationfortherandomeectsmodelmatrix.(Sizeqq.)Leftfactoroftherelativecovariancematrixoftherandomeects.(Sizeqq.)XModelmatrixforthexed-eectsparameters,.(Size(ns)p.)ZModelmatrixfortherandomeects.(Size(ns)q.)20 BIntegratingaquadraticdevianceexpres-sionIn(6)wedenedthelikelihoodoftheparametersgiventheobserveddataasL(;;jy)=ZRqh(jy;;;)d:whichisoftenalarminglydescribedas\anintractableintegral".Inpointoffact,thisintegralcanbeevaluatedexactlyinthecaseofalinearmixedmodelandcanbeapproximatedquiteaccuratelyforotherformsofmixedmodels.21