Point Estimation: properties of estimators

- finite-sample properties (CB 7.3)
- large-sample properties (CB 10.1)

1 FINITE-SAMPLE PROPERTIES

How an estimator performs for a finite number of observations $n$.

Estimator: $W$. Parameter: $\theta$.

Criteria for evaluating estimators:

- Bias: does $E_\theta W = \theta$?
- Variance of $W$ (you would like an estimator with a smaller variance).

Example: $X_1,\dots,X_n$ i.i.d. $(\mu,\sigma^2)$. Unknown parameters are $\mu$ and $\sigma^2$. Consider:

- $\hat\mu_n \equiv \frac{1}{n}\sum_i X_i$, estimator of $\mu$;
- $\hat\sigma^2_n \equiv \frac{1}{n}\sum_i (X_i-\bar X_n)^2$, estimator of $\sigma^2$.

Bias: $E\hat\mu_n = \frac{1}{n}\,n\mu = \mu$, so unbiased. $V\hat\mu_n = \frac{1}{n^2}\,n\sigma^2 = \frac{\sigma^2}{n}$.

\[
E\hat\sigma^2_n = E\Big(\frac1n\sum_i (X_i-\bar X_n)^2\Big)
= \frac1n\sum_i\big(EX_i^2 - 2EX_i\bar X_n + E\bar X_n^2\big)
= \frac1n\Big[n(\mu^2+\sigma^2) - 2n\Big(\mu^2+\frac{\sigma^2}{n}\Big) + n\Big(\mu^2+\frac{\sigma^2}{n}\Big)\Big]
= \frac{n-1}{n}\sigma^2.
\]

Hence it is biased. To fix this bias, consider the estimator $s^2_n \equiv \frac{1}{n-1}\sum_i(X_i-\bar X_n)^2$; then $Es^2_n = \sigma^2$ (unbiased).

The mean-squared error (MSE) of $W$ is $E_\theta(W-\theta)^2$, a common criterion for comparing estimators. Decompose:
\[
MSE(W) = V_\theta W + (E_\theta W - \theta)^2 = \text{Variance} + (\text{Bias})^2.
\]
Hence, for an unbiased estimator, $MSE(W) = V_\theta W$.

Example: $X_1,\dots,X_n \sim U[0,\theta]$, with $f(x) = 1/\theta$, $x\in[0,\theta]$.

Consider the estimator $\hat\theta_n \equiv 2\bar X_n$. $E\hat\theta_n = 2\frac1n E\sum_i X_i = 2\frac1n\frac{\theta}{2}n = \theta$, so unbiased. $MSE(\hat\theta_n) = V\hat\theta_n = \frac{4}{n^2}\sum_i VX_i = \frac{\theta^2}{3n}$ (since $VX_i = \theta^2/12$).

Consider the estimator $\tilde\theta_n \equiv \max(X_1,\dots,X_n)$. In order to derive moments, start by deriving the CDF: for $0\le z\le\theta$,
\[
P(\tilde\theta_n\le z) = P(X_1\le z, X_2\le z, \dots, X_n\le z) = \prod_{i=1}^n P(X_i\le z) = \Big(\frac{z}{\theta}\Big)^n,
\]
with $P(\tilde\theta_n\le z)=0$ for $z<0$ and $=1$ for $z>\theta$. Therefore $f_{\tilde\theta_n}(z) = \frac{n z^{n-1}}{\theta^n}$ for $0\le z\le\theta$.
\[
E\tilde\theta_n = \int_0^\theta z\,\frac{nz^{n-1}}{\theta^n}\,dz = \frac{n}{\theta^n}\int_0^\theta z^n\,dz = \frac{n}{n+1}\theta.
\]
Bias$(\tilde\theta_n) = -\theta/(n+1)$. Also $E\tilde\theta_n^2 = \frac{n}{\theta^n}\int_0^\theta z^{n+1}dz = \frac{n}{n+2}\theta^2$. Hence
\[
V\tilde\theta_n = \frac{n}{n+2}\theta^2 - \Big(\frac{n}{n+1}\theta\Big)^2 = \frac{n\,\theta^2}{(n+2)(n+1)^2},
\qquad
MSE(\tilde\theta_n) = \frac{2\theta^2}{(n+2)(n+1)}.
\]

Continue the previous example. Redefine $\tilde\theta_n \equiv \frac{n+1}{n}\max(X_1,\dots,X_n)$. Now both estimators $\hat\theta_n$ and $\tilde\theta_n$ are unbiased. Which is better?
\[
V\hat\theta_n = \frac{\theta^2}{3n} = O(1/n),\qquad
V\tilde\theta_n = \Big(\frac{n+1}{n}\Big)^2 V(\max(X_1,\dots,X_n)) = \frac{\theta^2}{n(n+2)} = O(1/n^2).
\]
Hence, for $n$ large enough, $\tilde\theta_n$ has a smaller variance, and in this sense it is better.

Best unbiased estimator: if you choose the best (in terms of MSE) estimator, and restrict yourself to unbiased estimators, then the best estimator is the one with the lowest variance. A best unbiased estimator is also called the uniform minimum variance unbiased estimator (UMVUE). Formally: an estimator $W$ is a UMVUE of $\theta$ if it satisfies:
(i) $E_\theta W = \theta$, for all $\theta$ (unbiasedness);
(ii) $V_\theta W \le V_\theta\tilde W$, for all $\theta$ and all other unbiased estimators $\tilde W$.

The uniform condition is crucial, because it is always possible to find estimators which have zero variance for a specific value of $\theta$. It is difficult in general to verify that an estimator $W$ is UMVUE, since you have to verify condition (ii) of the definition, that $V_\theta W$ is smaller than the variance of every other unbiased estimator. Luckily, we have an important result for the lowest attainable variance of an estimator.

Theorem 7.3.9 (Cramér-Rao Inequality): Let $X_1,\dots,X_n$ be a sample with joint pdf $f(\vec X|\theta)$, and let $W(\vec X)$ be any estimator satisfying
(i) $\frac{d}{d\theta}E_\theta W(\vec X) = \int \frac{\partial}{\partial\theta}\big[W(\vec x)f(\vec x|\theta)\big]\,d\vec x$;
(ii) $V_\theta W(\vec X) < \infty$.
Then
\[
V_\theta W(\vec X) \ge \frac{\big(\frac{d}{d\theta}E_\theta W(\vec X)\big)^2}{E_\theta\big(\frac{\partial}{\partial\theta}\log f(\vec X|\theta)\big)^2}.
\]

The RHS above is called the Cramér-Rao lower bound (CRLB). Proof: CB, pg. 336. In short, we apply the Cauchy-Schwarz inequality $V(S) \ge \mathrm{cov}(S,T)^2/V(T)$ with $S = W(\vec X)$ and $T = \frac{\partial}{\partial\theta}\log f(\vec X|\theta)$.

The choice of $T$ here may seem a bit arbitrary. To get some intuition, consider Cramér's derivation.¹ Start with the following manipulation of the equality $E_\theta W(X) = \int W(X)f(X|\theta)\,dX$:
\[
\frac{d}{d\theta}E_\theta W(X) = \int W(X)\frac{\partial}{\partial\theta}f(X|\theta)\,dX
= \int \big(W(X)-E_\theta W(X)\big)\frac{\partial}{\partial\theta}f(X|\theta)\,dX
\]
(note $\int E_\theta W(X)\frac{\partial}{\partial\theta}f(X|\theta)\,dX = 0$)
\[
= \int \big(W(X)-E_\theta W(X)\big)\frac{\partial}{\partial\theta}\log f(X|\theta)\,f(X|\theta)\,dX.
\]
Applying the Cauchy-Schwarz inequality, we have
\[
\Big[\frac{d}{d\theta}E_\theta W(X)\Big]^2 \le \mathrm{Var}_\theta W(X)\cdot E_\theta\Big[\frac{\partial}{\partial\theta}\log f(X|\theta)\Big]^2
\quad\text{or}\quad
\mathrm{Var}_\theta W(X) \ge \frac{\big[\frac{d}{d\theta}E_\theta W(X)\big]^2}{E_\theta\big[\frac{\partial}{\partial\theta}\log f(X|\theta)\big]^2}.
\]

The LHS of condition (i) above is $\frac{d}{d\theta}\int W(\vec x)f(\vec x|\theta)\,d\vec x$, so by Leibniz's rule, this condition rules out cases where the support of $X$ depends on $\theta$.

The crucial step in the derivation of the CR bound is the interchange of differentiation and integration, which implies
\[
E_\theta\frac{\partial}{\partial\theta}\log f(\vec X|\theta) = \int \frac{1}{f(\vec x|\theta)}\frac{\partial f(\vec x|\theta)}{\partial\theta}f(\vec x|\theta)\,d\vec x = \frac{\partial}{\partial\theta}\int f(\vec x|\theta)\,d\vec x = \frac{\partial}{\partial\theta}1 = 0. \tag{1}
\]

¹ Cramér, Mathematical Methods of Statistics, p. 475ff.

(skip) The above derivation is noteworthy, because $\frac{\partial}{\partial\theta}\log f(\vec X|\theta) = 0$ is the FOC of the maximum likelihood estimation problem. In the i.i.d. case, this becomes the sample average
\[
\frac1n\sum_i\frac{\partial}{\partial\theta}\log f(x_i|\theta) = 0.
\]
And by the LLN:
\[
\frac1n\sum_i\frac{\partial}{\partial\theta}\log f(x_i|\theta) \xrightarrow{p} E_{\theta_0}\frac{\partial}{\partial\theta}\log f(x_i|\theta),
\]
where $\theta_0$ is the true value of $\theta$. This shows that maximum likelihood estimation of $\theta$ is equivalent to estimation based on the moment condition
\[
E_{\theta_0}\frac{\partial}{\partial\theta}\log f(x_i|\theta) = 0,
\]
which holds only at the true value $\theta = \theta_0$. (Thus MLE is consistent for the true value $\theta_0$, as we'll see later.) (However, note that Eq. (1) holds at all values of $\theta$, not just $\theta_0$.)

[Think about] What if the model is misspecified, in the sense that the true density of $\vec X$ is $g(\vec x)$, and for all $\theta\in\Theta$, $f(\vec x|\theta)\ne g(\vec x)$ (that is, there is no value of the parameter such that the postulated model $f$ coincides with the true model $g$)? Does Eq. (1) still hold? What is MLE looking for?

In the i.i.d. case, the CR lower bound can be simplified.

Corollary 7.3.10: if $X_1,\dots,X_n$ are i.i.d. with density $f(X|\theta)$, then
\[
V_\theta W(\vec X) \ge \frac{\big(\frac{d}{d\theta}E_\theta W(\vec X)\big)^2}{n\,E_\theta\big(\frac{\partial}{\partial\theta}\log f(X|\theta)\big)^2}. \tag{2}
\]

Up to this point, the Cramér-Rao results are not that operational for us to find a best estimator, because the estimator $W(\vec X)$ appears on both sides of the inequality. However, for an unbiased estimator, $E_\theta W(\vec X) = \theta$, so that $\frac{d}{d\theta}E_\theta W(\vec X) = 1$.

Example: $X_1,\dots,X_n$ i.i.d. $N(\mu,\sigma^2)$. What is the CRLB for an unbiased estimator of $\mu$? Unbiased $\Rightarrow$ numerator $= 1$.
\[
\log f(x|\mu) = -\log(\sqrt{2\pi}\,\sigma) - \frac12\Big(\frac{x-\mu}{\sigma}\Big)^2,\qquad
\frac{\partial}{\partial\mu}\log f(x|\mu) = \frac{x-\mu}{\sigma}\cdot\frac1\sigma = \frac{x-\mu}{\sigma^2},
\]
\[
E\Big(\frac{\partial}{\partial\mu}\log f(X|\mu)\Big)^2 = \frac{E(X-\mu)^2}{\sigma^4} = \frac{1}{\sigma^4}VX = \frac{1}{\sigma^2}.
\]
Hence the CRLB $= \frac{1}{n\cdot(1/\sigma^2)} = \frac{\sigma^2}{n}$. This is the variance of the sample mean, so that the sample mean is a UMVUE for $\mu$.

Sometimes we can simplify the denominator of the CRLB further.

Lemma 7.3.11 (Information inequality): if $f(X|\theta)$ satisfies
\[
(*)\qquad \frac{d}{d\theta}E_\theta\frac{\partial}{\partial\theta}\log f(X|\theta) = \int\frac{\partial}{\partial\theta}\Big[\frac{\partial}{\partial\theta}\log f(X|\theta)\,f(X|\theta)\Big]dx,
\]
then
\[
E_\theta\Big(\frac{\partial}{\partial\theta}\log f(X|\theta)\Big)^2 = -E_\theta\frac{\partial^2}{\partial\theta^2}\log f(X|\theta).
\]
Proof: LHS of (*): using Eq. (1) above, we get that the LHS of (*) $= 0$. RHS of (*):
\[
\int\frac{\partial}{\partial\theta}\Big[\frac{\partial\log f}{\partial\theta}f\Big]dx
= \int\frac{\partial^2\log f}{\partial\theta^2}f\,dx + \int\frac1f\Big(\frac{\partial f}{\partial\theta}\Big)^2 dx
= E_\theta\frac{\partial^2\log f}{\partial\theta^2} + E_\theta\Big(\frac{\partial\log f}{\partial\theta}\Big)^2.
\]
Putting the LHS and RHS together yields the desired result.

The LHS of condition (*) is just $\frac{d}{d\theta}\int\frac{\partial}{\partial\theta}\log f(x|\theta)\,f(x|\theta)\,dx$. As before, the crucial step is the interchange of differentiation and integration. (skip) Also, the information inequality depends crucially on the equality $E_\theta\frac{\partial}{\partial\theta}\log f(X|\theta) = 0$, which depends on the correct specification of the model.

Example: for the previous example, consider the CRLB for an unbiased estimator of $\sigma^2$. We can use the information inequality, because condition (*) is satisfied for the normal distribution. Hence:
\[
\frac{\partial}{\partial\sigma^2}\log f(x|\theta) = -\frac{1}{2\sigma^2} + \frac{(x-\mu)^2}{2\sigma^4},\qquad
\frac{\partial^2}{\partial(\sigma^2)^2}\log f(x|\theta) = \frac{1}{2\sigma^4} - \frac{(x-\mu)^2}{\sigma^6},
\]
\[
E\frac{\partial^2}{\partial(\sigma^2)^2}\log f(x|\theta) = \frac{1}{2\sigma^4} - \frac{1}{\sigma^4} = -\frac{1}{2\sigma^4}.
\]
Hence the CRLB is $\frac{2\sigma^4}{n}$.

Example: $X_1,\dots,X_n \sim U[0,\theta]$. Check the conditions for the CRLB for an unbiased estimator $W(\vec X)$ of $\theta$. Here $\frac{d}{d\theta}E_\theta W(\vec X) = 1$ (because it is unbiased), but since $f(\vec x|\theta) = \theta^{-n}$ on $[0,\theta]^n$, interchanging differentiation and integration gives
\[
\int W(\vec x)\frac{\partial}{\partial\theta}f(\vec x|\theta)\,d\vec x = -\frac{n}{\theta}\int W(\vec x)f(\vec x|\theta)\,d\vec x = -\frac{n}{\theta}E_\theta W(\vec X) = -n \ne \frac{d}{d\theta}E_\theta W(\vec X) = 1.
\]
Hence condition (i) of the theorem is not satisfied (the support of $X$ depends on $\theta$, so the boundary terms in Leibniz's rule cannot be ignored).

Loss function optimality

Let $\vec X \sim f(\vec X|\theta)$. Consider a loss function $L(\theta, W(\vec X))$, taking values in $[0,+\infty)$, which penalizes you when your estimator $W(\vec X)$ is far from the true parameter $\theta$. Note that $L(\theta, W(\vec X))$ is a random variable, since $\vec X$ (and hence $W(\vec X)$) is random. Consider estimators which minimize expected loss: that is,
\[
\min_{W(\cdot)} E_\theta L(\theta, W(\vec X)) \equiv \min_{W(\cdot)} R(\theta, W(\cdot)),
\]
where $R(\theta, W(\cdot))$ is the risk function. (Note: the risk function is not a random variable, because $\vec X$ has been integrated out.)

Loss function optimality is a more general criterion than minimum MSE. In fact, because $MSE(W(\vec X)) = E_\theta(W(\vec X)-\theta)^2$, the MSE is actually the risk function associated with the quadratic loss function $L(\theta, W(\vec X)) = (W(\vec X)-\theta)^2$. Other examples of loss functions:

- Absolute error loss: $|W(\vec X)-\theta|$
- Relative quadratic error loss: $\frac{(W(\vec X)-\theta)^2}{|\theta|+1}$

The exercise of minimizing risk takes a given value of $\theta$ as given, so that the minimized risk of an estimator depends on whichever value of $\theta$ you are considering. You might be interested in an estimator which does well regardless of which value of $\theta$ you are considering (analogous to the focus on uniform minimum variance). For this different problem, you want to consider a notion of risk which does not depend on $\theta$. Two possible criteria are:

- Average risk: $\min_{W(\cdot)}\int R(\theta, W(\cdot))\,h(\theta)\,d\theta$, where $h(\theta)$ is some weighting function across $\theta$. (In a Bayesian interpretation, $h(\theta)$ is a prior density over $\theta$.)
- Minmax criterion: $\min_{W(\cdot)}\max_\theta R(\theta, W(\cdot))$. Here you choose the estimator $W(\cdot)$ to minimize the maximum risk $\max_\theta R(\theta, W(\cdot))$, where $\theta$ is set at the worst value. So the minmax optimizer is the best that can be achieved in a worst-case scenario.
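The finite-sample comparison above lends itself to a quick numerical check. The sketch below (standard-library Python only; the function name `simulate` and all parameter values are illustrative choices, not from the notes) estimates the mean and variance of the two unbiased estimators $2\bar X_n$ and $\frac{n+1}{n}\max_i X_i$ for $U[0,\theta]$ data; the simulated variances should be close to $\theta^2/(3n)$ and $\theta^2/(n(n+2))$ respectively.

```python
import random


def simulate(theta=2.0, n=50, reps=20000, seed=0):
    """Monte Carlo comparison of two unbiased estimators of theta for
    X_i ~ U[0, theta]: theta_hat = 2*Xbar and theta_tilde = (n+1)/n * max(X_i).
    Returns (mean_hat, var_hat, mean_tilde, var_tilde)."""
    rng = random.Random(seed)
    hats, tildes = [], []
    for _ in range(reps):
        x = [rng.uniform(0.0, theta) for _ in range(n)]
        hats.append(2.0 * sum(x) / n)                 # method-of-moments estimator
        tildes.append((n + 1) / n * max(x))           # bias-corrected maximum
    def mean(v):
        return sum(v) / len(v)
    def var(v):
        m = mean(v)
        return sum((u - m) ** 2 for u in v) / len(v)
    return mean(hats), var(hats), mean(tildes), var(tildes)
```

With $\theta = 2$ and $n = 50$, the theoretical variances are $\theta^2/(3n) = 4/150 \approx 0.027$ and $\theta^2/(n(n+2)) = 4/2600 \approx 0.0015$, so the max-based estimator should show a much smaller simulated variance, matching the $O(1/n)$ versus $O(1/n^2)$ comparison above.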
2 LARGE-SAMPLE PROPERTIES OF ESTIMATORS

Large-sample properties: exploit the LLN and CLT.

Consider data $\{X_1, X_2, \dots\}$ from which we construct a sequence of estimators $\{W_n\} \equiv \{W(\vec X_1), W(\vec X_2), \dots\}$. $W_n$ is a random sequence.

Define: we say that $W_n$ is consistent for a parameter $\theta$ iff the random sequence $W_n$ converges (in some stochastic sense) to $\theta$. Strong consistency obtains when $W_n \xrightarrow{a.s.} \theta$. Weak consistency obtains when $W_n \xrightarrow{p} \theta$. For estimators like sample means, consistency (either weak or strong) follows easily using a LLN.

Define: an M-estimator is an estimator of $\theta$ which is a maximizer of an objective function $Q_n(\theta)$. Examples:

- MLE: $Q_n(\theta) = \frac1n\sum_{i=1}^n\log f(x_i|\theta)$
- Least squares: $Q_n(\theta) = -\sum_{i=1}^n[y_i - g(x_i;\theta)]^2$. OLS is the special case where $g(x_i;\theta) = \alpha + X_i'\beta$.
- GMM: $Q_n(\theta) = -G_n(\theta)'W_n(\theta)G_n(\theta)$, where
  \[
  G_n(\theta) = \Big[\frac1n\sum_{i=1}^n m_1(x_i;\theta),\ \frac1n\sum_{i=1}^n m_2(x_i;\theta),\ \dots,\ \frac1n\sum_{i=1}^n m_M(x_i;\theta)\Big]',
  \]
  an $M\times 1$ vector of sample moment conditions, and $W_n$ is an $M\times M$ weighting matrix. (The minus signs make least squares and GMM maximization problems.)

Notation: for each $\theta\in\Theta$, let $f(x_1,\dots,x_n,\dots;\theta)$ denote the joint density of the data for the given value of $\theta$. For $\theta_0\in\Theta$, we denote the limit objective function $Q_0(\theta) = \mathrm{plim}_{n\to\infty;\,f_{\theta_0}} Q_n(\theta)$ (at each $\theta$).

Consistency of M-estimators

Make the following assumptions:

1. For each $\theta_0\in\Theta$, the limiting objective function $Q_0(\theta)$ is uniquely maximized at $\theta_0$ (identification).
2. The parameter space $\Theta$ is a compact subset of $\mathbb{R}^K$.
3. $Q_0(\theta)$ is continuous in $\theta$.
4. $Q_n(\theta)$ converges uniformly in probability to $Q_0(\theta)$; that is, $\sup_{\theta\in\Theta}|Q_n(\theta)-Q_0(\theta)| \xrightarrow{p} 0$.

Theorem (Consistency of M-estimator): Under assumptions 1, 2, 3, 4, $\hat\theta_n \xrightarrow{p} \theta_0$.

Proof: We need to show: for any arbitrarily small neighborhood $N$ containing $\theta_0$, $P(\hat\theta_n\in N)\to 1$. For $n$ large enough, the uniform convergence condition implies that, for all $\epsilon,\delta>0$,
\[
P\Big(\sup_{\theta\in\Theta}|Q_n(\theta)-Q_0(\theta)| < \epsilon/2\Big) > 1-\delta.
\]
The event $\sup_{\theta\in\Theta}|Q_n(\theta)-Q_0(\theta)| < \epsilon/2$ implies $Q_n(\hat\theta_n)-Q_0(\hat\theta_n) < \epsilon/2$, i.e.
\[
Q_0(\hat\theta_n) > Q_n(\hat\theta_n) - \epsilon/2. \tag{3}
\]
Similarly,
\[
Q_n(\theta_0)-Q_0(\theta_0) > -\epsilon/2 \ \Rightarrow\ Q_n(\theta_0) > Q_0(\theta_0) - \epsilon/2. \tag{4}
\]
Since $\hat\theta_n = \arg\max_\theta Q_n(\theta)$, so that $Q_n(\hat\theta_n) \ge Q_n(\theta_0)$, Eq. (3) implies
\[
Q_0(\hat\theta_n) > Q_n(\theta_0) - \epsilon/2. \tag{5}
\]
Hence, combining Eqs. (4) and (5), we have
\[
Q_0(\hat\theta_n) > Q_0(\theta_0) - \epsilon. \tag{6}
\]
So we have shown that $\sup_{\theta\in\Theta}|Q_n(\theta)-Q_0(\theta)| < \epsilon/2 \Longrightarrow Q_0(\hat\theta_n) > Q_0(\theta_0)-\epsilon$, so
\[
P\big(Q_0(\hat\theta_n) > Q_0(\theta_0)-\epsilon\big) \ge P\Big(\sup_{\theta\in\Theta}|Q_n(\theta)-Q_0(\theta)| < \epsilon/2\Big) \to 1.
\]
Now define $N$ as any open neighborhood in $\mathbb{R}^K$ which contains $\theta_0$, and let $N^c$ be the complement of $N$ in $\mathbb{R}^K$. Then $\Theta\cap N^c$ is compact, so that $\max_{\theta\in\Theta\cap N^c}Q_0(\theta)$ exists. Set $\epsilon = Q_0(\theta_0) - \max_{\theta\in\Theta\cap N^c}Q_0(\theta)$. Then
\[
Q_0(\hat\theta_n) > Q_0(\theta_0)-\epsilon \ \Rightarrow\ Q_0(\hat\theta_n) > \max_{\theta\in\Theta\cap N^c}Q_0(\theta) \ \Rightarrow\ \hat\theta_n\in N,
\]
so
\[
P(\hat\theta_n\in N) \ge P\big(Q_0(\hat\theta_n) > Q_0(\theta_0)-\epsilon\big) \to 1.
\]
Since the argument above holds for any arbitrarily small neighborhood of $\theta_0$, we are done.

In general, the limit objective function $Q_0(\theta) = \mathrm{plim}_{n\to\infty}Q_n(\theta)$ may not be that straightforward to determine. But in many cases, $Q_n(\theta)$ is a sample average of some sort: $Q_n(\theta) = \frac1n\sum_i q(x_i|\theta)$ (e.g. least squares, MLE). Then by a law of large numbers, we conclude that (for all $\theta$)
\[
Q_0(\theta) = \mathrm{plim}\ \frac1n\sum_i q(x_i|\theta) = E_{x_i} q(x_i|\theta),
\]
where $E_{x_i}$ denotes expectation with respect to the true (but unobserved) distribution of $x_i$.

(skip) Most of the time, $\theta_0$ can be interpreted as a true value. But if the model is misspecified, then this interpretation doesn't hold (indeed, under misspecification it is not even clear what the true value is). So a more cautious way to interpret the consistency result is that
\[
\hat\theta_n \xrightarrow{p} \arg\max_\theta Q_0(\theta),
\]
which holds (given the conditions) no matter whether the model is correctly specified.

** Let's unpack the uniform convergence condition. Sufficient conditions for it are:

1. Pointwise convergence: for each $\theta\in\Theta$, $Q_n(\theta)-Q_0(\theta) = o_p(1)$.
2. $Q_n(\theta)$ is stochastically equicontinuous: for every $\epsilon>0$, $\eta>0$ there exist a sequence of random variables $\Delta_n(\epsilon,\eta)$ and $\bar n(\epsilon,\eta)$ such that for all $n\ge\bar n$, $P(|\Delta_n|>\epsilon)<\eta$, and for each $\theta$ there is an open set $N$ containing $\theta$ with
\[
\sup_{\tilde\theta\in N}|Q_n(\tilde\theta)-Q_n(\theta)| \le \Delta_n,\quad \forall n\ge\bar n.
\]
Note that both $\Delta_n$ and $\bar n$ do not depend on $\theta$: it is a uniform result. This is an "in probability" version of the deterministic notion of uniform equicontinuity: we say a sequence of deterministic functions $R_n(\theta)$ is uniformly equicontinuous if, for every $\epsilon>0$, there exist $\delta(\epsilon)$ and $\bar n(\epsilon)$ such that for all $\theta$,
\[
\sup_{\tilde\theta:\,\|\tilde\theta-\theta\|<\delta}|R_n(\tilde\theta)-R_n(\theta)| < \epsilon,\quad \forall n\ge\bar n.
\]
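The consistency theorem can be illustrated numerically. The sketch below is a hypothetical example, not from the notes: it takes $Q_n(\theta) = \frac1n\sum_i\log f(x_i|\theta)$ with $f(x|\theta) = \theta e^{-\theta x}$ (the Exponential($\theta$) density) and maximizes over a finite grid, the grid standing in for the compact parameter space $\Theta$. As $n$ grows, the maximizer should settle near the true $\theta_0$.

```python
import math
import random


def mle_grid(xs, grid):
    """M-estimator: maximize Q_n(theta) = (1/n) sum_i log f(x_i|theta)
    over a finite grid, with f(x|theta) = theta * exp(-theta * x)."""
    def Q_n(theta):
        return sum(math.log(theta) - theta * x for x in xs) / len(xs)
    return max(grid, key=Q_n)


rng = random.Random(1)
theta0 = 1.5                               # true parameter value
grid = [0.05 * k for k in range(1, 101)]   # Theta = {0.05, 0.10, ..., 5.00}
estimates = {}
for n in (50, 5000):
    xs = [rng.expovariate(theta0) for _ in range(n)]
    estimates[n] = mle_grid(xs, grid)      # argmax over the grid
```

Grid search is used purely to mirror the compactness assumption; in practice one would solve the FOC of this problem, which gives $\hat\theta_n = 1/\bar X_n$.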
To understand this more intuitively, consider what we need for consistency. By continuity of $Q_0$, we know that $Q_0(\theta)$ is close to $Q_0(\theta_0)$ for $\theta\in N(\theta_0)$. By pointwise convergence, we have $Q_n(\theta)$ converging to $Q_0(\theta)$ for all $\theta$. However, what we need is that even if $Q_n(\theta)$ is not optimized at $\theta_0$, the optimizer $\hat\theta_n = \arg\max_\theta Q_n(\theta)$ should not be far from $\theta_0$. Pointwise convergence does not guarantee this.

For the last part, we need $Q_n(\theta)$ to be equally close to $Q_0(\theta)$ for all $\theta$, because then the optimizers of $Q_n$ and $Q_0$ cannot be too far apart. However, pointwise convergence is not enough to ensure this "equal closeness": at any given $n$, $Q_n(\theta_0)$ being close to $Q_0(\theta_0)$ does not imply the same at other points. Uniform convergence ensures, roughly speaking, that at any given $n$, $Q_n$ and $Q_0$ are equally close at all points $\theta$. This was exploited in the proof.

Asymptotic normality for M-estimators

Define the score vector
\[
\nabla_\theta Q_n(\tilde\theta) = \Big(\frac{\partial Q_n(\theta)}{\partial\theta_1}\Big|_{\theta=\tilde\theta},\ \dots,\ \frac{\partial Q_n(\theta)}{\partial\theta_K}\Big|_{\theta=\tilde\theta}\Big)'.
\]
Similarly, define the $K\times K$ Hessian matrix
\[
\big[\nabla_{\theta\theta}Q_n(\tilde\theta)\big]_{i,j} = \frac{\partial^2 Q_n(\theta)}{\partial\theta_i\,\partial\theta_j}\Big|_{\theta=\tilde\theta},\qquad 1\le i,j\le K.
\]
Note that the Hessian is symmetric.

Make the following assumptions:

1. $\hat\theta_n = \arg\max_\theta Q_n(\theta) \xrightarrow{p} \theta_0$
2. $\theta_0\in\mathrm{interior}(\Theta)$
3. $Q_n(\theta)$ is twice continuously differentiable in a neighborhood $N$ of $\theta_0$
4. $\sqrt n\,\nabla_\theta Q_n(\theta_0) \xrightarrow{d} N(0,\Sigma)$
5. Uniform convergence of the Hessian: there exists a matrix $H(\theta)$ which is continuous at $\theta_0$ and $\sup_{\theta\in N}\|\nabla_{\theta\theta}Q_n(\theta) - H(\theta)\| \xrightarrow{p} 0$
6. $H(\theta_0)$ is nonsingular.

Theorem (Asymptotic normality for M-estimator): Under assumptions 1-6,
\[
\sqrt n(\hat\theta_n-\theta_0) \xrightarrow{d} N\big(0,\ H_0^{-1}\Sigma H_0^{-1}\big),\qquad H_0\equiv H(\theta_0).
\]
Proof (sketch): By assumptions 1, 2, 3, $\nabla_\theta Q_n(\hat\theta_n) = 0$ (this is the FOC of the maximization problem). Then using the mean value theorem (with $\bar\theta_n$ denoting the mean value):
\[
0 = \nabla_\theta Q_n(\hat\theta_n) = \nabla_\theta Q_n(\theta_0) + \nabla_{\theta\theta}Q_n(\bar\theta_n)(\hat\theta_n-\theta_0)
\]
\[
\Rightarrow\ -\underbrace{\nabla_{\theta\theta}Q_n(\bar\theta_n)}_{\xrightarrow{p}\,H_0\ \text{(using A5)}}\,\sqrt n(\hat\theta_n-\theta_0) = \underbrace{\sqrt n\,\nabla_\theta Q_n(\theta_0)}_{\xrightarrow{d}\,N(0,\Sigma)\ \text{(using A4)}},
\]
so
\[
\sqrt n(\hat\theta_n-\theta_0) \xrightarrow{d} -H(\theta_0)^{-1}N(0,\Sigma) = N\big(0,\ H_0^{-1}\Sigma H_0^{-1}\big).
\]
Note: A5 is a uniform convergence assumption on the sample Hessian. Given the previous discussion, it ensures that the sample Hessian $\nabla_{\theta\theta}Q_n(\theta)$ evaluated at $\bar\theta_n$ (which is close to $\theta_0$) does not vary far from the limit Hessian $H(\theta)$ at $\theta_0$; this is implied by a type of continuity of the sample Hessian close to $\theta_0$.

2.1 Maximum likelihood estimation

The consistency of MLE can follow by application of the theorem above for consistency of M-estimators. Essentially, as we noted above, what the consistency theorem showed was that, for any M-estimator sequence $\hat\theta_n$:
\[
\mathrm{plim}_{n\to\infty}\hat\theta_n = \arg\max_\theta Q_0(\theta).
\]
For MLE, there is a distinct and earlier argument due to Wald (1949), who shows that, in the i.i.d. case, the limiting likelihood function (corresponding to $Q_0(\theta)$) is indeed globally maximized at $\theta_0$, the true value. Thus, we can directly confirm the identification assumption of the M-estimator consistency theorem. This argument is of interest by itself.

Argument (summary of Amemiya, pp. 141-142): Define $\hat\theta_n^{MLE} \equiv \arg\max_\theta\frac1n\sum_i\log f(x_i|\theta)$. Let $\theta_0$ denote the true value.

By the LLN: $\frac1n\sum_i\log f(x_i|\theta) \xrightarrow{p} E_{\theta_0}\log f(x_i|\theta)$, for all $\theta$ (not necessarily the true $\theta_0$).

By Jensen's inequality:
\[
E_{\theta_0}\log\frac{f(x|\theta)}{f(x|\theta_0)} \le \log E_{\theta_0}\frac{f(x|\theta)}{f(x|\theta_0)}.
\]
But $E_{\theta_0}\frac{f(x|\theta)}{f(x|\theta_0)} = \int\frac{f(x|\theta)}{f(x|\theta_0)}f(x|\theta_0)\,dx = 1$, since $f(x|\theta)$ is a density function, for all $\theta$.² Hence:
\[
E_{\theta_0}\log\frac{f(x|\theta)}{f(x|\theta_0)} \le 0,\ \forall\theta
\ \Longrightarrow\ E_{\theta_0}\log f(x|\theta) \le E_{\theta_0}\log f(x|\theta_0),\ \forall\theta
\ \Longrightarrow\ E_{\theta_0}\log f(x|\theta)\ \text{is maximized at the true}\ \theta_0.
\]
This is the identification assumption from the M-estimator consistency theorem.

(skip) Analogously, we also know that, for $\epsilon>0$,
\[
\epsilon_1 = -E_{\theta_0}\log\frac{f(x;\theta_0-\epsilon)}{f(x;\theta_0)} > 0,\qquad
\epsilon_2 = -E_{\theta_0}\log\frac{f(x;\theta_0+\epsilon)}{f(x;\theta_0)} > 0.
\]
By the SLLN, we know that
\[
\frac1n\sum_i\log\frac{f(x_i;\theta_0-\epsilon)}{f(x_i;\theta_0)} = \frac1n\big[\log L_n(\vec x;\theta_0-\epsilon) - \log L_n(\vec x;\theta_0)\big] \xrightarrow{a.s.} -\epsilon_1 < 0,
\]
so that, with probability 1, $\log L_n(\vec x;\theta_0-\epsilon) < \log L_n(\vec x;\theta_0)$ for $n$ large enough. Similarly, for $n$ large enough, $\log L_n(\vec x;\theta_0+\epsilon) < \log L_n(\vec x;\theta_0)$ with probability 1. Hence, for large $n$,
\[
\hat\theta_n \equiv \arg\max_\theta\log L_n(\vec x;\theta) \in (\theta_0-\epsilon,\ \theta_0+\epsilon).
\]
That is, the MLE $\hat\theta_n$ is strongly consistent for $\theta_0$. Note that this argument requires weaker assumptions than the M-estimator consistency theorem above.
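Wald's identification step can likewise be checked numerically. The sketch below is again a hypothetical Exponential($\theta$) example, not from the notes: it approximates the limit objective $E_{\theta_0}\log f(X|\theta)$ by a large-sample average (justified by the LLN) and verifies that its maximizer over a grid sits at $\theta_0$, as the Jensen argument predicts.

```python
import math
import random

# Approximate Q_0(theta) = E_{theta0} log f(X|theta) for
# f(x|theta) = theta * exp(-theta * x), with theta0 = 2.0,
# by a large-sample average of log f(x_i|theta).
rng = random.Random(42)
theta0 = 2.0
xs = [rng.expovariate(theta0) for _ in range(50_000)]
xbar = sum(xs) / len(xs)


def Q0_hat(theta):
    # (1/n) sum_i [log(theta) - theta * x_i] = log(theta) - theta * xbar
    return math.log(theta) - theta * xbar


grid = [0.1 * k for k in range(5, 51)]   # theta in {0.5, 0.6, ..., 5.0}
best = max(grid, key=Q0_hat)             # should land near theta0 = 2.0
```

Analytically, $E_{\theta_0}\log f(X|\theta) = \log\theta - \theta/\theta_0$, whose derivative $1/\theta - 1/\theta_0$ vanishes exactly at $\theta = \theta_0$, in line with the Jensen argument above.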
² In this step, note the importance of assumption A3 in CB, pg. 516. If $x$ has support depending on $\theta$, then $f(x|\theta)$ will not integrate to 1 for all $\theta$.

Now we introduce another idea, efficiency, which is a large-sample analogue of the minimum variance concept.

For the sequence of estimators $W_n$, suppose that $k(n)(W_n-\theta) \xrightarrow{d} N(0,\sigma^2)$, where $k(n)$ is a polynomial in $n$. Then $\sigma^2$ is denoted the asymptotic variance of $W_n$. In usual cases, $k(n) = \sqrt n$. For example, by the CLT, we know that $\sqrt n(\bar X_n-\mu) \xrightarrow{d} N(0,\sigma^2)$; hence, $\sigma^2$ is the asymptotic variance of the sample mean $\bar X_n$.

Definition 10.1.11: An estimator sequence $W_n$ is asymptotically efficient for $\theta$ if $\sqrt n(W_n-\theta) \xrightarrow{d} N(0,v(\theta))$, where the asymptotic variance is
\[
v(\theta) = \frac{1}{E_{\theta_0}\big(\frac{\partial}{\partial\theta}\log f(X|\theta)\big)^2}.
\]
By comparison with Eq. (2), note that this asymptotic variance is equivalent to the CRLB for one observation ($n=1$). A full discussion and justification of efficiency is deep and beyond this course. But recall that $N(0,\,1/I(\theta))$, where $I(\theta) \equiv E_{\theta_0}\big(\frac{\partial}{\partial\theta}\log f(X|\theta)\big)^2$ denotes the Fisher information, is the distribution of the sample mean estimator for the mean parameter of a normal distribution using only one observation. So essentially, asymptotically efficient estimators are asymptotically equivalent to such an estimation problem.

By the asymptotic normality result for M-estimators, we know what the asymptotic distribution for the MLE should be. However, it turns out that, given the information inequality, the MLE's asymptotic distribution can be further simplified.

Theorem 10.1.12: Asymptotic efficiency of MLE.

Proof (following Amemiya, pp. 143-144): $\hat\theta_n^{MLE}$ satisfies the FOC of the MLE problem:
\[
0 = \frac{\partial\log L(\theta|\vec X_n)}{\partial\theta}\Big|_{\theta=\hat\theta_n^{MLE}}.
\]
Using the mean value theorem (with $\bar\theta_n$ denoting the mean value):
\[
0 = \frac{\partial\log L(\theta|\vec X_n)}{\partial\theta}\Big|_{\theta=\theta_0} + \frac{\partial^2\log L(\theta|\vec X_n)}{\partial\theta^2}\Big|_{\theta=\bar\theta_n}\big(\hat\theta_n^{MLE}-\theta_0\big)
\]
\[
\Longrightarrow\ \sqrt n\big(\hat\theta_n-\theta_0\big)
= -\frac{\sqrt n\,\frac{\partial\log L(\theta|\vec X_n)}{\partial\theta}\big|_{\theta_0}}{\frac{\partial^2\log L(\theta|\vec X_n)}{\partial\theta^2}\big|_{\bar\theta_n}}
= -\frac{\sqrt n\,\frac1n\sum_i\frac{\partial\log f(x_i|\theta)}{\partial\theta}\big|_{\theta_0}}{\frac1n\sum_i\frac{\partial^2\log f(x_i|\theta)}{\partial\theta^2}\big|_{\bar\theta_n}}. \tag{**}
\]
Note that, by the LLN,
\[
\frac1n\sum_i\frac{\partial\log f(x_i|\theta)}{\partial\theta}\Big|_{\theta_0} \xrightarrow{p} E_{\theta_0}\frac{\partial\log f(X|\theta)}{\partial\theta}\Big|_{\theta_0} = \int\frac{\partial f(x|\theta)}{\partial\theta}\Big|_{\theta_0}dx.
\]
Using the same argument as in the information inequality result above, the last term is
\[
\int\frac{\partial f}{\partial\theta}dx = \frac{\partial}{\partial\theta}\int f\,dx = 0.
\]
Hence, the CLT can be applied to the numerator of (**):
\[
\text{numerator of (**)} \xrightarrow{d} N\bigg(0,\ E_{\theta_0}\Big(\frac{\partial\log f(x_i|\theta)}{\partial\theta}\Big|_{\theta_0}\Big)^2\bigg).
\]
By the LLN, and uniform convergence of the Hessian term:
\[
\text{denominator of (**)} \xrightarrow{p} E_{\theta_0}\frac{\partial^2\log f(X|\theta)}{\partial\theta^2}\Big|_{\theta_0}.
\]
Hence, by the Slutsky theorem:
\[
\sqrt n\big(\hat\theta_n-\theta_0\big) \xrightarrow{d} N\Bigg(0,\ \frac{E_{\theta_0}\big(\frac{\partial\log f(x_i|\theta)}{\partial\theta}\big|_{\theta_0}\big)^2}{\Big[E_{\theta_0}\frac{\partial^2\log f(X|\theta)}{\partial\theta^2}\big|_{\theta_0}\Big]^2}\Bigg).
\]
By the information inequality:
\[
E_{\theta_0}\Big(\frac{\partial\log f(x_i|\theta)}{\partial\theta}\Big|_{\theta_0}\Big)^2 = -E_{\theta_0}\frac{\partial^2\log f(X|\theta)}{\partial\theta^2}\Big|_{\theta_0},
\]
so that
\[
\sqrt n\big(\hat\theta_n-\theta_0\big) \xrightarrow{d} N\Bigg(0,\ \frac{1}{E_{\theta_0}\big(\frac{\partial\log f(x_i|\theta)}{\partial\theta}\big|_{\theta_0}\big)^2}\Bigg),
\]
so that the asymptotic variance is the CRLB. Hence, the asymptotic approximation for the finite-sample distribution is
\[
\hat\theta_n^{MLE} \overset{a}{\sim} N\Bigg(\theta_0,\ \frac1n\cdot\frac{1}{E_{\theta_0}\big(\frac{\partial\log f(x_i|\theta)}{\partial\theta}\big|_{\theta_0}\big)^2}\Bigg).
\]
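As a closing check on the asymptotic efficiency result, here is a Monte Carlo sketch for a hypothetical Exponential($\theta$) model (not an example from the notes). There, $\log f(x|\theta) = \log\theta - \theta x$, so $I(\theta_0) = E_{\theta_0}(1/\theta_0 - X)^2 = \mathrm{Var}(X) = 1/\theta_0^2$, and the MLE is $\hat\theta_n = 1/\bar X_n$; the theorem then predicts $\sqrt n(\hat\theta_n - \theta_0) \approx N(0, \theta_0^2)$, i.e. asymptotic variance equal to the CRLB $\theta_0^2 = 4$ when $\theta_0 = 2$.

```python
import math
import random

# Monte Carlo check of MLE asymptotic efficiency for Exponential(theta):
# sqrt(n) * (theta_hat - theta0) should be approximately N(0, theta0^2),
# since the Fisher information is I(theta0) = 1/theta0^2.
rng = random.Random(7)
theta0, n, reps = 2.0, 400, 5000
zs = []
for _ in range(reps):
    xbar = sum(rng.expovariate(theta0) for _ in range(n)) / n
    theta_hat = 1.0 / xbar                      # MLE of the exponential rate
    zs.append(math.sqrt(n) * (theta_hat - theta0))
mean_z = sum(zs) / reps
var_z = sum((z - mean_z) ** 2 for z in zs) / reps   # compare with theta0^2 = 4
```

The simulated variance of $\sqrt n(\hat\theta_n-\theta_0)$ should be close to $\theta_0^2 = 4$ (slightly above it in finite samples, since $1/\bar X_n$ is biased upward for finite $n$), illustrating that the MLE attains the CRLB asymptotically.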