regress postestimation — Postestimation tools for regress

Uploaded On 2015-10-08

The following standard postestimation commands are also available:

    Command            Description
    contrast           contrasts and ANOVA-style joint tests of estimates
    estat ic           Akaike's and Schwarz's Bayesian information criteria (AIC and BIC)
    estat summarize    summary statistics for the estimation sample
    estat vce          variance–covariance matrix of the estimators (VCE)
    estat (svy)        postestimation statistics for survey data
    estimates          cataloging estimation results
    forecast (1)       dynamic forecasts and simulations
    hausman            Hausman's specification test
    lincom             point estimates, standard errors, testing, and inference for linear combinations of coefficients
    linktest           link test for model specification
    lrtest (2)         likelihood-ratio test
    margins            marginal means, predictive margins, marginal effects, and average marginal effects
    marginsplot        graph the results from margins (profile plots, interaction plots, etc.)
    nlcom              point estimates, standard errors, testing, and inference for nonlinear combinations of coefficients
    predict            predictions, residuals, influence statistics, and other diagnostic measures
    predictnl          point estimates, standard errors, testing, and inference for generalized predictions
    pwcompare          pairwise comparisons of estimates
    suest              seemingly unrelated estimation
    test               Wald tests of simple and composite linear hypotheses
    testnl             Wald tests of nonlinear hypotheses

    (1) forecast is not appropriate with mi or svy estimation results.
    (2) lrtest is not appropriate with svy estimation results.
rstandard calculates the standardized residuals.

rstudent calculates the Studentized (jackknifed) residuals.

cooksd calculates the Cook's D influence statistic (Cook 1977).

leverage or hat calculates the diagonal elements of the projection ("hat") matrix.

pr(a,b) calculates $\Pr(a < x_j b + u_j < b)$, the probability that $y_j \mid x_j$ would be observed in the interval $(a, b)$.

    a and b may be specified as numbers or variable names; lb and ub are variable names;
    pr(20,30) calculates $\Pr(20 < x_j b + u_j < 30)$;
    pr(lb,ub) calculates $\Pr(lb < x_j b + u_j < ub)$; and
    pr(20,ub) calculates $\Pr(20 < x_j b + u_j < ub)$.

    a missing ($a \ge .$) means $-\infty$; pr(.,30) calculates $\Pr(-\infty < x_j b + u_j < 30)$;
    pr(lb,30) calculates $\Pr(-\infty < x_j b + u_j < 30)$ in observations for which $lb \ge .$
    and calculates $\Pr(lb < x_j b + u_j < 30)$ elsewhere.

    b missing ($b \ge .$) means $+\infty$; pr(20,.) calculates $\Pr(+\infty > x_j b + u_j > 20)$;
    pr(20,ub) calculates $\Pr(+\infty > x_j b + u_j > 20)$ in observations for which $ub \ge .$
    and calculates $\Pr(20 < x_j b + u_j < ub)$ elsewhere.

e(a,b) calculates $E(x_j b + u_j \mid a < x_j b + u_j < b)$, the expected value of $y_j \mid x_j$ conditional on $y_j \mid x_j$ being in the interval $(a, b)$, meaning that $y_j \mid x_j$ is truncated. a and b are specified as they are for pr().

ystar(a,b) calculates $E(y_j^{*})$, where $y_j^{*} = a$ if $x_j b + u_j \le a$, $y_j^{*} = b$ if $x_j b + u_j \ge b$, and $y_j^{*} = x_j b + u_j$ otherwise, meaning that $y_j^{*}$ is censored. a and b are specified as they are for pr().

dfbeta(varname) calculates the DFBETA for varname, the difference between the regression coefficient when the jth observation is included and excluded, said difference being scaled by the estimated standard error of the coefficient. varname must have been included among the regressors in the previously fitted model. The calculation is automatically restricted to the estimation subsample.

stdp calculates the standard error of the prediction, which can be thought of as the standard error of the predicted expected value or mean for the observation's covariate pattern. The standard error of the prediction is also referred to as the standard error of the fitted value.

stdf calculates the standard error of the forecast, which is the standard error of the point prediction for 1 observation. It is commonly referred to as the standard error of the future or forecast value. By construction, the standard errors produced by stdf are always larger than those produced by stdp; see Methods and formulas.

stdr calculates the standard error of the residuals.

covratio calculates COVRATIO (Belsley, Kuh, and Welsch 1980), a measure of the influence of the jth observation based on considering the effect on the variance–covariance matrix of the estimates. The calculation is automatically restricted to the estimation subsample.

dfits calculates DFITS (Welsch and Kuh 1977) and attempts to summarize the information in the leverage versus residual-squared plot into one statistic. The calculation is automatically restricted to the estimation subsample.

welsch calculates Welsch distance (Welsch 1982) and is a variation on dfits. The calculation is automatically restricted to the estimation subsample.

Now all this is a most unsatisfactory state of affairs. Points with large residuals may, but need not, have a large effect on our results, and points with small residuals may still have a large effect. Points with high leverage may, but need not, have a large effect on our results, and points with low leverage may still have a large effect. Can you not identify the influential points and simply have the computer list them for you? You can, but you will have to define what you mean by "influential".

"Influential" is defined with respect to some statistic. For instance, you might ask which points in your data have a large effect on your estimated a, which points have a large effect on your estimated b, which points have a large effect on your estimated standard error of b, and so on, but do not be surprised when the answers to these questions are different. In any case, obtaining such measures is not difficult: all you have to do is fit the regression excluding each observation one at a time and record the statistic of interest, which, in the day of the modern computer, is not too onerous. Moreover, you can save considerable computer time by doing algebra ahead of time and working out formulas that will calculate the same answers as if you ran each of the regressions. (Ignore the question of pairs of observations that, together, exert undue influence, and triples, and so on, which remains largely unsolved and for which the brute force fit-every-possible-regression procedure is not a viable alternative.)

Fitted values and residuals

Typing predict newvar with no options creates newvar containing the fitted values. Typing predict newvar, resid creates newvar containing the residuals.

Example 1

Continuing with example 1 from [R] regress, we wish to fit the following model:

    $mpg = \beta_0 + \beta_1\,weight + \beta_2\,foreign + \epsilon$

    . use http://www.stata-press.com/data/r13/auto
    (1978 Automobile Data)
    . regress mpg weight foreign

          Source |       SS       df       MS              Number of obs =      74
    -------------+------------------------------           F(  2,    71) =   69.75
           Model |   1619.2877     2  809.643849           Prob > F      =  0.0000
        Residual |  824.171761    71   11.608053           R-squared     =  0.6627
    -------------+------------------------------           Adj R-squared =  0.6532
           Total |  2443.45946    73  33.4720474           Root MSE      =  3.4071

    ------------------------------------------------------------------------------
             mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          weight |  -.0065879   .0006371   -10.34   0.000    -.0078583   -.0053175
         foreign |  -1.650029   1.075994    -1.53   0.130      -3.7955    .4954422
           _cons |    41.6797   2.165547    19.25   0.000     37.36172    45.99768
    ------------------------------------------------------------------------------

That done, we can now obtain the predicted values from the regression. We will store them in a new variable called pmpg by typing predict pmpg. Because predict produces no output, we will follow that by summarizing our predicted and observed values.

Standardized and Studentized residuals

The terms standardized and Studentized residuals have meant different things to different authors. In Stata, predict defines the standardized residual as $\hat e_i = e_i / (s\sqrt{1 - h_i})$ and the Studentized residual as $r_i = e_i / (s_{(i)}\sqrt{1 - h_i})$, where $s_{(i)}$ is the root mean squared error of a regression with the ith observation removed. Stata's definition of the Studentized residual is the same as the one given in Bollen and Jackman (1990, 264) and is what Chatterjee and Hadi (1988, 74) call the "externally Studentized" residual. Stata's "standardized" residual is the same as what Chatterjee and Hadi (1988, 74) call the "internally Studentized" residual.

Standardized and Studentized residuals are attempts to adjust residuals for their standard errors. Although the $\epsilon_i$ theoretical residuals are homoskedastic by assumption (that is, they all have the same variance), the calculated $e_i$ are not. In fact,

    $\mathrm{Var}(e_i) = \sigma^2 (1 - h_i)$

where $h_i$ are the leverage measures obtained from the diagonal elements of the hat matrix. Thus observations with the greatest leverage have corresponding residuals with the smallest variance.

Standardized residuals use the root mean squared error of the regression for $\sigma$. Studentized residuals use the root mean squared error of a regression omitting the observation in question for $\sigma$. In general, Studentized residuals are preferable to standardized residuals for purposes of outlier identification. Studentized residuals can be interpreted as the t statistic for testing the significance of a dummy variable equal to 1 in the observation in question and 0 elsewhere (Belsley, Kuh, and Welsch 1980). Such a dummy variable would effectively absorb the observation and so remove its influence in determining the other coefficients in the model. Caution must be exercised here, however, because of the simultaneous testing problem. You cannot simply list the residuals that would be individually significant at the 5% level: their joint significance would be far less (their joint significance level would be far greater).
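The two definitions can be checked numerically outside Stata. The following is a minimal sketch in Python for a one-regressor model, not Stata's implementation: the data and the helper names fit_line, root_mse, and residual_diagnostics are made up for illustration, and the leverage uses the standard simple-regression form $h_i = 1/n + (x_i - \bar x)^2 / S_{xx}$. The Studentized residual is obtained by brute force, refitting the line without observation i to get $s_{(i)}$.

```python
import math

def fit_line(x, y):
    """OLS for y = a + b*x with one regressor; returns (a, b, sxx, xbar)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b * xbar, b, sxx, xbar

def root_mse(x, y, k=2):
    """Root mean squared error s, using n - k residual degrees of freedom."""
    a, b, _, _ = fit_line(x, y)
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - k))

def residual_diagnostics(x, y):
    """Residual, leverage, standardized and Studentized residual per observation."""
    n = len(x)
    a, b, sxx, xbar = fit_line(x, y)
    s = root_mse(x, y)
    rows = []
    for i in range(n):
        e = y[i] - a - b * x[i]                      # residual from the full fit
        h = 1.0 / n + (x[i] - xbar) ** 2 / sxx       # leverage (hat diagonal)
        # s_(i): root MSE of the regression refit without observation i
        s_i = root_mse(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        rows.append({
            "e": e,
            "h": h,
            "rstandard": e / (s * math.sqrt(1.0 - h)),
            "rstudent": e / (s_i * math.sqrt(1.0 - h)),
        })
    return s, rows

# Hypothetical data; the last point has high leverage.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 10.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 15.0]
s_demo, rows_demo = residual_diagnostics(x_demo, y_demo)
for i, r in enumerate(rows_demo, start=1):
    print(i, round(r["h"], 3), round(r["rstandard"], 3), round(r["rstudent"], 3))
```

Refitting once per observation is the brute-force route described earlier; the algebraic shortcut $s_{(i)}^2 = \{(n-k)s^2 - e_i^2/(1-h_i)\}/(n-k-1)$ gives the same $s_{(i)}$ without any refitting.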
Example 5: standardized and Studentized residuals

In the Terminology section of Remarks and examples for predict, we distinguished residuals from leverage and speculated on the impact of an observation with a small residual but large leverage. If we adjust the residuals for their standard errors, however, the adjusted residual would be (relatively) larger and perhaps large enough so that we could simply examine the adjusted residuals. Taking our price on weight and foreign##c.mpg model from example 1 of [R] regress postestimation diagnostic plots, we can obtain the in-sample standardized and Studentized residuals by typing

    . use http://www.stata-press.com/data/r13/auto, clear
    (1978 Automobile Data)
    . regress price weight foreign##c.mpg
     (output omitted)
    . predict esta if e(sample), rstandard
    . predict estu if e(sample), rstudent

Example 6: DFITS influence measure

Continuing with our model of price on weight and foreign##c.mpg, we can obtain the DFITS influence measure:

    . predict e if e(sample), resid
    . predict dfits, dfits

We did not specify if e(sample) in computing the DFITS statistic. DFITS is available only over the estimation sample, so specifying if e(sample) would have been redundant. It would have done no harm, but it would not have changed the results.

Our model has k = 5 independent variables (k includes the constant) and n = 74 observations; following the $2\sqrt{k/n}$ cutoff advice, we type

    . list make price e dfits if abs(dfits) > 2*sqrt(5/74), divider

         | make                 price           e       dfits |
     12. | Cad. Eldorado       14,500     7271.96    .9564455 |
     13. | Cad. Seville        15,906    5036.348    1.356619 |
     24. | Ford Fiesta          4,389    3164.872    .5724172 |
     27. | Linc. Mark V        13,594    3109.193    .5200413 |
     28. | Linc. Versailles    13,466    6560.912    .8760136 |
     42. | Plym. Arrow          4,647   -3312.968   -.9384231 |

We calculate Cook's distance and list the observations greater than the suggested 4/n cutoff:

    . predict cooksd if e(sample), cooksd
    . list make price e cooksd if cooksd > 4/74, divider

         | make                 price           e      cooksd |
     12. | Cad. Eldorado       14,500     7271.96    .1492676 |
     13. | Cad. Seville        15,906    5036.348    .3328515 |
     24. | Ford Fiesta          4,389    3164.872    .0638815 |
     28. | Linc. Versailles    13,466    6560.912    .1308004 |
     42. | Plym. Arrow          4,647   -3312.968    .1700736 |

Here we used if e(sample) because Cook's distance is not restricted to the estimation sample by default. It is worth comparing this list with the preceding one.

Finally, we use Welsch distance and the suggested $3\sqrt{k}$ cutoff:

    . predict wd, welsch
    . list make price e wd if abs(wd) > 3*sqrt(5), divider

         | make                 price           e          wd |
     12. | Cad. Eldorado       14,500     7271.96    8.394372 |
     13. | Cad. Seville        15,906    5036.348    12.81125 |
     28. | Linc. Versailles    13,466    6560.912    7.703005 |
     42. | Plym. Arrow          4,647   -3312.968   -8.981481 |

Here we did not need to specify if e(sample) because welsch automatically restricts the prediction to the estimation sample.

DFBETA influence statistics

Syntax for dfbeta

    dfbeta [indepvar [indepvar [...]]] [, stub(name)]

Menu for dfbeta

    Statistics > Linear models and related > Regression diagnostics > DFBETAs

Description for dfbeta

dfbeta will calculate one, more than one, or all the DFBETAs after regress. Although predict will also calculate DFBETAs, predict can do this for only one variable at a time. dfbeta is a convenience tool for those who want to calculate DFBETAs for multiple variables. The names for the new variables created are chosen automatically and begin with the letters _dfbeta_.

Option for dfbeta

stub(name) specifies the leading characters dfbeta uses to name the new variables to be generated. The default is stub(_dfbeta_).

Remarks and examples for dfbeta

DFBETAs are perhaps the most direct influence measure of interest to model builders. DFBETAs focus on one coefficient and measure the difference between the regression coefficient when the ith observation is included and excluded, the difference being scaled by the estimated standard error of the coefficient. Belsley, Kuh, and Welsch (1980, 28) suggest observations with $|\mathrm{DFBETA}_i| > 2/\sqrt{n}$ as deserving special attention, but it is also common practice to use 1 (Bollen and Jackman 1990, 267), meaning that the observation shifted the estimate at least one standard error.
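The rule-of-thumb cutoffs quoted above are simple functions of k and n, so they are easy to tabulate. The sketch below is plain Python rather than Stata; the function name influence_cutoffs is hypothetical, and the DFITS and Welsch values are copied from the listings in example 6. It evaluates the four cutoffs for k = 5 and n = 74 and confirms that every listed observation exceeds its cutoff.

```python
import math

def influence_cutoffs(k, n):
    """Rule-of-thumb cutoffs quoted in the text (k counts the constant)."""
    return {
        "dfits": 2 * math.sqrt(k / n),   # |DFITS| > 2*sqrt(k/n)
        "cooksd": 4 / n,                 # Cook's D > 4/n
        "welsch": 3 * math.sqrt(k),      # |Welsch distance| > 3*sqrt(k)
        "dfbeta": 2 / math.sqrt(n),      # |DFBETA| > 2/sqrt(n)
    }

cut = influence_cutoffs(k=5, n=74)

# Values copied from the example 6 listings.
dfits_listed = {
    "Cad. Eldorado": 0.9564455, "Cad. Seville": 1.356619,
    "Ford Fiesta": 0.5724172, "Linc. Mark V": 0.5200413,
    "Linc. Versailles": 0.8760136, "Plym. Arrow": -0.9384231,
}
welsch_listed = {
    "Cad. Eldorado": 8.394372, "Cad. Seville": 12.81125,
    "Linc. Versailles": 7.703005, "Plym. Arrow": -8.981481,
}

for name, value in cut.items():
    print(f"{name:7s} cutoff = {value:.6f}")

# Margin by which each listed DFITS exceeds the cutoff.
for make, value in dfits_listed.items():
    print(f"{make:17s} {abs(value) - cut['dfits']:+.6f}")
```

Linc. Mark V clears the DFITS cutoff only narrowly, which fits its absence from the Cook's distance and Welsch lists.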
Example 8: DFBETAs influence measure; the dfbeta() option

Using our model of price on weight and foreign##c.mpg, let's first ask which observations have the greatest impact on the determination of the coefficient on 1.foreign. We will use the suggested $2/\sqrt{n}$ cutoff:

    . use http://www.stata-press.com/data/r13/auto, clear
    (1978 Automobile Data)
    . regress price weight foreign##c.mpg
     (output omitted)

    . dfbeta
                       _dfbeta_7: dfbeta(weight)
                       _dfbeta_8: dfbeta(1.foreign)
                       _dfbeta_9: dfbeta(mpg)
                      _dfbeta_10: dfbeta(1.foreign#c.mpg)
    . dfbeta mpg weight
                      _dfbeta_11: dfbeta(weight)
                      _dfbeta_12: dfbeta(mpg)

dfbeta would have made up different names for the new variables. dfbeta never replaces existing variables; it instead makes up a different name, so we need to pay attention to dfbeta's output.

Tests for violation of assumptions

Syntax for estat hettest

    estat hettest [varlist] [, rhs [normal | iid | fstat] mtest[(spec)]]

Menu for estat

    Statistics > Postestimation > Reports and statistics

Description for estat hettest

estat hettest performs three versions of the Breusch–Pagan (1979) and Cook–Weisberg (1983) test for heteroskedasticity. All three versions of this test present evidence against the null hypothesis that $t = 0$ in $\mathrm{Var}(e) = \sigma^2 \exp(zt)$. In the normal version, performed by default, the null hypothesis also includes the assumption that the regression disturbances are independent-normal draws with variance $\sigma^2$. The normality assumption is dropped from the null hypothesis in the iid and fstat versions, which respectively produce the score and F tests discussed in Methods and formulas. If varlist is not specified, the fitted values are used for z. If varlist or the rhs option is specified, the variables specified are used for z.

Options for estat hettest

rhs specifies that tests for heteroskedasticity be performed for the right-hand-side (explanatory) variables of the fitted regression model. The rhs option may be combined with a varlist.

normal, the default, causes estat hettest to compute the original Breusch–Pagan/Cook–Weisberg test, which assumes that the regression disturbances are normally distributed.

iid causes estat hettest to compute the $N R^2$ version of the score test that drops the normality assumption.

fstat causes estat hettest to compute the F-statistic version that drops the normality assumption.

mtest[(spec)] specifies that multiple testing be performed. The argument specifies how p-values are adjusted. The following specifications, spec, are supported:

    bonferroni    Bonferroni's multiple testing adjustment
    holm          Holm's multiple testing adjustment
    sidak         Šidák's multiple testing adjustment
    noadjust      no adjustment is made for multiple testing

mtest may be specified without an argument. This is equivalent to specifying mtest(noadjust); that is, tests for the individual variables should be performed with unadjusted p-values. By default, estat hettest does not perform multiple testing. mtest may not be specified with iid or fstat.

Syntax for estat imtest

    estat imtest [, preserve white]

Menu for estat

    Statistics > Postestimation > Reports and statistics

Description for estat imtest

estat imtest performs an information matrix test for the regression model and an orthogonal decomposition into tests for heteroskedasticity, skewness, and kurtosis due to Cameron and Trivedi (1990); White's test for homoskedasticity against unrestricted forms of heteroskedasticity (1980) is available as an option. White's test is usually similar to the first term of the Cameron–Trivedi decomposition.

Options for estat imtest

preserve specifies that the data in memory be preserved, all variables and cases that are not needed in the calculations be dropped, and at the conclusion the original data be restored. This option is costly for large datasets. However, because estat imtest has to perform an auxiliary regression on $k(k+1)/2$ temporary variables, where k is the number of regressors, it may not be able to perform the test otherwise.

white specifies that White's original heteroskedasticity test also be performed.

Syntax for estat ovtest

    estat ovtest [, rhs]

Menu for estat

    Statistics > Postestimation > Reports and statistics
Remarks and examples for estat hettest, estat imtest, estat ovtest, and estat szroeter

We introduce some regression diagnostic commands that are designed to test for certain violations that rvfplot (see [R] regress postestimation diagnostic plots) less formally attempts to detect. estat ovtest provides Ramsey's test for omitted variables, that is, a pattern in the residuals. estat hettest provides a test for heteroskedasticity: the increasing or decreasing variation in the residuals with fitted values, with respect to the explanatory variables, or with respect to yet other variables. estat hettest implements a score test (Breusch and Pagan 1979; Cook and Weisberg 1983) of the null hypothesis that b = 0 against the alternative hypothesis of multiplicative heteroskedasticity. estat szroeter provides a rank test for heteroskedasticity, which is an alternative to the score test computed by estat hettest. Finally, estat imtest computes an information matrix test, including an orthogonal decomposition into tests for heteroskedasticity, skewness, and kurtosis (Cameron and Trivedi 1990). The heteroskedasticity test computed by estat imtest is similar to the general test for heteroskedasticity that was proposed by White (1980). Cameron and Trivedi (2010, chap. 3) discuss most of these tests and provide more examples.

Example 10: estat ovtest, estat hettest, estat szroeter, and estat imtest

We use our model of price on weight and foreign##c.mpg.

    . use http://www.stata-press.com/data/r13/auto, clear
    (1978 Automobile Data)
    . regress price weight foreign##c.mpg
     (output omitted)
    . estat ovtest
    Ramsey RESET test using powers of the fitted values of price
           Ho:  model has no omitted variables
                     F(3, 66) =      7.77
                     Prob > F =      0.0002
    . estat hettest
    Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
             Ho: Constant variance
             Variables: fitted values of price
             chi2(1)      =     6.50
             Prob > chi2  =   0.0108

Testing for heteroskedasticity in the right-hand-side variables is requested by specifying the rhs option. By specifying the mtest(bonferroni) option, we request that tests be conducted for each of the variables, with a Bonferroni adjustment for the p-values to accommodate our testing multiple hypotheses.
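To make the idea of adjusting p-values for multiple testing concrete, here is a small Python sketch of the three named adjustments in their textbook forms. It is an illustration only: the function names and the example p-values are made up, and whether Stata's mtest() output agrees in every detail (in particular for Holm's method, written here as the usual monotone step-down) is an assumption, not something the text above confirms.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, m * p) for p in pvalues]

def sidak(pvalues):
    """Sidak adjustment: 1 - (1 - p)^m."""
    m = len(pvalues)
    return [1.0 - (1.0 - p) ** m for p in pvalues]

def holm(pvalues):
    """Textbook Holm step-down adjustment, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, candidate)   # keep the sequence monotone
        adjusted[i] = running_max
    return adjusted

# Hypothetical unadjusted p-values for three univariate tests.
p_raw = [0.01, 0.04, 0.03]
print("bonferroni:", bonferroni(p_raw))
print("sidak:     ", sidak(p_raw))
print("holm:      ", holm(p_raw))
```

All three adjustments can only raise a p-value, and Holm's is never larger than Bonferroni's.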
... were not a manual, having found evidence of omitted variables, we would never have run the estat hettest, estat szroeter, and estat imtest commands, at least not until we solved the omitted-variable problem.

Technical note

estat ovtest and estat hettest both perform two flavors of their respective tests. By default, estat ovtest looks for evidence of omitted variables by fitting the original model augmented by $\hat y^2$, $\hat y^3$, and $\hat y^4$, where $\hat y$ denotes the fitted values from the original model. Under the assumption of no misspecification, the coefficients on the powers of the fitted values will be zero. With the rhs option, estat ovtest instead augments the original model with powers (second through fourth) of the explanatory variables (except for dummy variables).

estat hettest, by default, looks for heteroskedasticity by modeling the variance as a function of the fitted values. If, however, we specify a variable or variables, the variance will be modeled as a function of the specified variables. In our example, if we had, a priori, some reason to suspect heteroskedasticity and that the heteroskedasticity is a function of a car's weight, then using a test that focuses on weight would be more powerful than the more general tests such as White's test or the first term in the Cameron–Trivedi decomposition test.

estat hettest, by default, computes the original Breusch–Pagan/Cook–Weisberg test, which includes the assumption of normally distributed errors. Koenker (1981) derived an $N R^2$ version of this test that drops the normality assumption. Wooldridge (2013) gives an F-statistic version that does not require the normality assumption.
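The $N R^2$ version is simple enough to sketch by hand. The Python below is an illustration under stated assumptions, not Stata's code: it handles a single regressor that also serves as the single variable z, the data are made up, and the helper names are hypothetical. It regresses the squared residuals on z, forms $N R^2$, and converts it to a p-value with the $\chi^2(1)$ upper tail, which equals erfc(sqrt(x/2)).

```python
import math

def ols_simple(x, y):
    """OLS for y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b * xbar, b

def r_squared(x, y):
    """R-squared of the simple regression of y on x (with a constant)."""
    a, b = ols_simple(x, y)
    ybar = sum(y) / len(y)
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - sse / sst

def chi2_1_sf(stat):
    """Upper-tail probability of a chi-squared(1) variate."""
    return math.erfc(math.sqrt(stat / 2.0))

def koenker_nr2(x, y):
    """N*R^2 heteroskedasticity statistic with z = x, and its chi2(1) p-value."""
    a, b = ols_simple(x, y)
    e2 = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]   # squared residuals
    stat = len(x) * r_squared(x, e2)                        # auxiliary regression
    return stat, chi2_1_sf(stat)

# Hypothetical data whose spread grows with x.
x_het = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y_het = [1.1, 1.9, 3.2, 3.7, 5.6, 5.1, 8.3, 6.4, 10.9, 8.0]
stat_het, p_het = koenker_nr2(x_het, y_het)
print("N*R^2 =", stat_het, " p =", p_het)

# Tail probability for the chi2(1) statistic reported in example 10.
print("p for chi2(1) = 6.50:", chi2_1_sf(6.50))
```

Because the statistic depends on the squared residuals only through an R-squared, rescaling the dependent variable leaves it unchanged.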
Stored results for estat hettest, estat imtest, and estat ovtest

estat hettest stores the following results for the (multivariate) score test in r():

    Scalars
      r(chi2)       $\chi^2$ test statistic
      r(df)         #df for the asymptotic $\chi^2$ distribution under H0
      r(p)          p-value

estat hettest, fstat stores results for the (multivariate) score test in r():

    Scalars
      r(F)          test statistic
      r(df_m)       #df of the test for the F distribution under H0
      r(df_r)       #df of the residuals for the F distribution under H0
      r(p)          p-value

estat hettest (if mtest is specified) and estat szroeter store the following in r():

    Matrices
      r(mtest)      a matrix of test results, with rows corresponding to the univariate tests
                      mtest[.,1]   $\chi^2$ test statistic
                      mtest[.,2]   #df
                      mtest[.,3]   unadjusted p-value
                      mtest[.,4]   adjusted p-value (if an mtest() adjustment method is specified)
    Macros
      r(mtmethod)   adjustment method for p-values

Data analysts rely on these facts to check informally for the presence of multicollinearity. estat vif, another command for use after regress, calculates the variance inflation factors and tolerances for each of the independent variables.

The output shows the variance inflation factors together with their reciprocals. Some analysts compare the reciprocals with a predetermined tolerance. In the comparison, if the reciprocal of the VIF is smaller than the tolerance, the associated predictor variable is removed from the regression model. However, most analysts rely on informal rules of thumb applied to the VIF; see Chatterjee and Hadi (2012). According to these rules, there is evidence of multicollinearity if

    1. The largest VIF is greater than 10 (some choose a more conservative threshold value of 30).
    2. The mean of all the VIFs is considerably larger than 1.
Example 11: estat vif

We examine a regression model fit using the ubiquitous automobile dataset:

    . use http://www.stata-press.com/data/r13/auto
    (1978 Automobile Data)
    . regress price mpg rep78 trunk headroom length turn displ gear_ratio

          Source |       SS       df       MS              Number of obs =      69
    -------------+------------------------------           F(  8,    60) =    6.33
           Model |   264102049     8  33012756.2           Prob > F      =  0.0000
        Residual |   312694909    60  5211581.82           R-squared     =  0.4579
    -------------+------------------------------           Adj R-squared =  0.3856
           Total |   576796959    68  8482308.22           Root MSE      =  2282.9

    ------------------------------------------------------------------------------
           price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             mpg |    -144.84   82.12751    -1.76   0.083    -309.1195    19.43948
           rep78 |   727.5783   337.6107     2.16   0.035     52.25638      1402.9
           trunk |   44.02061    108.141     0.41   0.685    -172.2935    260.3347
        headroom |  -807.0996   435.5802    -1.85   0.069     -1678.39    64.19062
          length |  -8.688914   34.89848    -0.25   0.804    -78.49626    61.11843
            turn |  -177.9064   137.3455    -1.30   0.200    -452.6383    96.82551
    displacement |   30.73146   7.576952     4.06   0.000      15.5753    45.88762
      gear_ratio |   1500.119   1110.959     1.35   0.182    -722.1303    3722.368
           _cons |   6691.976   7457.906     0.90   0.373    -8226.058    21610.01
    ------------------------------------------------------------------------------

    . estat vif

        Variable |       VIF       1/VIF
    -------------+----------------------
          length |      8.22    0.121614
    displacement |      6.50    0.153860
            turn |      4.85    0.205997
      gear_ratio |      3.45    0.290068
             mpg |      3.03    0.330171
           trunk |      2.88    0.347444
        headroom |      1.80    0.554917
           rep78 |      1.46    0.686147
    -------------+----------------------
        Mean VIF |      4.02

The results are mixed. Although we have no VIFs greater than 10, the mean VIF is greater than 1, though not considerably so. We could continue the investigation of collinearity, but given that other authors advise that collinearity is a problem only when VIFs exist that are greater than 30 (contradicting our rule above), we will not do so here.
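The two rules of thumb can be applied mechanically to the table in example 11. The Python sketch below is illustrative only: the VIFs are the rounded values printed above, the helper vif_from_r2 uses the standard definition VIF = 1/(1 - R²), where R² comes from regressing one predictor on the others, and the variable names are hypothetical.

```python
def vif_from_r2(r2):
    """Standard definition: VIF_j = 1 / (1 - R_j^2)."""
    return 1.0 / (1.0 - r2)

# Rounded VIFs as printed by estat vif in example 11.
vif = {
    "length": 8.22, "displacement": 6.50, "turn": 4.85, "gear_ratio": 3.45,
    "mpg": 3.03, "trunk": 2.88, "headroom": 1.80, "rep78": 1.46,
}

tolerance = {name: 1.0 / v for name, v in vif.items()}   # the 1/VIF column
implied_r2 = {name: 1.0 - 1.0 / v for name, v in vif.items()}
mean_vif = sum(vif.values()) / len(vif)
largest = max(vif, key=vif.get)

rule1 = vif[largest] > 10     # largest VIF greater than 10?
print("largest VIF:", largest, vif[largest], "exceeds 10:", rule1)
print("mean VIF:", round(mean_vif, 2))
for name in vif:
    print(f"{name:13s} 1/VIF = {tolerance[name]:.4f}  implied R2 = {implied_r2[name]:.3f}")
```

The reciprocals computed here differ from the printed 1/VIF column in the fourth decimal place because the printed VIFs are rounded to two decimals.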
    . estat vif

        Variable |       VIF       1/VIF
    -------------+----------------------
          midarm |      1.01    0.992831
           thigh |      1.01    0.992831
    -------------+----------------------
        Mean VIF |      1.01

Note how the coefficients change and how the estimated standard errors for each of the regression coefficients become much smaller. The calculated value of $R^2$ for the overall regression for the subset model does not appreciably decline when we remove the correlated predictor. Removing an independent variable from the model is one way to deal with multicollinearity. Other methods include ridge regression, weighted least squares, and restricting the use of the fitted model to data that follow the same pattern of multicollinearity. In economic studies, it is sometimes possible to estimate the regression coefficients from different subsets of the data by using cross-section and time series.

All examples above demonstrated the use of centered VIFs. As pointed out by Belsley (1991), the centered VIFs may fail to discover collinearity involving the constant term. One solution is to use the uncentered VIFs instead. According to the definition of the uncentered VIFs, the constant is viewed as a legitimate explanatory variable in a regression model, which allows one to obtain the VIF value for the constant term.
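Example 13: estat vif, with strong evidence of collinearity with the constant term

Consider the extreme example in which one of the regressors is highly correlated with the constant. We simulate the data and examine both centered and uncentered VIF diagnostics after fitting a regression model as follows.

    . use http://www.stata-press.com/data/r13/extreme_collin
    . regress y one x z

          Source |       SS       df       MS              Number of obs =     100
    -------------+------------------------------           F(  3,    96) = 2710.27
           Model |  223801.985     3  74600.6617           Prob > F      =  0.0000
        Residual |  2642.42124    96  27.5252213           R-squared     =  0.9883
    -------------+------------------------------           Adj R-squared =  0.9880
           Total |  226444.406    99  2287.31723           Root MSE      =  5.2464

    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             one |  -3.278582    10.5621    -0.31   0.757    -24.24419    17.68702
               x |   2.038696   .0242673    84.01   0.000     1.990526    2.086866
               z |   4.863137   .2681036    18.14   0.000     4.330956    5.395319
           _cons |   9.760075   10.50935     0.93   0.355    -11.10082    30.62097
    ------------------------------------------------------------------------------

    . estat vif

        Variable |       VIF       1/VIF
    -------------+----------------------
               z |      1.03    0.968488
               x |      1.03    0.971307
             one |      1.00    0.995425
    -------------+----------------------
        Mean VIF |      1.02

    . estat esize

    Effect sizes for linear models

          Source |   Eta-Squared     df     [95% Conf. Interval]
    -------------+----------------------------------------------
           Model |      .1235736      3     .0399862    .2041365
           smoke |      .0769345      1     .0193577    .1579213
            race |      .0908394      2     .0233037    .1700334

The omega option causes estat esize to report $\omega^2$ and partial $\omega^2$.

    . estat esize, omega

    Effect sizes for linear models

          Source | Omega-Squared     df     [95% Conf. Interval]
    -------------+----------------------------------------------
           Model |      .1093613      3     .0244184    .1912306
           smoke |      .0719449      1     .0140569    .1533695
            race |      .0810106      2     .0127448    .1610608

Example 15: Calculating effect size for an ANOVA model

We can use estat esize after ANOVA models as well.

    . anova bwt smoke race

                           Number of obs =     189     R-squared     =  0.1236
                           Root MSE      = 687.999     Adj R-squared =  0.1094

          Source |  Partial SS    df       MS           F     Prob > F
    -------------+----------------------------------------------------
           Model |  12346897.6     3  4115632.54       8.69     0.0000
           smoke |  7298536.57     1  7298536.57      15.42     0.0001
            race |   8749453.3     2  4374726.65       9.24     0.0001
        Residual |  87568400.9   185  473342.708
    -------------+----------------------------------------------------
           Total |  99915298.6   188  531464.354

    . estat esize

    Effect sizes for linear models

          Source |   Eta-Squared     df     [95% Conf. Interval]
    -------------+----------------------------------------------
           Model |      .1235736      3     .0399862    .2041365
           smoke |      .0769345      1     .0193577    .1579213
            race |      .0908394      2     .0233037    .1700334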
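The effect sizes in these tables can be reproduced from the ANOVA table in example 15. The sketch below is a hedged illustration in Python: it assumes the usual partial eta-squared, SS_effect / (SS_effect + SS_residual), and assumes the omega-squared adjustment $\omega^2 = \eta^2 - (df_m/df_r)(1 - \eta^2)$. Neither formula is stated in the text above, but both reproduce the printed values to within rounding. The function names are hypothetical.

```python
def partial_eta_squared(ss_effect, ss_residual):
    """Assumed definition: SS_effect / (SS_effect + SS_residual)."""
    return ss_effect / (ss_effect + ss_residual)

def omega_squared(eta2, df_effect, df_residual):
    """Assumed adjustment: eta^2 - (df_m / df_r) * (1 - eta^2)."""
    return eta2 - (df_effect / df_residual) * (1.0 - eta2)

# Partial sums of squares and degrees of freedom from `anova bwt smoke race`.
ss_residual, df_residual = 87568400.9, 185
effects = {
    "Model": (12346897.6, 3),
    "smoke": (7298536.57, 1),
    "race": (8749453.3, 2),
}

results = {}
for name, (ss, df) in effects.items():
    eta2 = partial_eta_squared(ss, ss_residual)
    results[name] = (eta2, omega_squared(eta2, df, df_residual))
    print(f"{name:6s} eta2 = {results[name][0]:.7f}  omega2 = {results[name][1]:.7f}")
```

Under this assumed formula, omega-squared for the Model row equals the adjusted R-squared that anova reports (0.1094).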
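Denote the previously estimated coefficient vector by $b$ and its estimated variance matrix by $V$. predict works by recalling various aspects of the model, such as $b$, and combining that information with the data currently in memory. Let $x_j$ be the jth observation currently in memory, and let $s^2$ be the mean squared error of the regression.

If the user specified weights in regress, then $X'X$ in the following formulas is replaced by $X'DX$, where $D$ is defined in Coefficient estimation and ANOVA table under Methods and formulas in [R] regress.

Let $V = s^2 (X'X)^{-1}$. Let $k$ be the number of independent variables including the intercept, if any, and let $y_j$ be the observed value of the dependent variable.

The predicted value (xb option) is defined as $\hat y_j = x_j b$.

Let $\ell_j$ represent a lower bound for an observation j and $u_j$ represent an upper bound. The probability that $y_j \mid x_j$ would be observed in the interval $(\ell_j, u_j)$, the pr($\ell$,u) option, is

    $P(\ell_j, u_j) = \Pr(\ell_j < x_j b + e_j < u_j) = \Phi\!\left(\frac{u_j - \hat y_j}{s}\right) - \Phi\!\left(\frac{\ell_j - \hat y_j}{s}\right)$

where for the pr($\ell$,u), e($\ell$,u), and ystar($\ell$,u) options, $\ell_j$ and $u_j$ can be anywhere in the range $(-\infty, +\infty)$.

The option e($\ell$,u) computes the expected value of $y_j \mid x_j$ conditional on $y_j \mid x_j$ being in the interval $(\ell_j, u_j)$, that is, when $y_j \mid x_j$ is truncated. It can be expressed as

    $E(\ell_j, u_j) = E(x_j b + e_j \mid \ell_j < x_j b + e_j < u_j) = \hat y_j - s\,\frac{\phi\!\left(\frac{u_j - \hat y_j}{s}\right) - \phi\!\left(\frac{\ell_j - \hat y_j}{s}\right)}{\Phi\!\left(\frac{u_j - \hat y_j}{s}\right) - \Phi\!\left(\frac{\ell_j - \hat y_j}{s}\right)}$

where $\phi$ is the normal density and $\Phi$ is the cumulative normal.

You can also compute ystar($\ell$,u), the expected value of $y_j \mid x_j$, where $y_j$ is assumed censored at $\ell_j$ and $u_j$:

    $y_j^{*} = \begin{cases} \ell_j & \text{if } x_j b + e_j \le \ell_j \\ x_j b + e_j & \text{if } \ell_j < x_j b + e_j < u_j \\ u_j & \text{if } x_j b + e_j \ge u_j \end{cases}$

This computation can be expressed in several ways, but the most intuitive formulation involves a combination of the two statistics just defined:

    $y_j^{*} = P(-\infty, \ell_j)\,\ell_j + P(\ell_j, u_j)\,E(\ell_j, u_j) + P(u_j, +\infty)\,u_j$

A diagonal element of the projection matrix (hat) or (leverage) is given by

    $h_j = x_j (X'X)^{-1} x_j'$

The standard error of the prediction (the stdp option) is defined as $s_{p_j} = \sqrt{x_j V x_j'}$ and can also be written as $s_{p_j} = s\sqrt{h_j}$.

The standard error of the forecast (stdf) is defined as $s_{f_j} = s\sqrt{1 + h_j}$.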
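The formulas above translate almost line by line into code. The following Python sketch is an illustration, not Stata's implementation: the function names are hypothetical, a missing bound is represented by None (standing in for Stata's "."), and the numbers plugged in at the end (a fitted value of 20, a leverage of 0.05) are made up, with s borrowed from the Root MSE reported in example 1.

```python
import math

def norm_cdf(z):
    """Cumulative standard normal, Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def norm_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def pr_interval(yhat, s, lower=None, upper=None):
    """P(l, u) = Phi((u - yhat)/s) - Phi((l - yhat)/s); None means -inf / +inf."""
    hi = 1.0 if upper is None else norm_cdf((upper - yhat) / s)
    lo = 0.0 if lower is None else norm_cdf((lower - yhat) / s)
    return hi - lo

def e_truncated(yhat, s, lower=None, upper=None):
    """E(l, u): expected value of y given that it lies in (l, u)."""
    phi_hi = 0.0 if upper is None else norm_pdf((upper - yhat) / s)
    phi_lo = 0.0 if lower is None else norm_pdf((lower - yhat) / s)
    return yhat - s * (phi_hi - phi_lo) / pr_interval(yhat, s, lower, upper)

def ystar(yhat, s, lower, upper):
    """Expected value of y censored at finite bounds l and u."""
    return (pr_interval(yhat, s, None, lower) * lower
            + pr_interval(yhat, s, lower, upper) * e_truncated(yhat, s, lower, upper)
            + pr_interval(yhat, s, upper, None) * upper)

def stdp(s, h):
    """Standard error of the prediction, s * sqrt(h)."""
    return s * math.sqrt(h)

def stdf(s, h):
    """Standard error of the forecast, s * sqrt(1 + h)."""
    return s * math.sqrt(1.0 + h)

s_demo = 3.4071   # Root MSE from example 1
yhat_demo = 20.0  # hypothetical fitted value
h_demo = 0.05     # hypothetical leverage

print("pr(20,30):", pr_interval(yhat_demo, s_demo, 20.0, 30.0))
print("e(20,30): ", e_truncated(yhat_demo, s_demo, 20.0, 30.0))
print("ystar:    ", ystar(yhat_demo, s_demo, 20.0, 30.0))
print("stdp, stdf:", stdp(s_demo, h_demo), stdf(s_demo, h_demo))
```

Because $1 + h_j > h_j$ for any leverage, stdf always exceeds stdp, which is the "by construction" remark made in the description of the stdf option.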
Special-interest postestimation commands

The omitted-variable test (Ramsey 1969) reported by estat ovtest fits the regression $y_i = x_i b + z_i t + u_i$ and then performs a standard F test of $t = 0$. The default test uses $z_i = (\hat y_i^2, \hat y_i^3, \hat y_i^4)$. If rhs is specified, $z_i = (x_{1i}^2, x_{1i}^3, x_{1i}^4, x_{2i}^2, \ldots, x_{mi}^4)$. In either case, the variables are normalized to have minimum 0 and maximum 1 before powers are calculated.

The test for heteroskedasticity (Breusch and Pagan 1979; Cook and Weisberg 1983) models $\mathrm{Var}(e_i) = \sigma^2 \exp(zt)$, where $z$ is a variable list specified by the user, the list of right-hand-side variables, or the fitted values $x\hat\beta$. The test is of $t = 0$. Mechanically, estat hettest fits the augmented regression $\hat e_i^2 / \hat\sigma^2 = a + z_i t + v_i$.

The original Breusch–Pagan/Cook–Weisberg version of the test assumes that the $e_i$ are normally distributed under the null hypothesis, which implies that the score test statistic $S$ is equal to the model sum of squares from the augmented regression divided by 2. Under the null hypothesis, $S$ has the $\chi^2$ distribution with $m$ degrees of freedom, where $m$ is the number of columns of $z$.

Koenker (1981) derived a score test of the null hypothesis that $t = 0$ under the assumption that the $e_i$ are independent and identically distributed (i.i.d.). Koenker showed that $S = N R^2$ has a large-sample $\chi^2$ distribution with $m$ degrees of freedom, where $N$ is the number of observations, $R^2$ is the R-squared in the augmented regression, and $m$ is the number of columns of $z$. estat hettest, iid produces this version of the test.

Wooldridge (2013) showed that an F test of $t = 0$ in the augmented regression can also be used under the assumption that the $e_i$ are i.i.d. estat hettest, fstat produces this version of the test.

Szroeter's class of tests for homoskedasticity against the alternative that the residual variance increases in some variable $x$ is defined in terms of

    $H = \frac{\sum_{i=1}^{n} h(x_i)\, e_i^2}{\sum_{i=1}^{n} e_i^2}$

where $h(x)$ is some weight function that increases in $x$ (Szroeter 1978). $H$ is a weighted average of the $h(x)$, with the squared residuals serving as weights. Under homoskedasticity, $H$ should be approximately equal to the unweighted average of $h(x)$. Large values of $H$ suggest that $e_i^2$ tends to be large where $h(x)$ is large; that is, the variance indeed increases in $x$, whereas small values of $H$ suggest that the variance actually decreases in $x$. estat szroeter uses $h(x_i) = \mathrm{rank}(x_i \text{ in } x_1 \ldots x_n)$; see Judge et al. [1985, 452] for details. estat szroeter displays a normalized version of $H$,

    $Q = \sqrt{\frac{6n}{n^2 - 1}}\; H$

which is approximately $N(0, 1)$ distributed under the null (homoskedasticity).

estat hettest and estat szroeter provide adjustments of p-values for multiple testing. The supported methods are described in [R] test.

estat imtest performs the information matrix test for the regression model, as well as an orthogonal decomposition into tests for heteroskedasticity $\delta_1$, nonnormal skewness $\delta_2$, and nonnormal kurtosis $\delta_3$ (Cameron and Trivedi 1990; Long and Trivedi 1993). The decomposition is obtained via three auxiliary regressions. Let $e$ be the regression residuals, $\hat\sigma^2$ be the maximum likelihood estimate of $\sigma^2$ in the regression, $n$ be the number of observations, $X$ be the set of $k$ variables specified with estat imtest, and $R^2_{un}$ be the uncentered $R^2$ from a regression. $\delta_1$ is obtained as $n R^2_{un}$ from a regression of $e^2 - \hat\sigma^2$ on the cross products of the variables in $X$. $\delta_2$ is computed as $n R^2_{un}$ from a regression of $e^3 - 3\hat\sigma^2 e$ on $X$. Finally, $\delta_3$ is obtained as $n R^2_{un}$ from a regression of $e^4 - 6\hat\sigma^2 e^2 - 3\hat\sigma^4$ [...]

Kutner, M. H., C. J. Nachtsheim, and J. Neter. 2004. Applied Linear Regression Models. 4th ed. New York: McGraw–Hill/Irwin.

Lindsey, C., and S. J. Sheather. 2010a. Optimal power transformation via inverse response plots. Stata Journal 10: 200–214.
Lindsey, C., and S. J. Sheather. 2010b. Model fit assessment via marginal model plots. Stata Journal 10: 215–225.

Long, J. S., and J. Freese. 2000. sg145: Scalar measures of fit for regression models. Stata Technical Bulletin 56: 34–40. Reprinted in Stata Technical Bulletin Reprints, vol. 10, pp. 197–205. College Station, TX: Stata Press.

Long, J. S., and P. K. Trivedi. 1993. Some specification tests for the linear regression model. Sociological Methods and Research 21: 161–204. Reprinted in Testing Structural Equation Models, ed. K. A. Bollen and J. S. Long, pp. 66–110. Newbury Park, CA: Sage.

Peracchi, F. 2001. Econometrics. Chichester, UK: Wiley.

Ramsey, J. B. 1969. Tests for specification errors in classical linear least-squares regression analysis. Journal of the Royal Statistical Society, Series B 31: 350–371.

Ramsey, J. B., and P. Schmidt. 1976. Some further results on the use of OLS and BLUS residuals in specification error tests. Journal of the American Statistical Association 71: 389–390.

Rousseeuw, P. J., and A. M. Leroy. 1987. Robust Regression and Outlier Detection. New York: Wiley.

Smithson, M. 2001. Correct confidence intervals for various regression effect sizes and parameters: The importance of noncentral distributions in computing intervals. Educational and Psychological Measurement 61: 605–632.

Szroeter, J. 1978. A class of parametric tests for heteroscedasticity in linear econometric models. Econometrica 46: 1311–1327.

Thompson, B. 2006. Foundations of Behavioral Statistics: An Insight-Based Approach. New York: Guilford Press.

Velleman, P. F. 1986. Comment [on Chatterjee and Hadi 1986]. Statistical Science 1: 412–413.

Velleman, P. F., and R. E. Welsch. 1981. Efficient computing of regression diagnostics. American Statistician 35: 234–242.

Weesie, J. 2001. sg161: Analysis of the turning point of a quadratic specification. Stata Technical Bulletin 60: 18–20. Reprinted in Stata Technical Bulletin Reprints, vol. 10, pp. 273–277. College Station, TX: Stata Press.

Weisberg, S. 2005. Applied Linear Regression. 3rd ed. New York: Wiley.

Welsch, R. E. 1982. Influence functions and regression diagnostics. In Modern Data Analysis, ed. R. L. Launer and A. F. Siegel, 149–169. New York: Academic Press.

Welsch, R. E. 1986. Comment [on Chatterjee and Hadi 1986]. Statistical Science 1: 403–405.

Welsch, R. E., and E. Kuh. 1977. Linear Regression Diagnostics. Technical Report 923-77, Massachusetts Institute of Technology, Cambridge, MA.

White, H. L., Jr. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817–838.

Wooldridge, J. M. 2013. Introductory Econometrics: A Modern Approach. 5th ed. Mason, OH: South-Western.

Also see

    [R] regress — Linear regression
    [R] regress postestimation diagnostic plots — Postestimation plots for regress
    [R] regress postestimation time series — Postestimation tools for regress with time series
    [U] 20 Estimation and postestimation commands
