regress postestimation: Postestimation tools for regress

The following standard postestimation commands are also available:

    Command           Description
    ------------------------------------------------------------------------
    contrast          contrasts and ANOVA-style joint tests of estimates
    estat ic          Akaike's and Schwarz's Bayesian information criteria
                        (AIC and BIC)
    estat summarize   summary statistics for the estimation sample
    estat vce         variance-covariance matrix of the estimators (VCE)
    estat (svy)       postestimation statistics for survey data
    estimates         cataloging estimation results
    forecast (1)      dynamic forecasts and simulations
    hausman           Hausman's specification test
    lincom            point estimates, standard errors, testing, and inference
                        for linear combinations of coefficients
    linktest          link test for model specification
    lrtest (2)        likelihood-ratio test
    margins           marginal means, predictive margins, marginal effects,
                        and average marginal effects
    marginsplot       graph the results from margins (profile plots,
                        interaction plots, etc.)
    nlcom             point estimates, standard errors, testing, and inference
                        for nonlinear combinations of coefficients
    predict           predictions, residuals, influence statistics, and other
                        diagnostic measures
    predictnl         point estimates, standard errors, testing, and inference
                        for generalized predictions
    pwcompare         pairwise comparisons of estimates
    suest             seemingly unrelated estimation
    test              Wald tests of simple and composite linear hypotheses
    testnl            Wald tests of nonlinear hypotheses
    ------------------------------------------------------------------------
    (1) forecast is not appropriate with mi or svy estimation results.
    (2) lrtest is not appropriate with svy estimation results.
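As a brief illustration of a few commands from the table, the following do-file sketch fits the mpg model used later in this entry and runs some of them. It assumes that the auto dataset shipped with Stata (loaded with sysuse auto) is the same 1978 Automobile Data that the examples load from the Stata Press website; the particular selection of commands is only illustrative.

```stata
* Sketch: a few standard postestimation commands after regress
sysuse auto, clear
regress mpg weight foreign

* Information criteria (AIC and BIC)
estat ic

* Variance-covariance matrix of the estimators
estat vce

* Wald test that both slope coefficients are zero
test weight foreign

* Linear combination: effect of a 1,000-lb. increase in weight
lincom 1000*weight

* Predictive margins at foreign = 0 and foreign = 1
margins, at(foreign=(0 1))
```

Because the Wald test covers every slope in the model, its F statistic should coincide with the overall F reported in the regress header.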
rstandard calculates the standardized residuals.

rstudent calculates the Studentized (jackknifed) residuals.

cooksd calculates the Cook's D influence statistic (Cook 1977).

leverage or hat calculates the diagonal elements of the projection ("hat") matrix.

pr(a,b) calculates $\Pr(a < x_j b + u_j < b)$, the probability that $y_j \mid x_j$ would be observed in the interval $(a, b)$.
    a and b may be specified as numbers or variable names; lb and ub are variable names;
    pr(20,30) calculates $\Pr(20 < x_j b + u_j < 30)$;
    pr(lb,ub) calculates $\Pr(lb < x_j b + u_j < ub)$; and
    pr(20,ub) calculates $\Pr(20 < x_j b + u_j < ub)$.
    a missing ($a \ge .$) means $-\infty$; pr(.,30) calculates $\Pr(-\infty < x_j b + u_j < 30)$; pr(lb,30) calculates $\Pr(-\infty < x_j b + u_j < 30)$ in observations for which $lb \ge .$ and calculates $\Pr(lb < x_j b + u_j < 30)$ elsewhere.
    b missing ($b \ge .$) means $+\infty$; pr(20,.) calculates $\Pr(+\infty > x_j b + u_j > 20)$; pr(20,ub) calculates $\Pr(+\infty > x_j b + u_j > 20)$ in observations for which $ub \ge .$ and calculates $\Pr(20 < x_j b + u_j < ub)$ elsewhere.

e(a,b) calculates $E(x_j b + u_j \mid a < x_j b + u_j < b)$, the expected value of $y_j \mid x_j$ conditional on $y_j \mid x_j$ being in the interval $(a, b)$, meaning that $y_j \mid x_j$ is truncated. a and b are specified as they are for pr().

ystar(a,b) calculates $E(y_j^*)$, where $y_j^* = a$ if $x_j b + u_j \le a$, $y_j^* = b$ if $x_j b + u_j \ge b$, and $y_j^* = x_j b + u_j$ otherwise, meaning that $y_j^*$ is censored. a and b are specified as they are for pr().

dfbeta(varname) calculates the DFBETA for varname, the difference between the regression coefficient when the jth observation is included and excluded, said difference being scaled by the estimated standard error of the coefficient. varname must have been included among the regressors in the previously fitted model. The calculation is automatically restricted to the estimation subsample.

stdp calculates the standard error of the prediction, which can be thought of as the standard error of the predicted expected value or mean for the observation's covariate pattern. The standard error of the prediction is also referred to as the standard error of the fitted value.

stdf calculates the standard error of the forecast, which is the standard error of the point prediction for 1 observation. It is commonly referred to as the standard error of the future or forecast value. By construction, the standard errors produced by stdf are always larger than those produced by stdp; see Methods and formulas.

stdr calculates the standard error of the residuals.

covratio calculates COVRATIO (Belsley, Kuh, and Welsch 1980), a measure of the influence of the jth observation based on considering the effect on the variance-covariance matrix of the estimates. The calculation is automatically restricted to the estimation subsample.

dfits calculates DFITS (Welsch and Kuh 1977) and attempts to summarize the information in the leverage versus residual-squared plot into one statistic. The calculation is automatically restricted to the estimation subsample.

welsch calculates Welsch distance (Welsch 1982) and is a variation on dfits. The calculation is automatically restricted to the estimation subsample.

Now all this is a most unsatisfactory state of affairs. Points with large residuals may, but need not, have a large effect on our results, and points with small residuals may still have a large effect. Points with high leverage may, but need not, have a large effect on our results, and points with low leverage may still have a large effect. Can you not identify the influential points and simply have the computer list them for you? You can, but you will have to define what you mean by "influential".

"Influential" is defined with respect to some statistic. For instance, you might ask which points in your data have a large effect on your estimated a, which points have a large effect on your estimated b, which points have a large effect on your estimated standard error of b, and so on, but do not be surprised when the answers to these questions are different. In any case, obtaining such measures is not difficult: all you have to do is fit the regression excluding each observation one at a time and record the statistic of interest, which, in the day of the modern computer, is not too onerous. Moreover, you can save considerable computer time by doing algebra ahead of time and working out formulas that will calculate the same answers as if you ran each of the regressions. (Ignore the question of pairs of observations that, together, exert undue influence, and triples, and so on, which remains largely unsolved and for which the brute force fit-every-possible-regression procedure is not a viable alternative.)

Fitted values and residuals

Typing predict newvar with no options creates newvar containing the fitted values. Typing predict newvar, resid creates newvar containing the residuals.

Example 1

Continuing with example 1 from [R] regress, we wish to fit the following model:

$$\text{mpg} = \beta_0 + \beta_1\,\text{weight} + \beta_2\,\text{foreign} + \epsilon$$

    . use http://www.stata-press.com/data/r13/auto
    (1978 Automobile Data)
    . regress mpg weight foreign

          Source |       SS       df       MS              Number of obs =      74
    -------------+------------------------------           F(  2,    71) =   69.75
           Model |   1619.2877     2  809.643849           Prob > F      =  0.0000
        Residual |  824.171761    71   11.608053           R-squared     =  0.6627
    -------------+------------------------------           Adj R-squared =  0.6532
           Total |  2443.45946    73  33.4720474           Root MSE      =  3.4071

             mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          weight |  -.0065879   .0006371   -10.34   0.000    -.0078583   -.0053175
         foreign |  -1.650029   1.075994    -1.53   0.130      -3.7955    .4954422
           _cons |    41.6797   2.165547    19.25   0.000     37.36172    45.99768

That done, we can now obtain the predicted values from the regression. We will store them in a new variable called pmpg by typing predict pmpg. Because predict produces no output, we will follow that by summarizing our predicted and observed values.

Standardized and Studentized residuals

The terms "standardized" and "Studentized" residuals have meant different things to different authors. In Stata, predict defines the standardized residual as $\hat{e}_i = e_i / (s\sqrt{1 - h_i})$ and the Studentized residual as $r_i = e_i / (s_{(i)}\sqrt{1 - h_i})$, where $s_{(i)}$ is the root mean squared error of a regression with the ith observation removed. Stata's definition of the Studentized residual is the same as the one given in Bollen and Jackman (1990, 264) and is what Chatterjee and Hadi (1988, 74) call the "externally Studentized" residual. Stata's "standardized" residual is the same as what Chatterjee and Hadi (1988, 74) call the "internally Studentized" residual.

Standardized and Studentized residuals are attempts to adjust residuals for their standard errors. Although the $\epsilon_i$ theoretical residuals are homoskedastic by assumption (that is, they all have the same variance), the calculated $e_i$ are not. In fact,

$$\mathrm{Var}(e_i) = \sigma^2 (1 - h_i)$$

where $h_i$ are the leverage measures obtained from the diagonal elements of the hat matrix. Thus observations with the greatest leverage have corresponding residuals with the smallest variance.

Standardized residuals use the root mean squared error of the regression for $\sigma$. Studentized residuals use the root mean squared error of a regression omitting the observation in question for $\sigma$. In general, Studentized residuals are preferable to standardized residuals for purposes of outlier identification. Studentized residuals can be interpreted as the t statistic for testing the significance of a dummy variable equal to 1 in the observation in question and 0 elsewhere (Belsley, Kuh, and Welsch 1980). Such a dummy variable would effectively absorb the observation and so remove its influence in determining the other coefficients in the model. Caution must be exercised here, however, because of the simultaneous testing problem. You cannot simply list the residuals that would be individually significant at the 5% level; their joint significance would be far less (their joint significance level would be far greater).
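The two definitions can be checked numerically. The sketch below rebuilds both residuals from the raw residuals, the leverage, and the root MSE, and compares them with what predict returns. It assumes the auto dataset shipped with Stata (sysuse auto) and uses the mpg model from example 1. The closed-form expression for $s_{(i)}$ is a standard leave-one-out identity, $s_{(i)}^2 = \{(n-k)s^2 - e_i^2/(1-h_i)\}/(n-k-1)$, supplied here as an assumption rather than quoted from this entry; the variable names are arbitrary.

```stata
* Sketch: standardized and Studentized residuals by hand
sysuse auto, clear
quietly regress mpg weight foreign

predict double res  if e(sample), residuals
predict double h    if e(sample), leverage
predict double rsta if e(sample), rstandard
predict double rstu if e(sample), rstudent

* Standardized: e_i / (s * sqrt(1 - h_i)), with s = e(rmse)
generate double rsta_hand = res / (e(rmse) * sqrt(1 - h))

* Leave-one-out root MSE s_(i) from the identity stated above
* (e(df_r) holds n - k, the residual degrees of freedom)
generate double s_i = sqrt((e(df_r)*e(rmse)^2 - res^2/(1 - h)) / (e(df_r) - 1))

* Studentized: e_i / (s_(i) * sqrt(1 - h_i))
generate double rstu_hand = res / (s_i * sqrt(1 - h))

* The hand-computed columns should agree with predict's
summarize rsta rsta_hand rstu rstu_hand
```

Working from the identity avoids refitting the regression 74 times, which is the "algebra ahead of time" shortcut mentioned earlier.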
Example 5: standardized and Studentized residuals

In the Terminology section of Remarks and examples for predict, we distinguished residuals from leverage and speculated on the impact of an observation with a small residual but large leverage. If we adjust the residuals for their standard errors, however, the adjusted residual would be (relatively) larger and perhaps large enough so that we could simply examine the adjusted residuals. Taking our price on weight and foreign##c.mpg model from example 1 of [R] regress postestimation diagnostic plots, we can obtain the in-sample standardized and Studentized residuals by typing

    . use http://www.stata-press.com/data/r13/auto, clear
    (1978 Automobile Data)
    . regress price weight foreign##c.mpg
      (output omitted)
    . predict esta if e(sample), rstandard
    . predict estu if e(sample), rstudent

Example 6: DFITS influence measure

Continuing with our model of price on weight and foreign##c.mpg, we can obtain the DFITS influence measure:

    . predict e if e(sample), resid
    . predict dfits, dfits

We did not specify if e(sample) in computing the DFITS statistic. DFITS is available only over the estimation sample, so specifying if e(sample) would have been redundant. It would have done no harm, but it would not have changed the results.

Our model has k = 5 independent variables (k includes the constant) and n = 74 observations; following the $2\sqrt{k/n}$ cutoff advice, we type

    . list make price e dfits if abs(dfits) > 2*sqrt(5/74), divider

         | make                price           e       dfits |
         |---------------------------------------------------|
     12. | Cad. Eldorado      14,500     7271.96    .9564455 |
     13. | Cad. Seville       15,906    5036.348    1.356619 |
     24. | Ford Fiesta         4,389    3164.872    .5724172 |
     27. | Linc. Mark V       13,594    3109.193    .5200413 |
     28. | Linc. Versailles   13,466    6560.912    .8760136 |
     42. | Plym. Arrow         4,647   -3312.968   -.9384231 |

We calculate Cook's distance and list the observations greater than the suggested 4/n cutoff:

    . predict cooksd if e(sample), cooksd
    . list make price e cooksd if cooksd > 4/74, divider

         | make                price           e      cooksd |
         |---------------------------------------------------|
     12. | Cad. Eldorado      14,500     7271.96    .1492676 |
     13. | Cad. Seville       15,906    5036.348    .3328515 |
     24. | Ford Fiesta         4,389    3164.872    .0638815 |
     28. | Linc. Versailles   13,466    6560.912    .1308004 |
     42. | Plym. Arrow         4,647   -3312.968    .1700736 |

Here we used if e(sample) because Cook's distance is not restricted to the estimation sample by default. It is worth comparing this list with the preceding one.

Finally, we use Welsch distance and the suggested $3\sqrt{k}$ cutoff:

    . predict wd, welsch
    . list make price e wd if abs(wd) > 3*sqrt(5), divider

         | make                price           e          wd |
         |---------------------------------------------------|
     12. | Cad. Eldorado      14,500     7271.96    8.394372 |
     13. | Cad. Seville       15,906    5036.348    12.81125 |
     28. | Linc. Versailles   13,466    6560.912    7.703005 |
     42. | Plym. Arrow         4,647   -3312.968   -8.981481 |

Here we did not need to specify if e(sample) because welsch automatically restricts the prediction to the estimation sample.

DFBETA influence statistics

Syntax for dfbeta

    dfbeta [indepvar [indepvar [...]]] [, stub(name)]

Menu for dfbeta

    Statistics > Linear models and related > Regression diagnostics > DFBETAs

Description for dfbeta

dfbeta will calculate one, more than one, or all the DFBETAs after regress. Although predict will also calculate DFBETAs, predict can do this for only one variable at a time. dfbeta is a convenience tool for those who want to calculate DFBETAs for multiple variables. The names for the new variables created are chosen automatically and begin with the letters _dfbeta_.

Option for dfbeta

stub(name) specifies the leading characters dfbeta uses to name the new variables to be generated. The default is stub(_dfbeta_).

Remarks and examples for dfbeta

DFBETAs are perhaps the most direct influence measure of interest to model builders. DFBETAs focus on one coefficient and measure the difference between the regression coefficient when the ith observation is included and excluded, the difference being scaled by the estimated standard error of the coefficient. Belsley, Kuh, and Welsch (1980, 28) suggest observations with $|\mathrm{DFBETA}_i| > 2/\sqrt{n}$ as deserving special attention, but it is also common practice to use 1 (Bollen and Jackman 1990, 267), meaning that the observation shifted the estimate at least one standard error.
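The included-versus-excluded definition can be reproduced by brute force for a single observation. The sketch below assumes the auto dataset shipped with Stata (sysuse auto) and picks observation 13 purely for illustration. It also assumes that the scaling standard error uses the root MSE of the regression that omits the observation, which is the Belsley, Kuh, and Welsch convention; the scalar names are hypothetical.

```stata
* Sketch: DFBETA for weight via predict and via one leave-one-out refit
sysuse auto, clear
quietly regress price weight foreign##c.mpg

* DFBETA from predict, plus full-sample quantities needed for the check
predict double dfb_w, dfbeta(weight)
scalar b_full  = _b[weight]
scalar se_full = _se[weight]
scalar s_full  = e(rmse)
scalar cutoff  = 2/sqrt(e(N))

* Observations beyond the 2/sqrt(n) cutoff
list make price dfb_w if abs(dfb_w) > scalar(cutoff) & !missing(dfb_w)

* Refit without observation 13 and rescale the full-sample standard
* error by the ratio of leave-one-out to full-sample root MSE
quietly regress price weight foreign##c.mpg if _n != 13
scalar dfb13 = (b_full - _b[weight]) / (se_full * e(rmse) / s_full)

display "brute force: " dfb13 "   predict: " dfb_w[13]
```

If the scaling assumption holds, the two displayed numbers agree up to rounding, and the same refit-and-compare step could be looped over all observations at the cost of n regressions.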
Example 8: DFBETAs influence measure; the dfbeta() option

Using our model of price on weight and foreign##c.mpg, let's first ask which observations have the greatest impact on the determination of the coefficient on 1.foreign. We will use the suggested $2/\sqrt{n}$ cutoff:

    . use http://www.stata-press.com/data/r13/auto, clear
    (1978 Automobile Data)
    . regress price weight foreign##c.mpg
      (output omitted)

    . dfbeta
        _dfbeta_7: dfbeta(weight)
        _dfbeta_8: dfbeta(1.foreign)
        _dfbeta_9: dfbeta(mpg)
       _dfbeta_10: dfbeta(1.foreign#c.mpg)
    . dfbeta mpg weight
       _dfbeta_11: dfbeta(weight)
       _dfbeta_12: dfbeta(mpg)

dfbeta would have made up different names for the new variables. dfbeta never replaces existing variables; it instead makes up a different name, so we need to pay attention to dfbeta's output.

Tests for violation of assumptions

Syntax for estat hettest

    estat hettest [varlist] [, rhs [normal | iid | fstat] mtest[(spec)]]

Menu for estat

    Statistics > Postestimation > Reports and statistics

Description for estat hettest

estat hettest performs three versions of the Breusch-Pagan (1979) and Cook-Weisberg (1983) test for heteroskedasticity. All three versions of this test present evidence against the null hypothesis that $t = 0$ in $\mathrm{Var}(e) = \sigma^2 \exp(zt)$. In the normal version, performed by default, the null hypothesis also includes the assumption that the regression disturbances are independent-normal draws with variance $\sigma^2$. The normality assumption is dropped from the null hypothesis in the iid and fstat versions, which respectively produce the score and F tests discussed in Methods and formulas. If varlist is not specified, the fitted values are used for z. If varlist or the rhs option is specified, the variables specified are used for z.

Options for estat hettest

rhs specifies that tests for heteroskedasticity be performed for the right-hand-side (explanatory) variables of the fitted regression model. The rhs option may be combined with a varlist.

normal, the default, causes estat hettest to compute the original Breusch-Pagan/Cook-Weisberg test, which assumes that the regression disturbances are normally distributed.

iid causes estat hettest to compute the $N R^2$ version of the score test that drops the normality assumption.

fstat causes estat hettest to compute the F-statistic version that drops the normality assumption.

mtest[(spec)] specifies that multiple testing be performed. The argument specifies how p-values are adjusted. The following specifications, spec, are supported:

    bonferroni    Bonferroni's multiple testing adjustment
    holm          Holm's multiple testing adjustment
    sidak         Šidák's multiple testing adjustment
    noadjust      no adjustment is made for multiple testing

mtest may be specified without an argument. This is equivalent to specifying mtest(noadjust); that is, tests for the individual variables should be performed with unadjusted p-values. By default, estat hettest does not perform multiple testing. mtest may not be specified with iid or fstat.

Syntax for estat imtest

    estat imtest [, preserve white]

Menu for estat

    Statistics > Postestimation > Reports and statistics

Description for estat imtest

estat imtest performs an information matrix test for the regression model and an orthogonal decomposition into tests for heteroskedasticity, skewness, and kurtosis due to Cameron and Trivedi (1990); White's test for homoskedasticity against unrestricted forms of heteroskedasticity (1980) is available as an option. White's test is usually similar to the first term of the Cameron-Trivedi decomposition.

Options for estat imtest

preserve specifies that the data in memory be preserved, all variables and cases that are not needed in the calculations be dropped, and at the conclusion the original data be restored. This option is costly for large datasets. However, because estat imtest has to perform an auxiliary regression on $k(k+1)/2$ temporary variables, where k is the number of regressors, it may not be able to perform the test otherwise.

white specifies that White's original heteroskedasticity test also be performed.

Syntax for estat ovtest

    estat ovtest [, rhs]

Menu for estat

    Statistics > Postestimation > Reports and statistics
Remarks and examples for estat hettest, estat imtest, estat ovtest, and estat szroeter

We introduce some regression diagnostic commands that are designed to test for certain violations that rvfplot (see [R] regress postestimation diagnostic plots) less formally attempts to detect. estat ovtest provides Ramsey's test for omitted variables, that is, a pattern in the residuals. estat hettest provides a test for heteroskedasticity, that is, the increasing or decreasing variation in the residuals with fitted values, with respect to the explanatory variables, or with respect to yet other variables. The score test implemented in estat hettest (Breusch and Pagan 1979; Cook and Weisberg 1983) performs a score test of the null hypothesis that b = 0 against the alternative hypothesis of multiplicative heteroskedasticity. estat szroeter provides a rank test for heteroskedasticity, which is an alternative to the score test computed by estat hettest. Finally, estat imtest computes an information matrix test, including an orthogonal decomposition into tests for heteroskedasticity, skewness, and kurtosis (Cameron and Trivedi 1990). The heteroskedasticity test computed by estat imtest is similar to the general test for heteroskedasticity that was proposed by White (1980). Cameron and Trivedi (2010, chap. 3) discuss most of these tests and provide more examples.

Example 10: estat ovtest, estat hettest, estat szroeter, and estat imtest

We use our model of price on weight and foreign##c.mpg.

    . use http://www.stata-press.com/data/r13/auto, clear
    (1978 Automobile Data)
    . regress price weight foreign##c.mpg
      (output omitted)
    . estat ovtest

    Ramsey RESET test using powers of the fitted values of price
           Ho:  model has no omitted variables
                     F(3, 66) =      7.77
                      Prob > F =      0.0002

    . estat hettest

    Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
             Ho: Constant variance
             Variables: fitted values of price

             chi2(1)      =     6.50
             Prob > chi2  =   0.0108

Testing for heteroskedasticity in the right-hand-side variables is requested by specifying the rhs option. By specifying the mtest(bonferroni) option, we request that tests be conducted for each of the variables, with a Bonferroni adjustment for the p-values to accommodate our testing multiple hypotheses.
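A sketch of that request follows, again assuming the auto dataset shipped with Stata (sysuse auto). After running the test, it copies the stored r(mtest) matrix and recomputes the Bonferroni adjustment by hand, taking the adjusted p-value to be min(1, m x p) with m the number of univariate tests. That formula is the usual textbook definition and is stated here as an assumption; the matrix name M is arbitrary.

```stata
* Sketch: per-variable heteroskedasticity tests with Bonferroni adjustment
sysuse auto, clear
quietly regress price weight foreign##c.mpg

estat hettest, rhs mtest(bonferroni)

* Columns of r(mtest): chi2, df, unadjusted p-value, adjusted p-value
matrix M = r(mtest)
matrix list M

* Recompute the adjustment: adjusted p = min(1, m * unadjusted p)
local m = rowsof(M)
forvalues i = 1/`m' {
    display "row `i': by hand " min(1, `m'*M[`i',3]) "   stored " M[`i',4]
}
```

The same pattern applies with mtest(holm) or mtest(sidak); only the adjustment formula changes.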
... were not a manual, having found evidence of omitted variables, we would never have run the estat hettest, estat szroeter, and estat imtest commands, at least not until we solved the omitted-variable problem.

Technical note

estat ovtest and estat hettest both perform two flavors of their respective tests. By default, estat ovtest looks for evidence of omitted variables by fitting the original model augmented by $\hat{y}^2$, $\hat{y}^3$, and $\hat{y}^4$, where $\hat{y}$ denotes the fitted values from the original model. Under the assumption of no misspecification, the coefficients on the powers of the fitted values will be zero. With the rhs option, estat ovtest instead augments the original model with powers (second through fourth) of the explanatory variables (except for dummy variables).

estat hettest, by default, looks for heteroskedasticity by modeling the variance as a function of the fitted values. If, however, we specify a variable or variables, the variance will be modeled as a function of the specified variables. In our example, if we had, a priori, some reason to suspect heteroskedasticity and that the heteroskedasticity is a function of a car's weight, then using a test that focuses on weight would be more powerful than the more general tests such as White's test or the first term in the Cameron-Trivedi decomposition test.

estat hettest, by default, computes the original Breusch-Pagan/Cook-Weisberg test, which includes the assumption of normally distributed errors. Koenker (1981) derived an $N R^2$ version of this test that drops the normality assumption. Wooldridge (2013) gives an F-statistic version that does not require the normality assumption.
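The flavors described in this note can be tried side by side. The sketch below assumes the auto dataset shipped with Stata (sysuse auto) and, to keep the rhs variant simple, uses the mpg on weight and foreign model from example 1 rather than the price model. It also rebuilds the $N R^2$ and F versions from an auxiliary regression of the squared residuals on the fitted values; rescaling the squared residuals by a constant leaves $R^2$ and F unchanged, so the division by the variance estimate is skipped. Variable and scalar names are arbitrary.

```stata
* Sketch: the two flavors of estat ovtest and the versions of estat hettest
sysuse auto, clear
quietly regress mpg weight foreign

* Powers of the fitted values (default), then powers of the regressors
estat ovtest
estat ovtest, rhs

* Variance modeled on the fitted values (default), then on weight only
estat hettest
estat hettest weight

* Auxiliary regression: squared residuals on the fitted values
predict double yhat, xb
predict double res, residuals
generate double res2 = res^2
quietly regress res2 yhat
scalar nr2  = e(N)*e(r2)
scalar fhat = e(F)

* Refit the original model so that estat refers to it again
quietly regress mpg weight foreign
estat hettest, iid
display "N*R2 by hand = " scalar(nr2)
estat hettest, fstat
display "F by hand    = " scalar(fhat)
```

Refitting before the last two estat calls matters because estat always works on the most recent estimation results.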
Stored results for estat hettest, estat imtest, and estat ovtest

estat hettest stores the following results for the (multivariate) score test in r():

    Scalars
      r(chi2)      $\chi^2$ test statistic
      r(df)        #df for the asymptotic $\chi^2$ distribution under H0
      r(p)         p-value

estat hettest, fstat stores results for the (multivariate) score test in r():

    Scalars
      r(F)         test statistic
      r(df_m)      #df of the test for the F distribution under H0
      r(df_r)      #df of the residuals for the F distribution under H0
      r(p)         p-value

estat hettest (if mtest is specified) and estat szroeter store the following in r():

    Matrices
      r(mtest)     a matrix of test results, with rows corresponding to the univariate tests
                     mtest[.,1]   $\chi^2$ test statistic
                     mtest[.,2]   #df
                     mtest[.,3]   unadjusted p-value
                     mtest[.,4]   adjusted p-value (if an mtest() adjustment method is specified)
    Macros
      r(mtmethod)  adjustment method for p-values

Data analysts rely on these facts to check informally for the presence of multicollinearity. estat vif, another command for use after regress, calculates the variance inflation factors and tolerances for each of the independent variables.

The output shows the variance inflation factors together with their reciprocals. Some analysts compare the reciprocals with a predetermined tolerance. In the comparison, if the reciprocal of the VIF is smaller than the tolerance, the associated predictor variable is removed from the regression model. However, most analysts rely on informal rules of thumb applied to the VIF; see Chatterjee and Hadi (2012). According to these rules, there is evidence of multicollinearity if

    1. The largest VIF is greater than 10 (some choose a more conservative threshold value of 30).
    2. The mean of all the VIFs is considerably larger than 1.
Example 11: estat vif

We examine a regression model fit using the ubiquitous automobile dataset:

    . use http://www.stata-press.com/data/r13/auto
    (1978 Automobile Data)
    . regress price mpg rep78 trunk headroom length turn displ gear_ratio

          Source |       SS       df       MS              Number of obs =      69
    -------------+------------------------------           F(  8,    60) =    6.33
           Model |   264102049     8  33012756.2           Prob > F      =  0.0000
        Residual |   312694909    60  5211581.82           R-squared     =  0.4579
    -------------+------------------------------           Adj R-squared =  0.3856
           Total |   576796959    68  8482308.22           Root MSE      =  2282.9

           price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             mpg |    -144.84   82.12751    -1.76   0.083    -309.1195    19.43948
           rep78 |   727.5783   337.6107     2.16   0.035     52.25638      1402.9
           trunk |   44.02061    108.141     0.41   0.685    -172.2935    260.3347
        headroom |  -807.0996   435.5802    -1.85   0.069     -1678.39    64.19062
          length |  -8.688914   34.89848    -0.25   0.804    -78.49626    61.11843
            turn |  -177.9064   137.3455    -1.30   0.200    -452.6383    96.82551
    displacement |   30.73146   7.576952     4.06   0.000      15.5753    45.88762
      gear_ratio |   1500.119   1110.959     1.35   0.182    -722.1303    3722.368
           _cons |   6691.976   7457.906     0.90   0.373    -8226.058    21610.01

    . estat vif

        Variable |       VIF       1/VIF
    -------------+----------------------
          length |      8.22    0.121614
    displacement |      6.50    0.153860
            turn |      4.85    0.205997
      gear_ratio |      3.45    0.290068
             mpg |      3.03    0.330171
           trunk |      2.88    0.347444
        headroom |      1.80    0.554917
           rep78 |      1.46    0.686147
    -------------+----------------------
        Mean VIF |      4.02

The results are mixed. Although we have no VIFs greater than 10, the mean VIF is greater than 1, though not considerably so. We could continue the investigation of collinearity, but given that other authors advise that collinearity is a problem only when VIFs exist that are greater than 30 (contradicting our rule above), we will not do so here.
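Each entry in the VIF column can be reproduced with one auxiliary regression. The sketch below assumes the auto dataset shipped with Stata (sysuse auto) and relies on the standard definition VIF = 1/(1 - R²), where R² comes from regressing the variable in question on the remaining regressors; that definition is not spelled out in the text above.

```stata
* Sketch: the VIF for length from an auxiliary regression
sysuse auto, clear
quietly regress price mpg rep78 trunk headroom length turn displacement gear_ratio
estat vif

* Regress length on the other regressors over the same estimation sample
quietly regress length mpg rep78 trunk headroom turn displacement gear_ratio if e(sample)

* VIF = 1/(1 - R-squared); 1/VIF is the tolerance
display "VIF(length)   = " %5.2f 1/(1 - e(r2))
display "1/VIF(length) = " %8.6f 1 - e(r2)
```

The two displayed values should match the length row of the estat vif table, which shows that the tolerance column is simply 1 - R² from the auxiliary regression.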
    . estat vif

        Variable |       VIF       1/VIF
    -------------+----------------------
          midarm |      1.01    0.992831
           thigh |      1.01    0.992831
    -------------+----------------------
        Mean VIF |      1.01

Note how the coefficients change and how the estimated standard errors for each of the regression coefficients become much smaller. The calculated value of $R^2$ for the overall regression for the subset model does not appreciably decline when we remove the correlated predictor. Removing an independent variable from the model is one way to deal with multicollinearity. Other methods include ridge regression, weighted least squares, and restricting the use of the fitted model to data that follow the same pattern of multicollinearity. In economic studies, it is sometimes possible to estimate the regression coefficients from different subsets of the data by using cross-section and time series.

All examples above demonstrated the use of centered VIFs. As pointed out by Belsley (1991), the centered VIFs may fail to discover collinearity involving the constant term. One solution is to use the uncentered VIFs instead. According to the definition of the uncentered VIFs, the constant is viewed as a legitimate explanatory variable in a regression model, which allows one to obtain the VIF value for the constant term.
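Requesting the uncentered version is a matter of adding the uncentered option to estat vif. The sketch below assumes the auto dataset shipped with Stata (sysuse auto); the model is chosen only for illustration and is not one of the examples in this entry.

```stata
* Sketch: centered versus uncentered VIFs on an illustrative model
sysuse auto, clear
quietly regress price mpg weight

* Centered VIFs (the default)
estat vif

* Uncentered VIFs, which also report a value for the constant term
estat vif, uncentered
```

Comparing the two tables shows whether any regressor is close to collinear with the constant, something the centered version cannot reveal.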
Example 13: estat vif, with strong evidence of collinearity with the constant term

Consider the extreme example in which one of the regressors is highly correlated with the constant. We simulate the data and examine both centered and uncentered VIF diagnostics after fitting the regression model as follows.

    . use http://www.stata-press.com/data/r13/extreme_collin
    . regress y one x z

          Source |       SS       df       MS              Number of obs =     100
    -------------+------------------------------           F(  3,    96) = 2710.27
           Model |  223801.985     3  74600.6617           Prob > F      =  0.0000
        Residual |  2642.42124    96  27.5252213           R-squared     =  0.9883
    -------------+------------------------------           Adj R-squared =  0.9880
           Total |  226444.406    99  2287.31723           Root MSE      =  5.2464

               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             one |  -3.278582    10.5621    -0.31   0.757    -24.24419    17.68702
               x |   2.038696   .0242673    84.01   0.000     1.990526    2.086866
               z |   4.863137   .2681036    18.14   0.000     4.330956    5.395319
           _cons |   9.760075   10.50935     0.93   0.355    -11.10082    30.62097

    . estat vif

        Variable |       VIF       1/VIF
    -------------+----------------------
               z |      1.03    0.968488
               x |      1.03    0.971307
             one |      1.00    0.995425
    -------------+----------------------
        Mean VIF |      1.02

    . estat esize

    Effect sizes for linear models

          Source | Eta-Squared     df     [95% Conf. Interval]
    -------------+--------------------------------------------
           Model |   .1235736       3     .0399862    .2041365
                 |
           smoke |   .0769345       1     .0193577    .1579213
            race |   .0908394       2     .0233037    .1700334

The omega option causes estat esize to report $\omega^2$ and partial $\omega^2$.

    . estat esize, omega

    Effect sizes for linear models

          Source | Omega-Squared   df     [95% Conf. Interval]
    -------------+--------------------------------------------
           Model |   .1093613       3     .0244184    .1912306
                 |
           smoke |   .0719449       1     .0140569    .1533695
            race |   .0810106       2     .0127448    .1610608

Example 15: Calculating effect size for an ANOVA model

We can use estat esize after ANOVA models as well.

    . anova bwt smoke race

                             Number of obs =     189     R-squared     =  0.1236
                             Root MSE      = 687.999     Adj R-squared =  0.1094

          Source |  Partial SS    df       MS           F     Prob > F
    -------------+----------------------------------------------------
           Model |  12346897.6     3  4115632.54       8.69     0.0000
                 |
           smoke |  7298536.57     1  7298536.57      15.42     0.0001
            race |   8749453.3     2  4374726.65       9.24     0.0001
                 |
        Residual |  87568400.9   185  473342.708
    -------------+----------------------------------------------------
           Total |  99915298.6   188  531464.354

    . estat esize

    Effect sizes for linear models

          Source | Eta-Squared     df     [95% Conf. Interval]
    -------------+--------------------------------------------
           Model |   .1235736       3     .0399862    .2041365
                 |
           smoke |   .0769345       1     .0193577    .1579213
            race |   .0908394       2     .0233037    .1700334
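The point estimates in these tables can be recovered from the sums of squares in the anova output with a little arithmetic. The sketch below copies those sums of squares into scalars and assumes the usual definitions: eta-squared for the model is SS(model)/SS(total), and partial eta-squared is SS(effect)/{SS(effect) + SS(residual)}. The omega-squared line uses eta² - (df_effect/df_residual)(1 - eta²), an expression that is consistent with the displayed values but is an assumption here rather than a formula quoted from this entry. No dataset is needed, and the scalar names are arbitrary.

```stata
* Sketch: effect sizes recomputed from the anova table above
scalar ss_model = 12346897.6
scalar ss_smoke = 7298536.57
scalar ss_race  = 8749453.3
scalar ss_resid = 87568400.9
scalar ss_total = 99915298.6
scalar df_resid = 185

* Eta-squared for the model: SS_model / SS_total (the R-squared)
scalar eta_model = ss_model/ss_total

* Partial eta-squared: SS_effect / (SS_effect + SS_residual)
scalar eta_smoke = ss_smoke/(ss_smoke + ss_resid)
scalar eta_race  = ss_race /(ss_race  + ss_resid)

* Omega-squared under the assumed expression (smoke has 1 df, race has 2)
scalar om_smoke = eta_smoke - (1/df_resid)*(1 - eta_smoke)
scalar om_race  = eta_race  - (2/df_resid)*(1 - eta_race)

display %9.7f eta_model
display %9.7f eta_smoke
display %9.7f eta_race
display %9.7f om_smoke
display %9.7f om_race
```

The displayed numbers can be compared digit by digit with the Eta-Squared and Omega-Squared columns; the confidence intervals are not reproduced by this arithmetic.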
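Denote the previously estimated coefficient vector by $b$ and its estimated variance matrix by $V$. predict works by recalling various aspects of the model, such as $b$, and combining that information with the data currently in memory. Let $x_j$ be the jth observation currently in memory, and let $s^2$ be the mean squared error of the regression.

If the user specified weights in regress, then $X'X$ in the following formulas is replaced by $X'DX$, where $D$ is defined in Coefficient estimation and ANOVA table under Methods and formulas in [R] regress.

Let $V = s^2 (X'X)^{-1}$. Let $k$ be the number of independent variables including the intercept, if any, and let $y_j$ be the observed value of the dependent variable.

The predicted value (xb option) is defined as $\hat{y}_j = x_j b$.

Let $\ell_j$ represent a lower bound for an observation j and $u_j$ represent an upper bound. The probability that $y_j \mid x_j$ would be observed in the interval $(\ell_j, u_j)$, the pr($\ell$,$u$) option, is

$$P(\ell_j, u_j) = \Pr(\ell_j < x_j b + e_j < u_j) = \Phi\!\left(\frac{u_j - \hat{y}_j}{s}\right) - \Phi\!\left(\frac{\ell_j - \hat{y}_j}{s}\right)$$

where for the pr($\ell$,$u$), e($\ell$,$u$), and ystar($\ell$,$u$) options, $\ell_j$ and $u_j$ can be anywhere in the range $(-\infty, +\infty)$.

The option e($\ell$,$u$) computes the expected value of $y_j \mid x_j$ conditional on $y_j \mid x_j$ being in the interval $(\ell_j, u_j)$, that is, when $y_j \mid x_j$ is truncated. It can be expressed as

$$E(\ell_j, u_j) = E(x_j b + e_j \mid \ell_j < x_j b + e_j < u_j) = \hat{y}_j - s\,\frac{\phi\!\left(\frac{u_j - \hat{y}_j}{s}\right) - \phi\!\left(\frac{\ell_j - \hat{y}_j}{s}\right)}{\Phi\!\left(\frac{u_j - \hat{y}_j}{s}\right) - \Phi\!\left(\frac{\ell_j - \hat{y}_j}{s}\right)}$$

where $\phi$ is the normal density and $\Phi$ is the cumulative normal.

You can also compute ystar($\ell$,$u$), the expected value of $y_j \mid x_j$, where $y_j$ is assumed censored at $\ell_j$ and $u_j$:

$$y_j^* = \begin{cases} \ell_j & \text{if } x_j b + e_j \le \ell_j \\ x_j b + e_j & \text{if } \ell_j < x_j b + e_j < u_j \\ u_j & \text{if } x_j b + e_j \ge u_j \end{cases}$$

This computation can be expressed in several ways, but the most intuitive formulation involves a combination of the two statistics just defined:

$$y_j^* = P(-\infty, \ell_j)\,\ell_j + P(\ell_j, u_j)\,E(\ell_j, u_j) + P(u_j, +\infty)\,u_j$$

A diagonal element of the projection matrix (hat) or (leverage) is given by

$$h_j = x_j (X'X)^{-1} x_j'$$

The standard error of the prediction (the stdp option) is defined as $s_{p_j} = \sqrt{x_j V x_j'}$ and can also be written as $s_{p_j} = s\sqrt{h_j}$.

The standard error of the forecast (stdf) is defined as $s_{f_j} = s\sqrt{1 + h_j}$.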
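These closed forms are easy to confirm against what predict returns. The sketch below assumes the auto dataset shipped with Stata (sysuse auto) and the mpg model from example 1. It rebuilds stdp, stdf, and pr(20,30) from the leverage, the fitted values, and the root MSE, which stands in for $s$; the interval bounds 20 and 30 and the variable names are chosen only for illustration.

```stata
* Sketch: stdp, stdf, and pr() rebuilt from the formulas above
sysuse auto, clear
quietly regress mpg weight foreign

predict double yhat,  xb
predict double h,     leverage
predict double sp,    stdp
predict double sf,    stdf
predict double p2030, pr(20,30)

* s * sqrt(h_j) and s * sqrt(1 + h_j), with s = e(rmse)
generate double sp_hand = e(rmse)*sqrt(h)
generate double sf_hand = e(rmse)*sqrt(1 + h)

* Phi((u - yhat)/s) - Phi((l - yhat)/s) with l = 20 and u = 30
generate double p_hand = normal((30 - yhat)/e(rmse)) - normal((20 - yhat)/e(rmse))

* Each pair of columns should have identical summary statistics
summarize sp sp_hand sf sf_hand p2030 p_hand
```

Because $1 + h_j > h_j$ for every observation, the comparison also makes concrete why stdf always exceeds stdp.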
Special-interest postestimation commands

The omitted-variable test (Ramsey 1969) reported by estat ovtest fits the regression $y_i = x_i b + z_i t + u_i$ and then performs a standard F test of $t = 0$. The default test uses $z_i = (\hat{y}_i^2, \hat{y}_i^3, \hat{y}_i^4)$. If rhs is specified, $z_i = (x_{1i}^2, x_{1i}^3, x_{1i}^4, x_{2i}^2, \ldots, x_{mi}^4)$. In either case, the variables are normalized to have minimum 0 and maximum 1 before powers are calculated.

The test for heteroskedasticity (Breusch and Pagan 1979; Cook and Weisberg 1983) models $\mathrm{Var}(e_i) = \sigma^2 \exp(zt)$, where $z$ is a variable list specified by the user, the list of right-hand-side variables, or the fitted values $x\hat{\beta}$. The test is of $t = 0$. Mechanically, estat hettest fits the augmented regression $\hat{e}_i^2 / \hat{\sigma}^2 = a + z_i t + v_i$.

The original Breusch-Pagan/Cook-Weisberg version of the test assumes that the $e_i$ are normally distributed under the null hypothesis, which implies that the score test statistic $S$ is equal to the model sum of squares from the augmented regression divided by 2. Under the null hypothesis, $S$ has the $\chi^2$ distribution with $m$ degrees of freedom, where $m$ is the number of columns of $z$.

Koenker (1981) derived a score test of the null hypothesis that $t = 0$ under the assumption that the $e_i$ are independent and identically distributed (i.i.d.). Koenker showed that $S = N R^2$ has a large-sample $\chi^2$ distribution with $m$ degrees of freedom, where $N$ is the number of observations, $R^2$ is the R-squared in the augmented regression, and $m$ is the number of columns of $z$. estat hettest, iid produces this version of the test.

Wooldridge (2013) showed that an F test of $t = 0$ in the augmented regression can also be used under the assumption that the $e_i$ are i.i.d. estat hettest, fstat produces this version of the test.

Szroeter's class of tests for homoskedasticity against the alternative that the residual variance increases in some variable $x$ is defined in terms of

$$H = \frac{\sum_{i=1}^{n} h(x_i)\, e_i^2}{\sum_{i=1}^{n} e_i^2}$$

where $h(x)$ is some weight function that increases in $x$ (Szroeter 1978). $H$ is a weighted average of the $h(x)$, with the squared residuals serving as weights. Under homoskedasticity, $H$ should be approximately equal to the unweighted average of $h(x)$. Large values of $H$ suggest that $e_i^2$ tends to be large where $h(x)$ is large; that is, the variance indeed increases in $x$, whereas small values of $H$ suggest that the variance actually decreases in $x$. estat szroeter uses $h(x_i) = \mathrm{rank}(x_i \text{ in } x_1 \ldots x_n)$; see Judge et al. [1985, 452] for details. estat szroeter displays a normalized version of $H$,

$$Q = \sqrt{\frac{6n}{n^2 - 1}}\; H$$

which is approximately $N(0, 1)$ distributed under the null (homoskedasticity).

estat hettest and estat szroeter provide adjustments of p-values for multiple testing. The supported methods are described in [R] test.

estat imtest performs the information matrix test for the regression model, as well as an orthogonal decomposition into tests for heteroskedasticity $\delta_1$, nonnormal skewness $\delta_2$, and nonnormal kurtosis $\delta_3$ (Cameron and Trivedi 1990; Long and Trivedi 1993). The decomposition is obtained via three auxiliary regressions. Let $e$ be the regression residuals, $\hat{\sigma}^2$ be the maximum likelihood estimate of $\sigma^2$ in the regression, $n$ be the number of observations, $X$ be the set of $k$ variables specified with estat imtest, and $R^2_{un}$ be the uncentered $R^2$ from a regression. $\delta_1$ is obtained as $n R^2_{un}$ from a regression of $e^2 - \hat{\sigma}^2$ on the cross products of the variables in $X$. $\delta_2$ is computed as $n R^2_{un}$ from a regression of $e^3 - 3\hat{\sigma}^2 e$ on $X$. Finally, $\delta_3$ is obtained as $n R^2_{un}$ from a regression of $e^4 - 6\hat{\sigma}^2 e^2 - 3\hat{\sigma}^4$ ...

References

Kutner, M. H., C. J. Nachtsheim, and J. Neter. 2004. Applied Linear Regression Models. 4th ed. New York: McGraw-Hill/Irwin.

Lindsey, C., and S. J. Sheather. 2010a. Optimal power transformation via inverse response plots. Stata Journal 10: 200-214.
Lindsey, C., and S. J. Sheather. 2010b. Model fit assessment via marginal model plots. Stata Journal 10: 215-225.

Long, J. S., and J. Freese. 2000. sg145: Scalar measures of fit for regression models. Stata Technical Bulletin 56: 34-40. Reprinted in Stata Technical Bulletin Reprints, vol. 10, pp. 197-205. College Station, TX: Stata Press.

Long, J. S., and P. K. Trivedi. 1993. Some specification tests for the linear regression model. Sociological Methods and Research 21: 161-204. Reprinted in Testing Structural Equation Models, ed. K. A. Bollen and J. S. Long, pp. 66-110. Newbury Park, CA: Sage.

Peracchi, F. 2001. Econometrics. Chichester, UK: Wiley.

Ramsey, J. B. 1969. Tests for specification errors in classical linear least-squares regression analysis. Journal of the Royal Statistical Society, Series B 31: 350-371.

Ramsey, J. B., and P. Schmidt. 1976. Some further results on the use of OLS and BLUS residuals in specification error tests. Journal of the American Statistical Association 71: 389-390.

Rousseeuw, P. J., and A. M. Leroy. 1987. Robust Regression and Outlier Detection. New York: Wiley.

Smithson, M. 2001. Correct confidence intervals for various regression effect sizes and parameters: The importance of noncentral distributions in computing intervals. Educational and Psychological Measurement 61: 605-632.

Szroeter, J. 1978. A class of parametric tests for heteroscedasticity in linear econometric models. Econometrica 46: 1311-1327.

Thompson, B. 2006. Foundations of Behavioral Statistics: An Insight-Based Approach. New York: Guilford Press.

Velleman, P. F. 1986. Comment [on Chatterjee and Hadi 1986]. Statistical Science 1: 412-413.

Velleman, P. F., and R. E. Welsch. 1981. Efficient computing of regression diagnostics. American Statistician 35: 234-242.

Weesie, J. 2001. sg161: Analysis of the turning point of a quadratic specification. Stata Technical Bulletin 60: 18-20. Reprinted in Stata Technical Bulletin Reprints, vol. 10, pp. 273-277. College Station, TX: Stata Press.

Weisberg, S. 2005. Applied Linear Regression. 3rd ed. New York: Wiley.

Welsch, R. E. 1982. Influence functions and regression diagnostics. In Modern Data Analysis, ed. R. L. Launer and A. F. Siegel, 149-169. New York: Academic Press.

Welsch, R. E. 1986. Comment [on Chatterjee and Hadi 1986]. Statistical Science 1: 403-405.

Welsch, R. E., and E. Kuh. 1977. Linear Regression Diagnostics. Technical Report 923-77, Massachusetts Institute of Technology, Cambridge, MA.

White, H. L., Jr. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817-838.

Wooldridge, J. M. 2013. Introductory Econometrics: A Modern Approach. 5th ed. Mason, OH: South-Western.

Also see

    [R] regress: Linear regression
    [R] regress postestimation diagnostic plots: Postestimation plots for regress
    [R] regress postestimation time series: Postestimation tools for regress with time series
    [U] 20 Estimation and postestimation commands