The Stata Journal

Editor: Joseph Newton, Department of Statistics, Texas A&M University, College Station, Texas 77843; 979-845-3142; FAX 979-845-3144; jnewton@stata-journal.com
Editor: Nicholas J. Cox, Department of Geography, Durham University, South Road, Durham City DH1 3LE, UK; n.j.cox@stata-journal.com
Associate Editors: Christopher F. Baum, Boston College; Rino Bellocco

The Stata Journal (2007) 7, Number 2, pp. 268–271
© 2007 StataCorp LP, dm0031

Stata tip 45: Getting those data into shape

Christopher F. Baum
Department of Economics
Boston College
Chestnut Hill, MA 02467
baum@bc.edu

Nicholas J. Cox
Department of Geography
Durham University
Durham City, UK
n.j.cox@durham.ac.uk

Are your data in shape? That is, are they in the structure that you need to conduct the analysis you have in mind? Data sources often provide the data in a structure that is suitable for presentation but clumsy for statistical analysis. One of the key data management tools that Stata provides is reshape; see [D] reshape. If you need to modify the structure of your data, you should be familiar with reshape and its two functions: reshape wide and reshape long. In this tip, we discuss how two applications of reshape may be the solution to some knotty data management problems.

As a first example, consider this question posted on Statalist by an individual who has a dataset in the wide form:

    country    tradeflow    Yr1990    Yr1991
    Armenia    imports         105       120
    Armenia    exports          90       100
    Bolivia    imports         200       230
    Bolivia    exports          80       115
    Colombia   imports         100       105
    Colombia   exports          70        71

He would like to reshape the data into long form:

    country    year    imports    exports
    Armenia    1990        105         90
    Armenia    1991        120        100
    Bolivia    1990        200         80
    Bolivia    1991        230        115
    Colombia   1990        100         70
    Colombia   1991        105         71

We must exchange the roles of years and trade flows in the original data to arrive at the desired structure, suitable for analysis as xt data. This exchange can be handled by two successive applications of reshape:

    . reshape long Yr, i(country tradeflow)
    (note: j = 1990 1991)

    Data                               wide   ->   long
    Number of obs.                        6   ->      12
    Number of variables                   4   ->       4
    j variable (2 values)                     ->   _j
    xij variables:
                              Yr1990 Yr1991   ->   Yr

This transformation swings the data into long form with each observation identified by country, tradeflow, and the new variable _j, taking on the values of year. We now perform reshape wide to make imports and exports into separate variables:

    . rename _j year
    . reshape wide Yr, i(country year) j(tradeflow) string
    (note: j = exports imports)

    Data                               long   ->   wide
    Number of obs.                       12   ->       6
    Number of variables                   4   ->       4
    j variable (2 values)         tradeflow   ->   (dropped)
    xij variables:
                                         Yr   ->   Yrexports Yrimports

If we transform the data to wide form once again, the i() option contains country and year, as those are the desired identifiers on each observation of the target dataset. We specify that tradeflow is the j() variable for reshape, indicating that it is a string variable. The data now have the desired structure. Although we have illustrated this double-reshape transformation with only a few countries, years, and variables, the technique generalizes to any number of each.

As a second example of successive applications of reshape, consider the World Bank's World Development Indicators (WDI) dataset.[1] Their extract program generates a comma-separated value (CSV) database extract, readable by Excel or Stata, but the structure of those data hinders analysis as panel data. For a recent year, the header line of the CSV file is

    "Series code","Country Code","Country Name","1960","1961","1962","1963",
    "1964","1965","1966","1967","1968","1969","1970","1971","1972","1973",
    "1974","1975","1976","1977","1978","1979","1980","1981","1982","1983",
    "1984","1985","1986","1987","1988","1989","1990","1991","1992","1993",
    "1994","1995","1996","1997","1998","1999","2000","2001","2002","2003",
    "2004"

[1] See http://econ.worldbank.org.
That is, each row of the CSV file contains a variable and country combination, with the columns representing the elements of the time series.[2] Our target dataset structure is that appropriate for panel-data modeling, with the variables as columns and rows labeled by country and year. Two applications of reshape will again be needed to reach the target format. We first insheet (see [D] insheet) the data and transform the triliteral country code into a numeric code with the country codes as labels:

    . insheet using wdiex.raw, comma names
    . encode countrycode, generate(cc)
    . drop countrycode

We then must address that the time-series variables are named v4-v48, as the header line provided invalid Stata variable names (numeric values) for those columns. We use rename (see [D] rename) to change v4 to d1960, v5 to d1961, and so on:

    forvalues i = 4/48 {
        rename v`i' d`=1956+`i''
    }

We now are ready to carry out the first reshape. We want to identify the rows of the reshaped dataset by both country code (cc) and seriescode, the variable name. The reshape long will transform a fragment of the WDI dataset containing two series and four countries:

    . reshape long d, i(cc seriescode) j(year)
    (note: j = 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972
    > 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987
    > 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002
    > 2003 2004)

    Data                               wide   ->   long
    Number of obs.                        7   ->     315
    Number of variables                  48   ->       5
    j variable (45 values)                    ->   year
    xij variables:
                    d1960 d1961 ... d2004     ->   d

[2] A variation occasionally encountered will resemble this structure, but with periods in reverse chronological order. The solution here can be used to deal with that problem as well.
    . list in 1/15

          cc   seriesc~e   year   countryname          d
      1.  AFG  adjnetsav   1960   Afghanistan          .
      2.  AFG  adjnetsav   1961   Afghanistan          .
      3.  AFG  adjnetsav   1962   Afghanistan          .
      4.  AFG  adjnetsav   1963   Afghanistan          .
      5.  AFG  adjnetsav   1964   Afghanistan          .
      6.  AFG  adjnetsav   1965   Afghanistan          .
      7.  AFG  adjnetsav   1966   Afghanistan          .
      8.  AFG  adjnetsav   1967   Afghanistan          .
      9.  AFG  adjnetsav   1968   Afghanistan          .
     10.  AFG  adjnetsav   1969   Afghanistan          .
     11.  AFG  adjnetsav   1970   Afghanistan   -2.97129
     12.  AFG  adjnetsav   1971   Afghanistan   -5.54518
     13.  AFG  adjnetsav   1972   Afghanistan   -2.40726
     14.  AFG  adjnetsav   1973   Afghanistan   -.188281
     15.  AFG  adjnetsav   1974   Afghanistan    1.39753

The rows of the data are now labeled by year, but one problem remains: all variables for a given country are stacked vertically. To unstack the variables and put them in shape for xtreg (see [XT] xtreg), we must carry out a second reshape that spreads the variables across the columns, specifying cc and year as the i variables and seriescode as the j variable. Since that variable has string content, we use the string option.

    . reshape wide d, i(cc year) j(seriescode) string
    (note: j = adjnetsav adjsavC02)

    Data                               long   ->   wide
    Number of obs.                      315   ->     180
    Number of variables                   5   ->       5
    j variable (2 values)        seriescode   ->   (dropped)
    xij variables:
                                          d   ->   dadjnetsav dadjsavC02

    . order cc countryname
    . tsset cc year
           panel variable:  cc (strongly balanced)
            time variable:  year, 1960 to 2004

After this transformation, the data are now in shape for xt modeling, tabulation, or graphics. As illustrated here, the reshape command can transform even the most inconvenient data structure into the structure needed for your research. It may take more than one application of reshape to get there from here, but it can do the job.
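
For readers who want to try the double reshape without the original files, the first example (the six-row trade-flow table) can be rebuilt by hand. The following is a minimal sketch, not part of the original tip: the string widths in the input statement and the final two rename commands are our own assumptions, added only so that the result carries the variable names shown in the target layout.

```stata
* Sketch only: type in the six-row wide dataset from the first example.
* str8/str7 are assumed widths, just large enough for the values shown.
clear
input str8 country str7 tradeflow Yr1990 Yr1991
"Armenia"  "imports" 105 120
"Armenia"  "exports"  90 100
"Bolivia"  "imports" 200 230
"Bolivia"  "exports"  80 115
"Colombia" "imports" 100 105
"Colombia" "exports"  70  71
end

* Step 1: wide -> long. With no j() option, the year suffix lands in _j.
reshape long Yr, i(country tradeflow)
rename _j year

* Step 2: long -> wide, spreading the string variable tradeflow across columns.
reshape wide Yr, i(country year) j(tradeflow) string

* Optional tidy-up (an addition of ours, not in the tip): drop the Yr prefix
* so the names match the target layout.
rename Yrimports imports
rename Yrexports exports

list, sepby(country)
```

The two reshape commands are the ones used in the first example; only the data entry and the closing renames are new. The same pattern should carry over to the WDI extract once the year columns have been given valid names.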
