Stata as a Data Entry Management Tool

Uploaded On 2018-11-20


Presentation Transcript

Slide1

Stata as a Data Entry Management Tool

Ryan Knight

Innovations for Poverty Action

Stata Conference 2011

Slide2

Why Pay Attention to Data Entry?

It sounds so easy…

type, type, type…

Surveys

Data!

Slide3

…but it is not!

Excellent Opportunities for DISASTER

No one checked data quality. Turns out, there’s no unique ID variable. Lost data.

No one monitored the data entry contractor. Turns out, they copied and pasted data and changed the IDs. Lost data.

The RA didn’t know that append forces the string/numeric type of the master file onto the using file, and deleted the originals. Lost data.

Records existed in multiple datasets and were different. Data lost in the merging process.

And many more!

Slide4

Slide5
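
Two of the failures listed under "DISASTER" (no unique ID variable, and string/numeric type clashes between files) can be caught early with built-in Stata commands. The following is a minimal sketch, not part of the original slides. It assumes the file names and the uniqueid variable used in the worked example later in this deck.

```stata
* Sketch only: checks to run before appending or merging entry files.
* File names and uniqueid are borrowed from the deck's later example.
use "first entry.dta", clear

* Report how many IDs appear more than once (does not stop the do-file)
duplicates report uniqueid

* Stop with an error if uniqueid does not uniquely identify observations
isid uniqueid

* Show storage types in memory and on disk, so that a variable stored as
* string in one file and numeric in the other is visible before any append
describe
describe using "second entry.dta"
```

Running duplicates report before isid matters here: isid aborts on the first failure, so the report would otherwise never be shown.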

Data Entry Quality Control

Use two unique identifiers for every survey

Extensive testing of data entry interface

Double entry

Double entry of first and second entry reconciliation

Independent Audit

Slide6

Managing Double Entry

[Flowchart. Box labels: Questionnaire; 1st Entry; 2nd Entry; Discrepancies; 1st Reconciliation; 2nd Reconciliation; Discrepancies; Final Reconciliation; Final Dataset. Three steps are labeled "Stata".]

Slide7

Generating a List of Discrepancies

cfout [varlist] using filename, id(varname) [options]

Compares dataset in memory to another dataset and outputs a list of discrepancies.

Can ignore differences in punctuation, spacing and case

Substantially faster than looping through observations

Slide8
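
To make concrete what a "list of discrepancies" is, here is a rough sketch that compares a single variable across the two entries using only built-in commands. It is an illustration of the idea, not how cfout is implemented. The file names, uniqueid and region come from the deck's worked example; region_2 is a hypothetical name introduced here.

```stata
* Sketch only: compare one variable across two entry files.
* Run as a single do-file so the temporary file macro stays defined.
use "second entry.dta", clear
keep uniqueid region
rename region region_2          // region_2 is a hypothetical name
tempfile second
save `second'

use "first entry.dta", clear
keep uniqueid region
merge 1:1 uniqueid using `second'

* IDs that were keyed in only one of the two entries
list uniqueid _merge if _merge != 3

* IDs keyed in both entries whose values disagree
* (region must have the same type, string or numeric, in both files)
list uniqueid region region_2 if _merge == 3 & region != region_2
```

Repeating this by hand for every variable is what a dedicated comparison command saves you from.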

Correcting Discrepancies

March down the output from cfout, indicating which value is correct

Slide9

Replacing Discrepancies

readreplace using filename, id(varname)

Reads a 3 column .csv file: ID, question, correct value

And makes all of the replacements in your dataset

Slide10
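
Each row of the corrections file amounts to one replace statement. As a hedged illustration with made-up values (ID 1042, question region, correct value 3), and assuming region is numeric and uniqueid is the ID variable as in the deck's worked example, a single row corresponds to:

```stata
* Hypothetical correction row:  1042, region, 3
* The hand-written equivalent of that one row:
replace region = 3 if uniqueid == 1042
```

readreplace applies one such replacement per row, so the corrections stay in a file that can be reviewed and rerun.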

The whole process

* Load the data
insheet using "raw first entry.csv"
save "first entry.dta", replace
insheet using "raw second entry.csv", clear
save "second entry.dta", replace

* Compare the files
cfout region-no_good_at_all using "first entry.dta", id(uniqueid)

* Make replacements using corrected data
readreplace using "corrected values.csv", id(uniqueid)

Slide11

Other Useful Commands

mergeall merges all of the files in a folder, checking for string/numeric differences and duplicate IDs before merging

cfby calculates the number of discrepancies “by” a variable. Useful for calculating error rates.

Slide12
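
The error-rate idea can be sketched with built-in commands. This is not cfby's implementation. It assumes a dataset in memory with one row per compared value and two hypothetical variables: clerk (who keyed the entry) and mismatch (1 if the two entries disagreed, 0 otherwise).

```stata
* Sketch only: discrepancy rate by data entry clerk.
* clerk and mismatch are hypothetical variables, not from the slides.
preserve
collapse (mean) error_rate = mismatch (count) n_compared = mismatch, by(clerk)
list clerk n_compared error_rate
restore
```

Because mismatch is coded 0/1, its mean within each clerk is that clerk's share of values that disagreed.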

Why Use Stata for Reconciliations Instead of Data Entry Software?

Choose the best data entry software for each project

Independent correction of discrepancies is more accurate than checks against existing values

Synergy with physical workflow management

More control over merging

Reproducibility

Analyze errors and performance over time