Stata and logit recap
Slide1
Stata and logit recap
Slide2
Topics
Introduction to Stata
  Files / directories
  Stata syntax
  Useful commands / functions
Logistic regression analysis with Stata
  Estimation
  Goodness of fit
  Coefficients
  Checking assumptions
Slide3
Overview of Stata commands
Note: we did this interactively for the larger part …
Slide4
Stata file types
.ado   Programs that add commands to Stata
.do    Batch files that execute a set of Stata commands
.dta   Data file in Stata's format
.log   Output saved as plain text by the log using command (you could add .txt as well)
Slide5
The working directory
The working directory is the default directory for any file operations, such as using and saving data, or logging output.
cd "d:\my work\"
Slide6
Saving output to log files
Syntax for the log command:
log using [filename], replace text
To close a log file:
log close
Slide7
Using and saving datasets
Load a Stata dataset:
use d:\myproject\data.dta, clear
Save:
save d:\myproject\data, replace
Using change directory:
cd d:\myproject
use data, clear
save data, replace
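The working-directory, log and use/save commands above fit together as one short session. The following is a minimal sketch, not the course's own do-file: the built-in auto dataset (loaded with sysuse) stands in for your data, Stata's temporary directory stands in for d:\myproject, and the file names demo_session and auto_copy are hypothetical.

```stata
* Sketch under assumptions: sysuse auto replaces your own data file,
* and c(tmpdir) replaces the project directory d:\myproject.
capture log close                     // close any log that is still open
cd "`c(tmpdir)'"                      // make the temporary directory the working directory
pwd                                   // show where files will be read and written
log using demo_session, replace text  // hypothetical log name, plain-text format
sysuse auto, clear                    // load the example dataset shipped with Stata
save auto_copy, replace               // written to the working directory as auto_copy.dta
use auto_copy, clear                  // no path needed: Stata looks in the working directory
describe, short
log close
```

Because the directory is set once with cd, every later use, save and log using can refer to bare file names.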
Slide8
Entering data
Data in other formats
  You can use SPSS to convert data into a format that can be read with Stata. Unfortunately, not the other way around (anymore).
  You can use the infile and insheet commands to import data in ASCII format.
  Direct import and export of Excel files in Stata is possible too.
Entering data by hand (don't do this …)
  Type edit or just click on the Data Editor button.
Slide9
Dofiles
You can create a text file that contains a series of commands. It is the equivalent of SPSS syntax (but way easier to memorize).
Use the do-file editor to work with dofiles.
Slide10
Adding comments in dofiles
// or * denote comments Stata should ignore
Stata ignores whatever follows after /// and treats the next line as a continuation
Example II
Slide11
A recommended template for dofiles
capture log close               // if a log file is open, close it; otherwise disregard
set more off                    // don't pause when output scrolls off the page
cd d:\myproject                 // change directory to your working directory
log using myfile, replace text  // log results to file myfile.log
… here you put the rest of your Stata commands …
log close                       // close the log file
Slide12
Serious data analysis
Ensure replicability: use do + log files
Document your dofiles
What is obvious today, is baffling in six months
Keep a research log
Diary that includes a description of every program you run
Develop a system for naming files
Slide13
Serious data analysis
New variables should be given new names
Use variable labels and notes (I don't like value labels though)
Double check every new variable
ARCHIVE
Slide14
Stata syntax examples
Slide15
Stata syntax example
regress y x1 x2 if x3<20, cluster(x4)
regress = Command. What action do you want performed?
y x1 x2 = Names of variables, files or other objects. On what things is the command performed?
if x3<20 = Qualifier on observations. On which observations should the command be performed?
, cluster(x4) = Options, which appear behind the comma. What special things should be done in executing the command?
Slide16
More examples
tabulate smoking race if agemother>30, row
More elaborate if statements:
sum agemother if smoking==1 & weightmother<100
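The command / variable list / if qualifier / options pattern from the syntax slides can be tried on data that ships with Stata. This is only an illustrative sketch: the auto dataset and its variables (price, mpg, weight, foreign, rep78) are assumptions standing in for y, x1, x2 and the smoking data, and vce(cluster rep78) is used as the current spelling of the cluster() option.

```stata
* Sketch under assumptions: auto.dta replaces the course data.
sysuse auto, clear

* command   variable list       if qualifier     , options
regress     price mpg weight    if mpg < 30      , vce(cluster rep78)

* Two-way table with row percentages, restricted to a subset of observations
tabulate rep78 foreign if price > 5000, row

* Two conditions combined with & (and)
summarize mpg if foreign == 1 & weight < 3000
```

Extra blanks between the parts of a command are harmless, so the regress line is spread out here only to line up with the comment above it.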
Slide17
Elements used for logical statements
Operator   Definition                  Example
==         is equal in value to        if male == 1
!=         not equal in value to       if male != 1
>          greater than                if age > 20
>=         greater than or equal to    if age >= 21
<          less than                   if age < 66
<=         less than or equal to       if age <= 65
&          and                         if age == 21 & male == 1
|          or                          if age <= 21 | age >= 65
Slide18
Missing values
Automatically excluded when Stata fits models (same as in SPSS); they are stored as the largest positive values.
Beware!! The expression "age>65" can thus also include missing values (these are also larger than 65).
To be sure, type: "age>65 & age!=."
Slide19
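The warning on the "Missing values" slide can be checked directly. The sketch below assumes the built-in auto dataset, whose variable rep78 contains some missing values; it plays the role of age in the slide. The !missing() variant is an addition not found in the slides: it also excludes the extended missing codes .a to .z, which a comparison with . alone does not.

```stata
* Sketch under assumptions: rep78 from auto.dta stands in for age.
sysuse auto, clear

count if rep78 > 3                     // missing values pass this test as well
count if rep78 > 3 & rep78 != .        // the check recommended on the slide
count if rep78 > 3 & !missing(rep78)   // same idea; also excludes .a through .z

misstable summarize rep78              // reports how many values are missing
```

The first count should be larger than the other two by exactly the number of missing values that misstable reports.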
Selecting observations
drop [variable list]
keep [variable list]
drop if age<65
Note: they are then gone forever. This is not SPSS's [filter] command.
Slide20
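Since drop and keep on the "Selecting observations" slide remove data from memory for good, one common safeguard is to wrap them in preserve and restore. This is a sketch of that idea, which the slides themselves do not mention; the auto dataset and the price cut-off are assumptions chosen only for illustration.

```stata
* Sketch under assumptions: auto.dta and the 5000 cut-off are illustrative.
sysuse auto, clear
count                       // number of observations before dropping

preserve                    // take a snapshot of the data in memory
drop if price < 5000        // remove observations
keep make price mpg         // remove all other variables
count                       // fewer observations, three variables
restore                     // return to the snapshot

count                       // same number as before preserve
```

Run the block as a whole from a do-file: when a do-file ends, Stata restores preserved data automatically unless restore, not is given.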
Creating new variables
Generating new variables:
generate age2 = age*age
(For more complicated functions, there also exists a command "egen", as we will see later.)
Slide21
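To illustrate the difference between generate and egen mentioned on the "Creating new variables" slide, here is a minimal sketch. It assumes the built-in auto dataset; the new variable names weight2, mean_price and price_dev are hypothetical.

```stata
* Sketch under assumptions: auto.dta and all new variable names are illustrative.
sysuse auto, clear

generate weight2 = weight*weight             // row-by-row arithmetic: generate is enough
egen mean_price = mean(price), by(foreign)   // a statistic computed across observations needs egen
generate price_dev = price - mean_price      // deviation from the group mean

summarize weight2 mean_price price_dev
```

generate evaluates its expression one observation at a time, whereas egen functions such as mean() work across observations, here separately within each value of foreign.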
Useful functions
Function   Definition        Example
+          Addition          gen y = a+b
-          Subtraction       gen y = a-b
/          Division          gen density = population/area
*          Multiplication    gen y = a*b
^          Take to a power   gen y = a^3
ln         Natural log       gen lnwage = ln(wage)
exp        Exponential       gen y = exp(b)
sqrt       Square root       gen agesqrt = sqrt(age)
Slide22
Replace command
replace has the same syntax as generate but is used to change values of a variable that already exists
gen age_dum5 = .
replace age_dum5 = 0 if age < 5
replace age_dum5 = 1 if age >= 5 & age != .
Slide23
Recode
Change values of existing variables
Change 1 to 2 and 3 to 4 in origvar, and call the new variable myvar1:
recode origvar (1=2)(3=4), gen(myvar1)
Change 1's to missings in origvar, and call the new variable myvar2:
recode origvar (1=.), gen(myvar2)
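In line with the earlier advice to double check every new variable, a recode can be verified by cross-tabulating the old variable against the new one. The sketch below assumes the built-in auto dataset, with rep78 in the role of origvar; the names rep3 and rep_no1 are hypothetical.

```stata
* Sketch under assumptions: rep78 from auto.dta stands in for origvar.
sysuse auto, clear

* Collapse five categories into three and store the result in a new variable
recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3), gen(rep3)
tabulate rep78 rep3, missing     // every old value should map to exactly one new value

* Turn 1's into missing values, as in the second slide example
recode rep78 (1 = .), gen(rep_no1)
assert missing(rep_no1) if rep78 == 1   // stops with an error if any 1 survived
```

Using gen() leaves the original variable untouched, so the check remains possible after the recode.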
