
Presentation Transcript

Slide1

Analyzing and visualizing interactions in SAS 9.4

Andy Lin
IDRE Statistical Consulting

Slide2

Background
Regression models the effects of IVs on DVs.
E.g., does the amount of time spent exercising predict weight loss?
We can also model the effect of an IV as modified by another IV, the moderating variable (MV).
E.g., is the effect of exercise time on weight loss modified by the type of exercise?
Effect modification = interaction

Slide3

Background
Interactions are products of IVs, typically entered into the regression along with the IVs themselves.
All we get out of the regression is a coefficient, which is not enough to understand the interaction.
What are the conditional effects?
Simple effects and slopes
Conditional interactions

Slide4

Purpose of seminar
Demonstrate methods to estimate, test, and graph effects within an interaction.
Specifically, we will use PROC PLM to:
Calculate and estimate simple effects
Compare simple effects
Graph simple effects

Slide5

Main effects vs. interaction models
Main effects models: IV effects are constrained to be the same across levels of all other IVs in the model.
The main effect of height is constrained to be the same across sexes, an average of the male and female height effects.

weight = β0 + βs·SEX + βh·HEIGHT

Slide6

Main effects vs. interaction models
Interaction models allow the effect of an IV to vary with levels of another IV.
The interaction term is formed as the product of the 2 IVs.
Now the effect of height may vary between sexes, and the effect of sex may vary at different heights.

weight = β0 + βs·SEX + βh·HEIGHT + βsh·SEX·HEIGHT

Slide7

Simple effects and slopes
From this equation we can derive sex-specific regression equations:
Males (sex=0): weight = β0 + βh·HEIGHT
Females (sex=1): weight = (β0 + βs) + (βh + βsh)·HEIGHT

Slide8

Simple effects and slopes
Each sex has its own height effect:
Males (sex=0): βh
Females (sex=1): βh + βsh
These are the simple slopes of height within each group.
The interaction coefficient is the difference in simple slopes.

Slide9
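The arithmetic linking the interaction coefficient to the two simple slopes can be checked with a short script. Python is used here only as a numeric illustration, and the coefficient values are hypothetical:

```python
# Hypothetical coefficients from weight = b0 + bs*SEX + bh*HEIGHT + bsh*SEX*HEIGHT
b0, bs, bh, bsh = 10.0, -5.0, 0.9, -0.2

# Simple slope of height within each sex
slope_male = bh            # sex = 0
slope_female = bh + bsh    # sex = 1

# The interaction coefficient equals the difference in simple slopes
print(round(slope_male, 2))                 # 0.9
print(round(slope_female, 2))               # 0.7
print(round(slope_female - slope_male, 2))  # -0.2, equals bsh
```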

PROC PLM
We use proc plm for most of our analyses.
Proc plm performs post-estimation analyses and graphing.
It uses an "item store" as input, which contains model information (coefficients and covariance matrices).
The item store is created in other procs, including glm, genmod, logistic, phreg, mixed, glimmix, and more.

Slide10

PROC PLM
Important proc plm statement used in this seminar:
Estimate statement
Forms linear combinations of coefficients and tests them against 0.
Very flexible: the linear combinations can be means, effects, contrasts, etc.
We use it to estimate and compare simple slopes.
Syntax is a bit more difficult.

Slide11

PROC PLM
Important proc plm statements used in this seminar:
Slice statement
Specifically analyzes simple effects.
Very simple syntax.
Lsmestimate statement
Compares estimated marginal means, i.e., calculates simple effects.
More versatile than slice.

Slide12

PROC PLM
Lsmeans statement
Estimates marginal means and can calculate differences between them.
Effectplot statement
Plots predicted values of the outcome across the range of values of 1 or more predictors.
Can visualize interactions.
Many types of plots.

Slide13

WHY PROC PLM
Many of these statements are found in the regression procs, so why use PROC PLM?
We do not have to rerun the model as we run code for the interaction analysis.
These statements sometimes have more functionality in PROC PLM.

Slide14

Dataset used in seminar
Study of average weekly weight loss achieved by subjects in 3 exercise programs; 900 subjects.
Important variables:
Loss - continuous, normal outcome - average weekly weight loss
Hours - continuous predictor - average weekly hours of exercise
Effort - continuous predictor - average weekly rating of exertion when exercising, ranging from 0 to 50

Slide15

Dataset used in the seminar
Important variables, cont.:
Prog - 3-category predictor - which exercise program the subject followed, 1=jogging, 2=swimming, 3=reading (control)
Female - binary predictor - gender, 0=male, 1=female
Satisfied - binary outcome - subject's overall satisfaction with weight loss due to participation in the exercise program, 0=unsatisfied, 1=satisfied

Slide16

Continuous-by-continuous: the model
We first model the interaction of 2 continuous IVs.
The effect of a continuous IV on the outcome is called a slope, which expresses the change in the outcome per unit increase in the IV.
With the interaction of 2 continuous variables, the slope of each IV is allowed to vary with the other IV.
These are the simple slopes.

Slide17

Continuous-by-continuous: the model
Let us look at a model where Y is predicted by continuous X, continuous Z, and their interaction:

Y = β0 + βx·X + βz·Z + βxz·X·Z

Be careful when interpreting βx and βz: they are simple effects (the effect when the interacting variable = 0), not main effects.

Slide18

Continuous-by-continuous: the model
The coefficient βxz is interpreted as the change in the simple slope of X per unit increase in Z.
Equation for the simple slope of X:

slope of X = βx + βxz·Z

Slide19
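The simple slope formula above can be evaluated directly at chosen values of the moderator. A minimal Python sketch with hypothetical coefficients:

```python
# Simple slope of X at a given Z: slope_x(z) = bx + bxz*z
bx, bxz = 2.0, 0.5   # hypothetical coefficient values

def simple_slope_x(z):
    """Slope of X conditional on the moderator value z."""
    return bx + bxz * z

# The slope of X changes by bxz per unit increase in Z
print(simple_slope_x(0))   # 2.0  (bx alone: the effect of X when Z = 0)
print(simple_slope_x(1))   # 2.5
print(simple_slope_x(2))   # 3.0
```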

Continuous-by-continuous: example model
We regress loss on hours, effort, and their interaction.
Is the effect of hours modified by the effort that the subject exerts?
And the converse: is the effect of effort modified by hours?

Slide20

Continuous-by-continuous: example model

proc glm data=exercise;
  model loss = hours|effort / solution;
  store contcont;
run;

The "|" requests main effects and interactions.
solution requests the table of regression coefficients.
store contcont creates an item store of the model for proc plm.

Slide21

Continuous-by-continuous: example model
The interaction is significant.
Remember that the hours and effort terms are simple slopes.

Slide22

Continuous-by-continuous: calculating simple slopes
The estimate statement is used to form linear combinations of regression coefficients, including simple slopes (and effects).
Very flexible.
Understanding the regression equation is very helpful in coding estimate statements.

Slide23

Estimate statement syntax
estimate 'label' coefficient values / e;
E.g., to estimate the expected loss when hours=2 and effort=30:

proc plm restore=contcont;
  estimate 'pred loss, hours=2, effort=30'
    intercept 1 hours 2 effort 30 hours*effort 60 / e;
run;

The regression coefficients are multiplied by their values and summed to form the estimate, which is tested against 0.

Slide24
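The "multiply and sum" described above is just a dot product of the coefficient vector and the supplied values. A Python illustration with hypothetical coefficients (note hours*effort = 2*30 = 60):

```python
# Hypothetical coefficients in order: intercept, hours, effort, hours*effort
coefs  = [1.5, 2.0, 0.1, 0.05]
# Values supplied in the estimate statement for hours=2, effort=30
values = [1.0, 2.0, 30.0, 60.0]

# The estimate is the dot product of coefficients and values
estimate = sum(c * v for c, v in zip(coefs, values))
print(round(estimate, 3))   # 11.5
```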

We see that the values are correct, and a test against 0 (not interesting here).

Slide25

Continuous-by-continuous: calculating simple slopesLet’s revisit the formula for the simple slope of X moderated by ZIn the estimate statement, we will put a 1 after

β

x

and the value of z after

β

zx

In our model, X = hours and Z=effortSlide26

Continuous-by-continuous: calculating simple slopes
What values of effort should we choose at which to evaluate the simple slopes of hours?
Two common choices:
Substantively important values (education=12yrs, BMI=18, temperature=98.6, etc.)
Data-driven values (mean-sd, mean, mean+sd)
There are no a priori important values of effort, so we choose (mean-sd, mean, mean+sd) = (24.52, 29.66, 34.8).

Slide27

Continuous-by-continuous: calculating simple slopes

proc plm restore=contcont;
  estimate
    'hours, effort=mean-sd' hours 1 hours*effort 24.52,
    'hours, effort=mean'    hours 1 hours*effort 29.66,
    'hours, effort=mean+sd' hours 1 hours*effort 34.8 / e;
run;

Slide28

Continuous-by-continuous: calculating simple slopes
We might be interested in whether those simple slopes are different, but we don't need to test it. Why?
If the moderator is continuous and the interaction is significant, then the simple slopes will always be different.
We demonstrate a difference to show this.

Slide29

Continuous-by-continuous: calculating simple slopes
To get the difference between simple slopes, take the difference between the values across coefficients in the estimate statement:

  hours 1 hours*effort 29.66
- hours 1 hours*effort 24.52
= hours 0 hours*effort 5.14

Slide30

Continuous-by-continuous: calculating simple slopes
Coefficients with 0 values can be omitted:

proc plm restore=contcont;
  estimate 'diff slopes, mean+sd - mean' hours*effort 5.14;
run;

Same t-value and p-value as the interaction coefficient.

Slide31

Continuous-by-continuous: graphing simple slopes
We use the effectplot statement in proc plm to plot the predicted outcome across a range of values of the predictors.
We will plot across the ranges of 2 predictors to depict an interaction.

Slide32

Simple slopes as contour plots

proc plm source=contcont;
  effectplot contour (x=hours y=effort);
run;

Slide33

Simple slopes as contour plots
Contour plots are uncommon.
It is nice that both continuous variables are represented continuously.
Simple slopes of hours are horizontal lines across the graph.
The more the color changes, the steeper the slope.

Slide34

Simple slopes as a fit plot

proc plm source=contcont;
  effectplot fit (x=hours) / at(effort=24.52 29.66 34.8);
run;

Effort will not be represented continuously, so we must specify the values we want.
A separate graph will be plotted for each value of effort.

Slide35

Simple slopes as a fit plot
More easily understood. But why not all 3 on one graph?

Slide36

Creating a custom graph through scoring
We can make the graph ourselves by getting predicted loss values across a range of hours at the 3 selected effort values (24.52, 29.66, 34.8) by:
Creating a dataset of hours and effort values at which to predict the outcome loss
Using the score statement in proc plm to predict the outcome and its 95% confidence interval
Using the scored dataset in proc sgplot to create a plot

Slide37

Creating a custom graph through scoring

data scoredata;
  do effort = 24.52, 29.66, 34.8;
    do hours = 0 to 4 by 0.1;
      output;
    end;
  end;
run;

proc plm source=contcont;
  score data=scoredata out=plotdata predicted=pred lclm=lower uclm=upper;
run;

proc sgplot data=plotdata;
  band x=hours upper=upper lower=lower / group=effort transparency=0.5;
  series x=hours y=pred / group=effort;
  yaxis label="predicted loss";
run;

Slide38

Creating a custom graph through scoring
Purty!

Slide39

Quadratic effect: the model
A special case of a continuous-by-continuous interaction: the interaction of an IV with itself.
Allows the (linear) effect of the IV to vary depending on the level of the IV itself.
Models a curvilinear relationship between the DV and IV.

Slide40

Quadratic effect: the model
The regression equation with linear and quadratic effects of continuous predictor X:

Y = β0 + βx·X + βxx·X²

βx is still interpreted as the slope of X when X=0.
The interpretation of βxx is slightly different: it represents ½ the change in the slope of X when X increases by 1 unit.

Slide41

Quadratic effect: the model
To get the formula for the simple slope of X, we must use the partial derivative:

slope of X = βx + 2βxx·X

Here we see that the slope of X changes by 2βxx per unit increase in X.

Slide42
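The derivative-based slope formula can be evaluated at several values of X to see the diminishing-returns pattern. A Python sketch with hypothetical coefficients (a negative βxx gives the inverted U shape):

```python
# For y = b0 + bx*x + bxx*x**2, the simple slope at x is bx + 2*bxx*x
b0, bx, bxx = 0.5, 4.0, -1.0   # hypothetical; negative bxx gives an inverted U

def slope_at(x):
    """Instantaneous slope of x at a given value of x (partial derivative)."""
    return bx + 2 * bxx * x

print(slope_at(1.5))  # 1.0  (still positive: more hours still helps)
print(slope_at(2.0))  # 0.0  (the slope flattens at the vertex)
print(slope_at(2.5))  # -1.0 (diminishing returns turn negative)
```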

Quadratic effect: example model
We regress loss on the linear and quadratic effects of hours.

proc glm data=exercise order=internal;
  model loss = hours|hours / solution;
  store quad;
run;

Slide43

Quadratic effect: example model
The quadratic effect is significant.
The negative sign indicates that the slope becomes more negative as hours increases (an inverted U-shaped curve): diminishing returns on increasing hours.

Slide44

Quadratic effect: calculating simple slopes
We construct estimate statements for simple slopes in the same way as before.
BUT, we must be careful to multiply the value after the quadratic effect by 2: we will put a 1 after βx and the value of 2·x after βxx.
There are no a priori important values of hours, so we choose mean=2, mean+sd=2.5, and mean-sd=1.5.

Slide45

Quadratic effect: calculating simple slopes

proc plm restore=quad;
  estimate
    'hours, hours=mean-sd (1.5)' hours 1 hours*hours 3,
    'hours, hours=mean (2)'      hours 1 hours*hours 4,
    'hours, hours=mean+sd (2.5)' hours 1 hours*hours 5 / e;
run;

Slopes decrease as hours increase, eventually becoming non-significant.

Slide46

Quadratic effect: comparing simple slopes
We do not need to compare them: the significance is always the same as that of the interaction coefficient.

Slide47

Quadratic effect: graphing the quadratic effect
The "fit" type of effectplot is made for plotting the outcome vs. a single continuous predictor.

proc plm restore=quad;
  effectplot fit (x=hours);
run;

Slide48

Quadratic effect: graphing the quadratic effect
The diminishing returns are apparent.
Too many hours of exercise may lead to weight gain.

Slide49

Continuous-by-categorical: the model
We can also estimate the simple slopes in a continuous-by-categorical interaction.
We will estimate the slope of the continuous variable within each category of the categorical variable.
We could also look at the simple effects of the categorical variable across levels of the continuous variable.
First, how do categorical variables enter regression models?

Slide50

Categorical predictors and dummy variables
A categorical predictor with k categories can be represented by k dummy variables.
Each dummy codes for membership in a category, where 0=non-membership and 1=membership.
However, typically only k-1 dummies are entered into the regression model.
Each dummy is a linear combination of all the other dummies (collinearity), and the regression model cannot estimate a coefficient for a collinear predictor.

Slide51
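The k-1 dummy-coding scheme is easy to sketch outside SAS. A minimal Python illustration (the last category becomes the all-zero reference, matching SAS's default):

```python
def dummies(values, categories):
    """Dummy-code a categorical variable, keeping k-1 columns.
    The last category is dropped and becomes the all-zero reference."""
    return [[1 if v == c else 0 for c in categories[:-1]] for v in values]

# prog: 1=jogging, 2=swimming, 3=reading (reference)
prog = [1, 2, 3, 2]
print(dummies(prog, [1, 2, 3]))
# [[1, 0], [0, 1], [0, 0], [0, 1]]  -- prog=3 rows are all zeros
```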

Categorical predictors and dummy variables
The omitted category is known as the reference category.
All effects of a categorical variable in the regression table are comparisons with the reference group.
SAS by default will use the last category as the reference.

Slide52

Categorical predictors and dummy variables

Slide53

Interaction of dummy variables and a continuous variable
To interact the dummy variables with a continuous predictor, multiply each one by the continuous variable.
Any interaction involving an omitted dummy will be omitted as well.

Slide54

Continuous-by-categorical: the model
Here is the regression equation for a continuous variable, X, interacted with a 3-category categorical predictor, M:

Y = β0 + βx·X + βm1·M1 + βm2·M2 + βxm1·X·M1 + βxm2·X·M2

βx is the simple slope of X for M=3.
βm1 and βm2 are simple effects of M when X=0.
βxm1 and βxm2 represent differences in the slopes of X when M=1 and M=2 (vs. M=3), and differences in the simple effects of M per unit change in X.

Slide55

Continuous-by-categorical: the model
Formulas for the simple slopes:
M=1: βx + βxm1
M=2: βx + βxm2
M=3: βx

Slide56

Continuous-by-categorical: example model
We regress loss on hours, prog (3-category), and their interaction.

proc glm data=exercise order=internal;
  class prog;
  model loss = hours|prog / solution;
  store catcont;
run;

Put prog on the class statement to declare it categorical.
Use order=internal to order prog by numeric value rather than by formats.

Slide57

Continuous-by-categorical: example model
Notice the 0 coefficients for the reference groups.
The interaction is significant overall.

Slide58

Continuous-by-categorical: calculating simple slopes
Here are the formulas for our simple slopes again.
SAS will accept the first two formulas for estimates of the simple slopes in estimate statements.
But the estimate statement for the slope of X when M=3 REQUIRES the inclusion of the coefficient for the interaction of X and (M=3), even though it is constrained to 0.
We don't normally need to calculate the slope in the reference group, nor compare it to the other slopes, so this is not usually a huge problem.

Slide59

Continuous-by-categorical: calculating simple slopes

proc plm restore=catcont;
  estimate
    'hours slope, prog=1 jogging'  hours 1 hours*prog 1 0 0,
    'hours slope, prog=2 swimming' hours 1 hours*prog 0 1 0,
    'hours slope, prog=3 reading'  hours 1 hours*prog 0 0 1 / e;
run;

Notice the inclusion of the zero coefficient in the estimate of the slope when M=3.

Slide60

Continuous-by-categorical: calculating simple slopes
Increasing hours increases weight loss in the jogging and swimming programs and lessens it in the reading program.
Notice that the last estimate appears in the regression table as the hours coefficient.

Slide61

Potential pitfall
If calculating a simple slope or effect, do not omit interaction coefficients; otherwise, SAS will average over those coefficients.
Let's pretend we forgot to include the 0 interaction coefficient in the estimation of the hours slope when M=3:

proc plm restore=catcont;
  estimate 'hours slope, prog=3 reading (wrong)' hours 1 / e;
run;

Slide62

Potential pitfall
The e option gives us the estimate coefficients.
SAS applied values of .333 to all 3 interaction coefficients, averaging their effects.

Slide63

Continuous-by-categorical: calculating simple slopes
We again take differences in values across coefficients to test differences in simple slopes:

  hours 1 hours*prog 1 0 0
- hours 1 hours*prog 0 1 0
= hours 0 hours*prog 1 -1 0

Slide64
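The subtraction above is elementwise over the value vectors of the two estimate statements. A quick Python check (vector order: hours, hours*prog1, hours*prog2, hours*prog3):

```python
# Value vectors for the two simple-slope estimate statements
slope_prog1 = [1, 1, 0, 0]   # hours 1 hours*prog 1 0 0
slope_prog2 = [1, 0, 1, 0]   # hours 1 hours*prog 0 1 0

# Elementwise subtraction yields the contrast testing the difference
diff = [a - b for a, b in zip(slope_prog1, slope_prog2)]
print(diff)   # [0, 1, -1, 0] -> hours 0 hours*prog 1 -1 0
```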

Continuous-by-categorical: calculating simple slopes

proc plm restore=catcont;
  estimate
    'diff slopes, prog=1 vs prog=2' hours*prog -1 1 0,
    'diff slopes, prog=1 vs prog=3' hours*prog -1 0 1,
    'diff slopes, prog=2 vs prog=3' hours*prog 0 -1 1 / e;
run;

Slopes in prog=1 and prog=2 do not differ.
The other 2 comparisons are regression coefficients.

Slide65

Continuous-by-categorical: graphing slopes
The slicefit type of effectplot plots the outcome against a continuous predictor on the x-axis, with separate lines by another predictor (typically categorical, but it can be continuous).

proc plm source=catcont;
  effectplot slicefit (x=hours sliceby=prog) / clm;
run;

The option clm adds confidence limits.

Slide66

Continuous-by-categorical: graphing slopes
Easy to see the direction of the effects, and that the slopes in jogging and swimming do not differ.

Slide67

Categorical-by-categorical: the model
The interaction of a categorical variable X with 2 categories and M with 3 categories produces 6 interaction dummies.
Any interaction dummy formed by an omitted dummy will be omitted as well.
4 of the 6 will be omitted because of collinearity.

Slide68

Categorical-by-categorical: the model

Slide69

Categorical-by-categorical: the model

Slide70

Categorical-by-categorical: the model
Regression equation modeling the interaction of X and M.
βx is the simple effect of X (X=0 vs. X=1) for M=3.
βm1 and βm2 are simple effects of M when X=1.
βx0m1 and βx0m2 represent differences in the effects of X when M=1 and M=2, or differences in the effects of M when X=0.

Slide71

Think of simple effects as differences in expected means
Simple effects represent differences between the mean outcomes of 2 groups that belong to different categories on one predictor.
For instance, the simple effect of X when M=1 is the difference between the mean outcome when X=0, M=1 and the mean outcome when X=1, M=1.

Slide72

Simple effects expressed as differences in means

Slide73

Categorical-by-categorical: example model

proc glm data=exercise order=internal;
  class female prog;
  model loss = female|prog / solution;
  store catcat;
run;

Slide74

Categorical-by-categorical: example model
The interaction is significant overall.
Lots of omitted coefficients.

Slide75

Categorical-by-categorical: estimating simple effects with the slice statement
The slice statement is designed for simple effect estimation.
Syntax: slice interaction_effect / sliceby= diff
interaction_effect is the interaction to be decomposed.
sliceby= specifies the variable at whose distinct levels the simple effects of the other variable will be estimated.
diff produces numerical estimates of the simple effects, instead of just a test of significance (the default).

Slide76

Categorical-by-categorical: estimating simple effects with the slice statement

proc plm restore=catcat;
  slice female*prog / sliceby=prog   diff adj=bon plots=none nof e means;
  slice female*prog / sliceby=female diff adj=bon plots=none nof e means;
run;

This estimates both sets of simple effects.
Bonferroni adjustment due to multiple comparisons (adj=bon).
No plotting (hard to interpret and slow).
We suppress the somewhat redundant F-test with "nof".
The means of each cell will be output with "means".

Slide77

Categorical-by-categorical: estimating simple effects with the slice statement
All simple effects are significant except males vs. females in the reading program.
So the genders differ in the other 2 programs, and the programs differ within each gender.

Slide78

Estimating simple effects with the lsmestimate statement
The lsmestimate statement combines the lsmeans and estimate statements.
It is used to estimate linear combinations of estimated (marginal) means, from a balanced population.
Simple effects can be estimated through linear combinations of marginal means.

Slide79

Estimating simple effects with the lsmestimate statement
Syntax: lsmestimate effect [value, level_x level_m] ...
effect is an effect made up of only categorical predictors.
value is the value to apply to the mean in the linear combination.
level_x and level_m are the ORDINAL levels of the categorical predictors defining the target mean.
For X=0 and X=1, specify 1 for X=0 and 2 for X=1.

Slide80
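A simple effect built this way is just a signed sum of cell means. A Python sketch with hypothetical cell means, indexed by (female_level, prog_level) using 1-based ordinal levels as in lsmestimate:

```python
# Hypothetical cell means: means[(female_level, prog_level)]
means = {
    (1, 1): 2.5, (1, 2): 2.0, (1, 3): 0.5,   # males   (level 1)
    (2, 1): 1.5, (2, 2): 2.2, (2, 3): 0.6,   # females (level 2)
}

# [1, 1 1] [-1, 2 1] means +1*mean(male, jogging) - 1*mean(female, jogging)
effect = 1 * means[(1, 1)] + (-1) * means[(2, 1)]
print(effect)   # 1.0, the male-female simple effect within jogging
```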

Estimating simple effects with the lsmestimate statement

proc plm restore=catcat;
  lsmestimate female*prog
    'male-female, prog = jogging(1)'  [1, 1 1] [-1, 2 1],
    'male-female, prog = swimming(2)' [1, 1 2] [-1, 2 2],
    'male-female, prog = reading(3)'  [1, 1 3] [-1, 2 3],
    'jogging-reading, female = male(0)'    [1, 1 1] [-1, 1 3],
    'jogging-reading, female = female(1)'  [1, 2 1] [-1, 2 3],
    'swimming-reading, female = male(0)'   [1, 1 2] [-1, 1 3],
    'swimming-reading, female = female(1)' [1, 2 2] [-1, 2 3],
    'jogging-swimming, female = male(0)'   [1, 1 1] [-1, 1 2],
    'jogging-swimming, female = female(1)' [1, 2 1] [-1, 2 2] / e adj=bon;
run;

Slide81

Estimating simple effects with the lsmestimate statement
Same estimates as the slice statement.

Slide82

Comparing simple effects with the lsmestimate statement
Only the lsmestimate statement, and not the slice statement, can compare simple effects.
To compare, place the 2 simple effects on the same row and reverse the values for 1 of them:

  [1, 1 1] [-1, 2 1]
- [1, 1 2] [-1, 2 2]
= [1, 1 1] [-1, 2 1] [-1, 1 2] [1, 2 2]

Slide83

Comparing simple effects with the lsmestimate statement

proc plm restore=catcat;
  lsmestimate prog*female
    'diff m-f, jog-swim'  [1, 1 1] [-1, 2 1] [-1, 1 2] [1, 2 2],
    'diff m-f, jog-read'  [1, 1 1] [-1, 2 1] [-1, 1 3] [1, 2 3],
    'diff m-f, swim-read' [1, 1 2] [-1, 2 2] [-1, 1 3] [1, 2 3],
    'diff jog-read, m - f'  [1, 1 1] [-1, 1 3] [-1, 2 1] [1, 2 3],
    'diff swim-read, m - f' [1, 1 2] [-1, 1 3] [-1, 2 2] [1, 2 3],
    'diff jog-swim, m - f'  [1, 1 1] [-1, 1 2] [-1, 2 1] [1, 2 2] / e adj=bon;
run;

Slide84

Comparing simple effects with the lsmestimate statement
All differences are significant, although only one of them is something we didn't already know.

Slide85

Categorical-by-categorical: graphing simple effects
The interaction type of effectplot is used to plot the outcome vs. two categorical predictors.
The connect option is used to connect the points.

proc plm restore=catcat;
  effectplot interaction (x=female sliceby=prog) / clm connect;
  effectplot interaction (x=prog sliceby=female) / clm connect;
run;

Slide86

Graph of simple gender effects
No effect of gender in the reading program.

Slide87

Graph of simple program effects
The effect of program seems stronger for females.

Slide88

3-way interactions: categorical-by-categorical-by-continuous
The interaction of 3 predictors can be decomposed in many more ways than the interaction of 2.
Imagine we interact 2-category X with 3-category M and continuous Z.
How can we decompose this interaction?

Slide89

3-way interactions, categorical-by-categorical-by-continuous: the model
We can estimate the conditional interaction of X and Z across levels of M.
Do X and Z interact at each level of M?
Are the X and Z interactions different across levels of M?
We can further decompose the conditional interactions of X and Z.
What are the simple slopes of Z across X, and the simple effects of X across Z?

Slide90

3-way interactions, categorical-by-categorical-by-continuous: the model
We could then look at the interaction of X and M across levels of Z.
Do X and M interact at various values of Z? Are these interactions different?
Within each conditional interaction of X and M, what are the simple effects of X and M?
We can also look at the interaction of M and Z across X.

Slide91

3-way interactions, categorical-by-categorical-by-continuous: the model
The regression equation can be intimidating.
Single-variable coefficients are still simple effects and slopes (but now for 2 reference levels each).
2-way interaction coefficients are conditional interactions (at the reference level of the 3rd variable).

Slide92

3-way interactions: example model
We regress loss on female (2-category), prog (3-category), and hours (continuous).

proc glm data=exercise order=internal;
  class female prog;
  model loss = female|prog|hours / solution;
  store catcatcon;
run;

Slide93

3-way interactions: example model
The 3-way interaction is significant.

Slide94

3-way interactions: example model
Not very easy to interpret!

Slide95

3-way interactions: simple slope focused analysis
Imagine our focus is estimating which groups benefit the most from increasing the weekly number of hours of exercise.
This analysis is focused on the simple slopes of hours.
We approach this section by addressing questions the researcher might ask, starting at the lowest level and building up.

Slide96

What are the simple slopes of Z across levels of X and M?
There are a total of 6 groups made up by X and M, and we can estimate the slope of hours in each.
We use estimate statements again:
Place a 1 after the coefficient for the slope variable by itself (e.g., hours).
Place a 1 after each 2-way interaction coefficient involving the slope variable and either of the 2 factor groups (e.g., hours*(female=0) and hours*(prog=1)).
Place a 1 after the 3-way interaction coefficient involving the slope variable and both of the factor groups (e.g., hours*(female=0, prog=1)).

Slide97
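The recipe above amounts to summing the hours coefficient with every interaction coefficient that involves hours and the chosen group levels. A Python sketch with hypothetical coefficient values:

```python
# Hypothetical coefficients relevant to the slope of hours for (female=0, prog=1)
b = {
    "hours": 2.0,                  # slope at both reference levels
    "hours*female0": 0.3,          # 2-way: hours with female=0
    "hours*prog1": 0.5,            # 2-way: hours with prog=1
    "hours*female0*prog1": -0.1,   # 3-way: hours with both levels
}

# Simple slope of hours in the (female=0, prog=1) group: sum the four terms
slope = (b["hours"] + b["hours*female0"]
         + b["hours*prog1"] + b["hours*female0*prog1"])
print(round(slope, 2))   # 2.7
```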

3-way interaction: estimating simple slopes using the estimate statement

proc plm restore=catcatcon;
  estimate
    'hours slope, male prog=jogging'    hours 1 hours*female 1 0 hours*prog 1 0 0 hours*female*prog 1 0 0 0 0 0,
    'hours slope, male prog=swimming'   hours 1 hours*female 1 0 hours*prog 0 1 0 hours*female*prog 0 1 0 0 0 0,
    'hours slope, male prog=reading'    hours 1 hours*female 1 0 hours*prog 0 0 1 hours*female*prog 0 0 1 0 0 0,
    'hours slope, female prog=jogging'  hours 1 hours*female 0 1 hours*prog 1 0 0 hours*female*prog 0 0 0 1 0 0,
    'hours slope, female prog=swimming' hours 1 hours*female 0 1 hours*prog 0 1 0 hours*female*prog 0 0 0 0 1 0,
    'hours slope, female prog=reading'  hours 1 hours*female 0 1 hours*prog 0 0 1 hours*female*prog 0 0 0 0 0 1
    / e adj=bon;
run;

Slide98

3-way interaction: estimating simple slopes using the estimate statement
Increasing the number of weekly hours of exercise significantly increases weight loss in all groups except those in the reading program, where it decreases weight loss (not significantly for females in the reading program after the Bonferroni adjustment).

Slide99

Are the (X*Z) conditional interactions significant?
We can now compare the simple slopes of hours between genders within each program.
This is a test of whether hours and gender interact within each program.
As always, we test differences in effects by subtracting values across coefficients.

Slide100

Are the (X*Z) conditional interactions significant?

proc plm restore=catcatcon;
  estimate
    'diff hours slope, male-female prog=1' hours*female 1 -1 hours*female*prog 1 0 0 -1 0 0,
    'diff hours slope, male-female prog=2' hours*female 1 -1 hours*female*prog 0 1 0 0 -1 0,
    'diff hours slope, male-female prog=3' hours*female 1 -1 hours*female*prog 0 0 1 0 0 -1
    / e adj=bon;
run;

Males and females benefit differently from increasing the number of hours in the jogging and reading programs.
One of these interactions appears in the regression table. Which one?

Slide101

Are the (X*Z) conditional interactions different?
We can test whether the conditional interactions are different from one another.
Does the way males and females benefit differently from increasing hours of exercise VARY between programs?
Take differences between the conditional interactions.
Notice that only the 3-way interaction coefficient is left.

Slide102

Are the (X*Z) conditional interactions different?

proc plm restore=catcatcon;
  estimate
    'diff diff hours slope, male-female prog=1-prog=2' hours*female*prog 1 -1 0 -1 1 0,
    'diff diff hours slope, male-female prog=1-prog=3' hours*female*prog 1 0 -1 -1 0 1,
    'diff diff hours slope, male-female prog=2-prog=3' hours*female*prog 0 1 -1 0 -1 1 / e;
run;

All of the comparisons are significant. The differential benefit from increasing exercise hours between genders differs between all 3 programs.

Slide103

3-way interaction: graphing simple slopes
We now need to partition our graphs by a third variable.
We can use the plotby= option to plot separate graphs across levels of a variable.

proc plm restore=catcatcon;
  effectplot slicefit (x=hours sliceby=female plotby=prog) / clm;
run;

Slide104

3-way interaction: graphing simple slopes
Easy to see the slopes, the differences between slopes, and the interactions.

Slide105

3-way interaction, simple effects focused analysis
Imagine instead that we are more interested in gender differences across programs and at different hours of weekly exercise.
Similar questions can be posed.

Slide106

What are the simple effects of X across M and Z?
We use lsmestimate statements to estimate the simple effects of female at each level of prog at the mean, mean-sd, and mean+sd of hours.
The "at" option allows us to specify hours.
For this question we could use slice or lsmestimate.

Slide107

Estimating the simple effects of X across M and Z using lsmestimate

proc plm restore=catcatcon;
  lsmestimate female*prog
    'male-female, prog=jogging(1) hours=1.51'  [1, 1 1] [-1, 2 1],
    'male-female, prog=swimming(2) hours=1.51' [1, 1 2] [-1, 2 2],
    'male-female, prog=reading(3) hours=1.51'  [1, 1 3] [-1, 2 3] / e adj=bon at hours=1.51;
  lsmestimate female*prog
    'male-female, prog=jogging(1) hours=2'  [1, 1 1] [-1, 2 1],
    'male-female, prog=swimming(2) hours=2' [1, 1 2] [-1, 2 2],
    'male-female, prog=reading(3) hours=2'  [1, 1 3] [-1, 2 3] / e adj=bon at hours=2;
  lsmestimate female*prog
    'male-female, prog=jogging(1) hours=2.5'  [1, 1 1] [-1, 2 1],
    'male-female, prog=swimming(2) hours=2.5' [1, 1 2] [-1, 2 2],
    'male-female, prog=reading(3) hours=2.5'  [1, 1 3] [-1, 2 3] / e adj=bon at hours=2.5;
run;

Slide108

Estimating the simple effects of X across M and Z using lsmestimate

Slide109

Are the conditional interactions significant?

The overall test of each conditional interaction of female and prog (at a fixed number of hours) involves tests of 2 coefficients (which are differences in simple effects), so it must be tested with a joint F-test

The "joint" option on lsmestimate performs a joint F-test

Slide 110

Are the conditional interactions significant?

proc plm restore=catcatcon;
  lsmestimate female*prog
    'diff male-female, prog=1 - prog=2, hours=1.51' [1, 1 1] [-1, 2 1] [-1, 1 2] [1, 2 2],
    'diff male-female, prog=1 - prog=3, hours=1.51' [1, 1 1] [-1, 2 1] [-1, 1 3] [1, 2 3],
    'diff male-female, prog=2 - prog=3, hours=1.51' [1, 1 2] [-1, 2 2] [-1, 1 3] [1, 2 3] / e at hours=1.51 joint;
  lsmestimate female*prog
    'diff male-female, prog=1 - prog=2, hours=2' [1, 1 1] [-1, 2 1] [-1, 1 2] [1, 2 2],
    'diff male-female, prog=1 - prog=3, hours=2' [1, 1 1] [-1, 2 1] [-1, 1 3] [1, 2 3],
    'diff male-female, prog=2 - prog=3, hours=2' [1, 1 2] [-1, 2 2] [-1, 1 3] [1, 2 3] / e at hours=2 joint;
  lsmestimate female*prog
    'diff male-female, prog=1 - prog=2, hours=2.5' [1, 1 1] [-1, 2 1] [-1, 1 2] [1, 2 2],
    'diff male-female, prog=1 - prog=3, hours=2.5' [1, 1 1] [-1, 2 1] [-1, 1 3] [1, 2 3],
    'diff male-female, prog=2 - prog=3, hours=2.5' [1, 1 2] [-1, 2 2] [-1, 1 3] [1, 2 3] / e at hours=2.5 joint;
run;

Slide 111

Are the conditional interactions significant?

Female and prog significantly interact at hours = 1.51, 2, and 2.5

Slide 112

3-way interaction: graphing simple effects

We add the at() option to an interaction plot

proc plm restore=catcatcon;
  effectplot interaction (x=female sliceby=prog) / at(hours = 1.51 2 2.5) clm connect;
run;

Slide 113

3-way interaction: graphing simple effects

Interaction is more pronounced at lower numbers of hours

Slide 114

Logistic Regression

Binary (0/1) outcome, often defined as success and failure

Models how predictors affect the probability of the outcome

The probability, p, is transformed to the logit in logistic regression

Slide 115

Logit transformation

The logit transforms a probability to the log-odds metric: logit(p) = log(p/(1-p))

The logit can take on any value (instead of being restricted to 0 through 1)

Slide 116
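As a quick illustration (plain Python, independent of SAS), the logit and its inverse:

```python
import math

def logit(p):
    """Transform a probability (0 < p < 1) to the log-odds scale."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Transform a log-odds value back to a probability."""
    return 1 / (1 + math.exp(-x))

print(logit(0.5))                      # 0.0 -- even odds
print(logit(0.01) < 0 < logit(0.99))   # True -- unbounded in both directions
print(inv_logit(logit(0.73)))          # recovers (about) 0.73
```

The inverse transformation is what the ilink option in proc plm applies when reporting predicted probabilities later in the seminar.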

Logistic regression

The logit of p (not p itself) is modeled as having a linear relationship with the predictors: logit(p) = b0 + b1*X

Slide 117

Non-linear relationship between p and predictors

Imagine a simple logit model where we estimate the log odds of p when X=0 and when X=1:

  logit(p | X=0) = b0
  logit(p | X=1) = b0 + b1

The difference between the log odds estimates is:

  logit(p | X=1) - logit(p | X=0) = b1

Remembering our logarithmic identity, log(a) - log(b) = log(a/b), and the definition of odds, odds = p/(1-p):

Slide 118

Non-linear relationship between p and predictors

We substitute and get:

  log( odds(p | X=1) / odds(p | X=0) ) = b1

Which we then exponentiate:

  odds(p | X=1) / odds(p | X=0) = exp(b1)

Slide 119
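The derivation can be checked numerically with hypothetical coefficients (b0 and b1 below are arbitrary, not estimates from the seminar's data): whatever the intercept, the ratio of the odds at X=1 and X=0 equals exp(b1).

```python
import math

# Arbitrary logistic regression coefficients, for illustration only
b0, b1 = -0.8, 0.5

def prob(x):
    """Model-implied probability at predictor value x."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

odds0 = prob(0) / (1 - prob(0))   # odds when X = 0
odds1 = prob(1) / (1 - prob(1))   # odds when X = 1

# The odds ratio depends only on the slope b1, not on b0
print(odds1 / odds0, math.exp(b1))
```

Changing b0 shifts both probabilities but leaves the odds ratio untouched, which is why a single exponentiated coefficient can summarize the effect.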

Odds ratios

Exponentiated logistic regression coefficients are interpreted as odds ratios (ORs):

By what factor are the odds changed per unit increase in the predictor?

Or, what is the percent change in the odds per unit increase in the predictor?

Odds ratios are constant across the range of the predictor; differences in probabilities are not

But ORs can be misleading without knowing the underlying probabilities

Slide 120
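The "factor" and "percent change" readings of an OR are related by a one-line conversion; the ORs below are assumed values, for illustration only:

```python
def pct_change_in_odds(odds_ratio):
    """Percent change in the odds per unit increase in the predictor."""
    return (odds_ratio - 1) * 100

print(pct_change_in_odds(1.25))   # 25.0  (odds increase by 25%)
print(pct_change_in_odds(0.75))   # -25.0 (odds decrease by 25%)
```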

Logistic regression, categorical-by-continuous interaction: example model

We model how the odds (probability) of satisfaction is predicted by hours of exercise, program, and their interaction

We can create an item store in proc logistic for use with proc plm

Slide 121

Logistic regression, categorical-by-continuous interaction: example model

proc logistic data = exercise descending;
  class prog / param=glm order=internal;
  model satisfied = prog|hours / expb;
  store logit;
run;

descending tells SAS to model the probability of 1 instead of 0, the default

param=glm ensures we use dummy coding (rather than effect coding, the default)

expb exponentiates the regression coefficients, although not all are interpreted as odds ratios

Slide 122

Logistic regression, categorical-by-continuous interaction: example model

Interaction is significant

Slide 123

Logistic regression cat-by-cont, calculating and graphing simple ORs

The simple slope of hours in each program yields an odds ratio when exponentiated

We use the oddsratio statement within proc logistic to estimate these simple odds ratios

A nice odds ratio plot is produced by default

Slide 124

Logistic regression cat-by-cont, calculating and graphing simple ORs

proc logistic data = exercise descending;
  class prog / param=glm order=internal;
  model satisfied = prog|hours / expb;
  oddsratio hours / at(prog=all);
  store logit;
run;

The at(prog=all) option requests that the odds ratio for hours be calculated at each level of prog

Slide 125

Logistic regression cat-by-cont, calculating and graphing simple ORs

Increasing weekly hours of exercise increases the odds of satisfaction in the jogging and swimming groups

Slide 126

Simple odds ratios can be compared in estimate statements

This code produces the simple odds ratios in an estimate statement:

proc plm restore=logit;
  estimate 'hours OR, prog=1' hours 1 hours*prog 1 0 0,
           'hours OR, prog=2' hours 1 hours*prog 0 1 0,
           'hours OR, prog=3' hours 1 hours*prog 0 0 1 / e exp cl;
run;

This code compares them:

proc plm restore=logit;
  estimate 'ratio hours OR, prog=1/prog=2' hours*prog 1 -1 0,
           'ratio hours OR, prog=1/prog=3' hours*prog 1 0 -1,
           'ratio hours OR, prog=2/prog=3' hours*prog 0 1 -1 / e exp cl;
run;

Slide 127

Simple odds ratios can be compared in estimate statements

The exponentiated difference between simple slopes (the exponentiated interaction coefficient) yields a ratio of odds ratios:

  OR_jog / OR_swim = ratio of ORs = 4.109/5.079 = .809

Slide 128
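The identity behind the ratio of ORs is easy to verify with the two simple ORs on this slide: dividing the exponentiated slopes equals exponentiating the difference of the slopes.

```python
import math

# Simple odds ratios for hours reported on the slide
or_jog, or_swim = 4.109, 5.079

# Two equivalent computations of the ratio of odds ratios
ratio_direct   = or_jog / or_swim
ratio_via_logs = math.exp(math.log(or_jog) - math.log(or_swim))

print(round(ratio_direct, 3))   # 0.809, matching the slide
```

This is why exponentiating the interaction coefficient (a difference of simple slopes on the logit scale) yields a ratio of odds ratios.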

Predicted probabilities

Odds ratios summarize the effect of a predictor in 1 number, but can be misleading because we don't know the underlying probabilities

E.g. the OR comparing p=.001 to p=.003 is about the same as the OR comparing p=.25 to p=.5 (roughly 3 in both cases)

It is a good idea to get a sense of the probabilities of the outcome across groups

Slide 129
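The slide's example can be checked directly: both pairs of probabilities yield an odds ratio of roughly 3, even though the risk differences (.002 versus .25) are wildly different.

```python
def odds_ratio(p1, p2):
    """Odds ratio comparing the odds at probability p2 to the odds at p1."""
    return (p2 / (1 - p2)) / (p1 / (1 - p1))

or_rare   = odds_ratio(0.001, 0.003)   # tiny baseline probability
or_common = odds_ratio(0.25, 0.50)     # large baseline probability

# Both are about 3, yet .003 - .001 and .50 - .25 differ enormously
print(round(or_rare, 2), round(or_common, 2))
```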

The lsmeans statement for predicted probabilities

The lsmeans statement is used to estimate marginal means

The ilink option allows transformation back to the original response metric (here, probabilities)

The at option allows specification of continuous covariates for estimation of means

Slide 130

The lsmeans statement for predicted probabilities

proc plm restore=logit;
  lsmeans prog / at hours=1.51 ilink plots=none;
  lsmeans prog / at hours=2 ilink plots=none;
  lsmeans prog / at hours=2.5 ilink plots=none;
run;

Slide 131

The lsmeans statement for predicted probabilities

The predicted probabilities are in the column "Mean"

Slide 132

Graphs of predicted probabilities

The effectplot statement by default plots the outcome in its original metric

We can get an idea of the simple effects and simple slopes in the probability metric with 2 effectplot statements

proc plm restore=logit;
  effectplot interaction (x=prog) / at(hours = 1.51 2 2.5) clm;
  effectplot slicefit (x=hours sliceby=prog) / clm;
run;

Slide 133

Graphs of predicted probabilities

Slide 134

Graphs of predicted probabilities

Slide 135

Concluding guidelines

Guidelines for using the estimate statement to estimate simple slopes:

Always put a 1 after the coefficient for the slope variable

If interacted with a continuous IV (not quadratic), put the value of the continuous IV after the interaction coefficient

If interacted with a categorical IV, put a 1 after the relevant interaction dummy

If in a 3-way interaction, make sure to include:
  the coefficient alone
  both 2-way coefficients involving the slope variable and either interactor
  the 3-way coefficient involving all interactors
  Follow the second rule above if the interaction involves a continuous IV (unless both are continuous, in which case apply the product of the 2 continuous interactors)
  Follow the third rule if the interaction involves only dummy variables

To estimate differences, subtract values across coefficients

Use "e" to check values and coefficients

Use "joint" to perform a joint F-test

Use adj= to correct for multiple comparisons

Use exp to exponentiate estimates (for logistic and other non-linear models)

Slide 136

Concluding guidelines

Guidelines for using the lsmestimate statement to estimate simple effects:

Think of simple effects as differences between means

Assign one mean the value 1 and the other -1

Remember to use ordinal values for categorical predictors, not the actual numeric values

To compare simple effects, put two effects on the same row and reverse the values for one of them

Use joint for joint F-tests

Use adj= for multiple comparisons

Slide 137

It’s over!

Thank you for attending!