Slide1
Dani Rosenkrantz, MS, EdS & David Dueber, MA
Managing Measurement Error in Regression Analysis (in Mplus)
April 19, 2018
Applied Psychometric Strategies Lab
Applied Quantitative and Psychometric Series
Slide2
“This, like any other stories worth telling, is all about a girl” and her data
Online survey for parents of LGBT children with measures of:
Cognitive Flexibility (CFIC and CFIA)
Emotional Regulation (DERS)
Parental Sanctification (MGP)
Religious Fundamentalism (RFS)
Parental Acceptance (PA; outcome)
N = 470 complete cases
Slide3
Multiple Regression Model
[Path diagram: CFIC, CFIA, DERS, RFS, and MGP predict PA]
Slide4
Multiple Regression Syntax
ID and item labels
Scale score is the average item score
We use the scale scores in our regression analysis
Defining the regression model
Slide5
Multiple Regression Output
Standardized results
Slide6
Results
The multiple regression analysis revealed that the model explained a significant portion of the variance in PA scores, R² = .182, p < .001.
CFIC, DERS, RFS, and MGP significantly contributed to the model, but CFIA did not.
Slide7
Reviewer #2
Did you check your assumptions??
Slide8
The assumptions of multiple regression
Perfect measurement of predictors
Normality of residuals
Homoscedasticity of residuals
Other assumptions:
No multicollinearity
Random sampling
Independence of observations
Linearity
Slide9
What is perfect measurement?
Perfect Measurement: Observed Score = True Score
Imperfect Measurement: Observed Score = True Score + Error
[Diagram: observed score variance split into true score variance and measurement error variance]
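The decomposition on this slide can be written in classical test theory notation. This is a sketch only: the symbols X, T, E, and ρ_XX' are introduced here and do not appear on the slides, and it assumes the standard condition that true scores and errors are uncorrelated.

```latex
\begin{align}
X &= T + E, \qquad \operatorname{Cov}(T, E) = 0 \\
\operatorname{Var}(X) &= \operatorname{Var}(T) + \operatorname{Var}(E) \\
\rho_{XX'} &= \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)}
            = 1 - \frac{\operatorname{Var}(E)}{\operatorname{Var}(X)}
\end{align}
```

Under this notation, perfect measurement is the special case Var(E) = 0, which gives a reliability of exactly 1.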
Slide10
How do we provide evidence that measurement error is zero?
Scale        CFIC  CFIA  DERS  RFS  MGP  PA
Reliability  .81   .88   .94   .93  .98  .70
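One common way to produce evidence of this kind is an internal-consistency coefficient. The sketch below computes coefficient (Cronbach's) alpha from item scores. The slides do not say which reliability coefficient was used, so alpha is an assumption here, and the function name `cronbach_alpha` and the responses are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for item-score columns (one list of scores per item)."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent: sum across that respondent's items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Hypothetical responses: 3 items answered by 5 respondents (not study data).
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

A value below 1 means some of the observed variance is error variance, so the perfect-measurement assumption does not hold for that scale.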
Slide11
Covariance between two variables measured with error
[Diagram: X and Y each shown as true score variance plus measurement error variance]
Slide12
Covariance between two variables measured with error
[Diagram labels: Cov(X, Y); Measurement Error (X); Measurement Error (Y)]
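A small simulation can illustrate what measurement error does to the relation between two variables. This is a sketch under stated assumptions: errors are independent of the true scores and of each other, and every number below (sample size, slopes, error standard deviations, seed) is made up for illustration.

```python
import random
from math import sqrt


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def corr(a, b):
    """Pearson correlation built from the covariance helper."""
    return cov(a, b) / sqrt(cov(a, a) * cov(b, b))


rng = random.Random(2018)
n = 20000
true_x = [rng.gauss(0, 1) for _ in range(n)]
true_y = [0.6 * t + rng.gauss(0, 0.8) for t in true_x]  # true-score relation
obs_x = [t + rng.gauss(0, 0.5) for t in true_x]         # add measurement error
obs_y = [t + rng.gauss(0, 0.5) for t in true_y]

cov_true, cov_obs = cov(true_x, true_y), cov(obs_x, obs_y)
r_true, r_obs = corr(true_x, true_y), corr(obs_x, obs_y)
print(f"covariance:  true {cov_true:.2f}, observed {cov_obs:.2f}")
print(f"correlation: true {r_true:.2f}, observed {r_obs:.2f}")
```

In this setup the covariance is essentially unchanged because the errors are unrelated to everything else, while the observed variances are inflated, so the observed correlation is pulled toward zero.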
Slide13
Approaches to deal with measurement error
Correct for measurement error:
Corrected Correlation Matrix (Problematic)
Single Indicator Latent Variables (for low N)
Account for measurement error:
Structural Equation Modeling (SEM; best!)
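The corrected-correlation approach is usually based on the classical correction for attenuation. The sketch below shows that formula; the function name `disattenuate` and all input values are hypothetical. The slides do not say why the approach is labeled problematic, so the second call only illustrates one commonly cited difficulty: a corrected value can land outside the admissible range.

```python
from math import sqrt


def disattenuate(r_xy, rel_x, rel_y):
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must be in (0, 1]")
    return r_xy / sqrt(rel_x * rel_y)


# Hypothetical observed correlation and reliabilities (not study values).
r_corrected = disattenuate(0.30, 0.80, 0.70)
print(round(r_corrected, 3))

# With low reliabilities and a high observed correlation, the corrected
# value can exceed 1, which is not a valid correlation.
r_out_of_range = disattenuate(0.80, 0.70, 0.70)
print(round(r_out_of_range, 3))
```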
Slide14
What is a Single Indicator Latent Variable (SILV)?
[Path diagram: latent variable X_L with single indicator X; disturbance D1 carries the measurement error variance]
X_L is the single indicator latent variable
The observed variable is the indicator
The disturbance (error) captures the measurement error variance
Ensures that the variance of X_L is the reliable variance of X
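The variance bookkeeping behind a SILV can be sketched as follows. This assumes the loading of X on X_L is fixed at 1 and that a reliability estimate ρ_XX' is available for X; the symbol ρ_XX' is introduced here and is not on the slide.

```latex
\begin{align}
X &= X_L + D_1 \\
\operatorname{Var}(D_1) &= (1 - \rho_{XX'})\,\operatorname{Var}(X)
  \quad \text{(fixed, not estimated)} \\
\operatorname{Var}(X_L) &= \operatorname{Var}(X) - \operatorname{Var}(D_1)
  = \rho_{XX'}\,\operatorname{Var}(X)
\end{align}
```

Fixing Var(D1) at the unreliable share of the observed variance is what leaves X_L with only the reliable share.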
Slide15
SLR with Observed Variables (Theoretical)
[Path diagram: X predicts Y with coefficient b. The variance of each variable is split into reliable variance and unreliable variance; the unreliable variance is not accounted for.]
This coefficient assumes perfect measurement
Slide16
SLR with Observed Variables (Empirical Example)
[Path diagram: X predicts Y with coefficient b. The variance of each variable is split into reliable variance and unreliable variance; the unreliable variance is not accounted for. One value is labeled in the diagram: = .21]
This coefficient assumes perfect measurement
Slide17
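SLR with SILV
[Path diagram: X is the indicator of X_L and Y is the indicator of Y_L, each with a disturbance D and a labeled variance; X_L predicts Y_L with coefficient b]
This coefficient is corrected for measurement error
Slide18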
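SLR with SILV
[Path diagram: X is the indicator of X_L and Y is the indicator of Y_L, each with a disturbance D; X_L predicts Y_L with coefficient b. The values 0.21 and 0.19 appear beside the variance labels.]
This coefficient is corrected for measurement error
Slide19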
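SILV for full regression model
[Path diagram: each observed scale (CFIC, CFIA, DERS, RFS, MGP, PA) is the single indicator of its own latent variable (CFIC_L, CFIA_L, DERS_L, RFS_L, MGP_L, PA_L), each with a disturbance D; the five latent predictors predict PA_L]
Slide20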
SILV Syntax
Specifying unreliable (residual) variance
Use the latent variables in the regression command
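For a single predictor, the effect of fixing the unreliable variance can be sketched numerically outside Mplus. This is not the Mplus syntax from the slide: it is a method-of-moments illustration that should agree with a just-identified SILV model when only the predictor is corrected. The function name `silv_slope` and all simulated values are assumptions made for illustration.

```python
import random


def var(a):
    """Sample variance."""
    m = sum(a) / len(a)
    return sum((v - m) ** 2 for v in a) / (len(a) - 1)


def cov(a, b):
    """Sample covariance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def silv_slope(x, y, rel_x):
    """Slope of y on latent x after removing x's unreliable variance."""
    error_var = (1 - rel_x) * var(x)  # the value that would be fixed on D
    return cov(x, y) / (var(x) - error_var)


rng = random.Random(470)
n = 20000
true_x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.5 * t + rng.gauss(0, 1) for t in true_x]  # true slope is 0.5
x = [t + rng.gauss(0, 0.5) for t in true_x]      # reliability = 1 / 1.25 = .80

naive = cov(x, y) / var(x)                # assumes perfect measurement
corrected = silv_slope(x, y, rel_x=0.80)  # uses the known reliability
print(f"naive slope {naive:.2f}, corrected slope {corrected:.2f}")
```

The naive slope is shrunk by roughly the reliability of the predictor, and the corrected slope recovers the value used to generate the data. With several correlated predictors the correction acts on the whole covariance matrix, so individual coefficients need not all move in the same direction.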
Slide21
SILV Results (Page 1)
Slide22
SILV Results (Page 2)
Additional Information (ignore)
Slide23
Comparing Results
Coefficient
Predictor  Observed Model  SILV Model
CFIC       .15**           .22*
CFIA       .07             .07
DERS       .11*            .10
RFS        -.42***         -.55***
MGP        .16*            .25**
R²         .18***          .29***
Slide24
What about the assumptions of homoscedasticity and normality of residuals?
What is a residual?
The deviation between the actual y value and the predicted y value
Residuals are assumed to be normally distributed with constant variance across predicted values (homoscedastic)
Slide25
What is a plot of residuals supposed to look like?
Slide26
What do our residuals look like? Heteroscedasticity!
Heteroscedasticity: Residuals are distributed differently for different predicted values of PA
Slide27
What do our residuals look like? Censoring!
Censoring: Because there is a maximum PA score, participants with a higher true PA score cannot receive their true score
Slide28
What’s going on here? Censoring
All of these people got the maximum PA score
Slide29
How does censoring work?
Example context: Scores on a math test
Math ability scores are normally distributed in the population
Slide30
How does censoring work?
The test was too easy! Many people got a perfect score
Slide31
What can we do about a censored outcome variable?
Tobit Regression
Models the censored variable as a latent variable with a cutoff to account for censoring
Data        Model  R²
Uncensored  SLR    .44
Censored    SLR    .36
Censored    Tobit  .46
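The idea of a latent outcome with a cutoff can be sketched with simulated data, in the spirit of the too-easy math test. This is an illustration only: it fits a one-predictor Tobit model by EM, which is a choice made here and not necessarily the estimator Mplus uses, and the names `ols` and `tobit_em` and all simulated values are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def ols(x, y):
    """Intercept and slope of a simple linear regression."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def tobit_em(x, y, ceiling, iterations=100):
    """Right-censored Tobit with one predictor, fitted by EM.

    Scores equal to `ceiling` are treated as censored: the latent
    outcome is only known to be at least `ceiling`.
    """
    n = len(x)
    intercept, slope = ols(x, y)  # start from the naive fit
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    sigma = (rss / n) ** 0.5
    for _ in range(iterations):
        expected, extra_var = [], 0.0
        for a, b in zip(x, y):
            if b < ceiling:  # observed exactly
                expected.append(b)
                continue
            mu = intercept + slope * a
            alpha = (ceiling - mu) / sigma
            tail = max(1.0 - STD_NORMAL.cdf(alpha), 1e-300)
            lam = STD_NORMAL.pdf(alpha) / tail  # inverse Mills ratio
            # E-step: mean and variance of the latent score above the ceiling.
            expected.append(mu + sigma * lam)
            extra_var += sigma ** 2 * (1.0 + alpha * lam - lam ** 2)
        # M-step: refit the line to the expected latent scores.
        intercept, slope = ols(x, expected)
        rss = sum((e - intercept - slope * a) ** 2 for a, e in zip(x, expected))
        sigma = ((rss + extra_var) / n) ** 0.5
    return intercept, slope, sigma


rng = random.Random(31)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
latent = [1.0 * a + rng.gauss(0, 1) for a in x]  # true slope is 1.0
ceiling = 0.5                                    # maximum possible score
censored = [min(v, ceiling) for v in latent]

_, ols_slope = ols(x, censored)
_, tobit_slope, _ = tobit_em(x, censored, ceiling)
print(f"SLR on censored data: {ols_slope:.2f}, Tobit: {tobit_slope:.2f}")
```

Ordinary regression on the censored scores flattens the slope because everyone above the ceiling is recorded at the same value, while the Tobit fit treats those scores as "at least the ceiling" and recovers a slope close to the one used to generate the data.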
[Figure: Tobit Model]
Slide32
Tobit + SILV Regression Syntax
No SILV correction for PA
Declare PA as censored to invoke Tobit
PA is predicted by the SILVs for other variables
Slide33
Tobit + SILV Output
Slide34
Comparing Results
Coefficient
Predictor  Observed Model  SILV Model  Tobit + SILV
CFIC       .15**           .22*        .20**
CFIA       .07             .07         .05
DERS       .11*            .10         .08
RFS        -.42***         -.55***     -.45***
MGP        .16*            .25**       .20**
R²         .18***          .29***      .21***
Slide35
Which approach is better?
Different assumptions
SILV: Homoscedasticity
Tobit: Normal distribution in population
Possible to combine approaches
First would need to determine reliability of PA when uncensored
Slide36
Is there an even better way?
Yes, full SEM modeling items as indicators of latent variables
Treat the individual items as categorical
Appropriate way to model the item responses
Explains censoring (better than Tobit)
SEM accounts for measurement error instead of merely correcting for it
Slide37
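What does an SEM look like?
SLR with latent variables
[Path diagram: X_L is measured by items X1, X2, X3 and Y_L by items Y1, Y2, Y3, each item with its own disturbance D; X_L predicts Y_L]
Slide38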
What does the SEM look like?
[Path diagram: latent predictors CFIC_L, CFIA_L, DERS_L, RFS_L, and MGP_L predict PA_L]
Slide39
Syntax for SEM
All items included in the model
Likert-type items are categorical
Slide40
Output for SEM
Slide41
Comparing results of all models
Coefficient
Predictor  Observed Model  SILV Model  Tobit + SILV  SEM
CFIC       .15**           .22*        .20**         .17*
CFIA       .07             .07         .05           .04
DERS       .11*            .10         .08           .16*
RFS        -.42***         -.55***     -.45***       -.70***
MGP        .16*            .25**       .20**         .40***
R²         .18***          .29***      .21***        .32***
Slide42
What’s the lesson? What is the takeaway?
Measurement error can bias OLS regression parameter estimates and invalidate hypothesis tests
Modeling techniques exist to estimate and handle measurement error to minimize bias
Accounting (or correcting) for measurement error leads to statistical decisions with greater validity
Slide43
If we can account for it, is measurement error still a problem?
Big Picture: Poor measurement is an ethical concern because:
If the measurement is problematic, the reliability of our findings is compromised
…in other words…
Our degree of trust in our results is in question, which means our statistical conclusion validity is in question!
Slide44