Multiple Imputation in Finite Mixture Modeling


Presentation Transcript

Slide1

Multiple Imputation in Finite Mixture Modeling

Daniel Lee
Presentation for MMM conference, May 24, 2016
University of Connecticut

Slide2

Introduction: Finite Mixture Models

Class of statistical models that treat group membership as a latent categorical variable

A class of analysis that estimates parameters for a hypothesized number of groups, or classes, from a single data set (McLachlan & Peel, 2000)

This usually involves:

Investigating population heterogeneity in model parameters

Finding the possible number of latent groups

Classifying cases into these groups

Examining the extent to which auxiliary information can be used to evaluate classes

Any statistical method that can be formulated as a multiple group problem can be formulated as a finite mixture model.
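As an aside not on the original slide, the generic finite mixture density being described here is usually written as follows (standard notation, e.g., McLachlan & Peel, 2000):

```latex
% Finite mixture density for an observation y with K latent classes:
% \pi_k are the class proportions and f_k is the class-k density with parameters \theta_k.
f(y) \;=\; \sum_{k=1}^{K} \pi_k \, f_k(y \mid \theta_k),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1 .
```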

Slide3

Introduction: Finite Mixture Models example (factor mixture models)

Slide4

Introduction: Missing data in finite mixtures

Missing data handling methods in finite mixture models (Sterba, 2014)

The strategy by which missingness is handled interferes with discriminating between latent class and latent continuous models.

MVN MI, FIML-EM, and newer MI approaches were considered.

MI strategies for multiple-group SEMs (Enders & Gottschall, 2011)

Explored 2 MI methods with multiple groups:

SGI

PTI

Cautionary note on latent categorical variables (mixture models)

Slide5

Introduction: Missing Data

Missing data in practice

Listwise/Pairwise Deletion

Full Information Maximum Likelihood

Multiple Imputation (MI; Rubin, 1976)

Multiple Imputation

Imputation Phase: generate m different datasets, each with slightly different estimates for the missing values.

Analysis Phase: the analysis is performed on each of the m datasets, and the parameter estimates are averaged across the m results (a special rule for combining standard errors is provided by Rubin, 1987).
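The slides do not spell out Rubin's pooling rules, so here is a minimal illustrative sketch (not part of the presentation; the numbers in the example are hypothetical) of the analysis-phase pooling for a single parameter:

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors
    (within-imputation variances) using Rubin's (1987) rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)

    pooled = estimates.mean()                 # average of the m point estimates
    within = variances.mean()                 # average within-imputation variance
    between = estimates.var(ddof=1)           # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return pooled, np.sqrt(total)             # pooled estimate and pooled SE

# Hypothetical example: one factor loading estimated in m = 10 imputed datasets
est, se = pool_rubin(
    estimates=[0.81, 0.79, 0.83, 0.80, 0.78, 0.82, 0.80, 0.79, 0.84, 0.81],
    variances=[0.004] * 10,
)
print(round(est, 3), round(se, 3))
```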

Slide6

Introduction: Research Questions

When groups are unknown (mixture models), how will MI perform?

In a recent discussion with Craig Enders:

“The gist is that standard MI routines will not work for mixtures because they will generate imputations from a single-class model. In effect, MI leaves out the most important variable in the analysis, the latent classes, thereby biasing the resulting estimates toward a single, common class...”

In MI the group structure should be accounted for; otherwise the imputations will be poor, since the imputation model uses the entire dataset to generate them.

Label switching problem (Tueller, Drotar, & Lubke, 2011)

Slide7

Methods: Simulation

Manipulated 3 variables (total 12 conditions):

Sample size: 50 and 250

MCAR missing rates: 5%, 15%, 25% (even benign missing values can cause bias)

Mahalanobis distances: low (D < 1), medium (1 < D < 2), high (D > 4)

100 multivariate normal complete data sets were generated from a 2-group CFA model with 6 indicator variables.

Each data set contained data for two groups with distinct population parameters, including a true group variable (e.g., n = 250 was split into two groups, 125 in each group, with different population values).
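As an illustrative sketch only (not part of the study; the loadings, variances, and intercepts below are hypothetical stand-ins for the true values shown on slide 8), two-group one-factor CFA data with MCAR missingness could be generated like this:

```python
import numpy as np

rng = np.random.default_rng(2016)

def simulate_group(n, loadings, factor_var, resid_var, intercepts):
    """One-factor CFA data: y = intercepts + loadings * eta + e."""
    eta = rng.normal(0.0, np.sqrt(factor_var), size=n)
    e = rng.normal(0.0, np.sqrt(resid_var), size=(n, len(loadings)))
    return intercepts + np.outer(eta, loadings) + e

def impose_mcar(y, rate):
    """Set each cell to NaN independently with probability `rate` (MCAR)."""
    y = y.copy()
    y[rng.random(y.shape) < rate] = np.nan
    return y

# Hypothetical population values for the two groups (the true values from
# the slide-8 path diagram are not reproduced in the transcript).
g1 = simulate_group(125, loadings=np.full(6, 0.7), factor_var=2.0,
                    resid_var=0.5, intercepts=np.zeros(6))
g2 = simulate_group(125, loadings=np.full(6, 0.9), factor_var=4.0,
                    resid_var=0.5, intercepts=np.full(6, 0.5))

complete = np.vstack([g1, g2])               # n = 250 complete data
group_labels = np.repeat([1, 2], 125)        # true group membership
observed = impose_mcar(complete, 0.15)       # 15% MCAR condition
```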

Slide8

Methods: Data Generating Model

[Path diagram showing the data generating model for Group 1 and Group 2]

Slide9

Methods: Data analysis

Analysis 1: Used MI with 10 imputations when groups were known (normal CFA model), using the SGI procedure. Used built-in Mplus imputation (MI in Mplus; Asparouhov & Muthen, 2010) and MG-CFA analysis.

What kind of imputation model is used here?

Analysis 2: Used MI with 10 imputations when groups were unknown (factor mixture model). Used Mplus for imputation and FMM analysis. Starting values: true parameters.

Estimates from Analysis 1 and Analysis 2 were compared against the true population parameters, and standardized bias estimates were computed.

Standardized bias estimates greater than 0.40 were considered significant (Collins, Schafer, & Kam, 2001).
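As a rough illustration only (the study used Mplus's Bayesian imputation; here scikit-learn's IterativeImputer stands in, and SGI is assumed to mean imputing separately within each known group), the contrast between the Analysis 1 and Analysis 2 imputation setups is essentially:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

def impute_pooled(y, m=10):
    """Standard MI routine: m stochastic imputations from one pooled model
    (what a mixture analysis with unknown groups falls back on)."""
    return [IterativeImputer(sample_posterior=True, random_state=i).fit_transform(y)
            for i in range(m)]

def impute_sgi(y, group, m=10):
    """Separate-group imputation (SGI): impute within each known group, then stack
    (rows end up ordered by group)."""
    datasets = []
    for i in range(m):
        parts = [IterativeImputer(sample_posterior=True, random_state=i).fit_transform(y[group == g])
                 for g in np.unique(group)]
        datasets.append(np.vstack(parts))
    return datasets

# Analysis 1 style (groups known):    imputations = impute_sgi(observed, group_labels, m=10)
# Analysis 2 style (groups unknown):  imputations = impute_pooled(observed, m=10)
# `observed` and `group_labels` are the hypothetical arrays from the earlier simulation sketch.
```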

Slide10

Label switching

(Tueller, Drotar, & Lubke, 2011)

Common issue in latent variable mixture model (LVMM) simulations

Simple example:

True generating values for factor variances: class 1 = 2 and class 2 = 4.

Rep. 1 LVMM estimates show: class 1 = 3.9 and class 2 = 2.1 (switched)

Rep. 2 LVMM estimates show: class 1 = 1.9 and class 2 = 4.1 (OK)

Rep. 3 LVMM estimates show: class 1 = 2 and class 2 = 3.7 (switched)

Problem: aggregating parameter estimates over potentially mislabeled classes.
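A minimal sketch (my own illustration, not necessarily the procedure of Tueller, Drotar, & Lubke, 2011) of detecting and undoing switched labels by matching each replication's class estimates to the true generating values:

```python
import numpy as np
from itertools import permutations

def relabel_to_truth(est_by_class, true_by_class):
    """Reorder estimated class labels to best match the true generating values.

    est_by_class, true_by_class: arrays of shape (n_classes, n_params).
    Returns the relabeled estimates and the permutation that minimizes the
    summed absolute difference from the truth -- a simple way to detect and
    undo label switching before aggregating over replications.
    """
    est = np.asarray(est_by_class, dtype=float)
    true = np.asarray(true_by_class, dtype=float)
    k = est.shape[0]
    best = min(permutations(range(k)),
               key=lambda p: np.abs(est[list(p)] - true).sum())
    return est[list(best)], best

# The factor-variance example from the slide: truth is (2, 4)
fixed, order = relabel_to_truth([[3.9], [2.1]], [[2.0], [4.0]])
print(order)   # (1, 0): the classes were switched in this replication
print(fixed)   # estimates after relabeling: [[2.1], [3.9]]
```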

Slide11

Methods: Evaluation criteria

Bias

(see the standard formula sketched after this list)

0.05 used as cut-off (Hoogland & Boomsma, 1998)

RMSE

(see the standard formula sketched after this list)

Expected squared loss around the true parameter

Standard error ratio (e.g., Lee, Poon, & Bentler, 1995): SE(θ̂)/SD(θ̂), computed across replications (defined after this list)

Values < 1 → inflated Type I error

Values > 1 → inflated Type II error

non-converged replications omitted
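The formula placeholders above were left empty on the slides; the standard definitions presumably intended (with θ the true value, θ̂_r the estimate and SE_r its estimated standard error in replication r = 1, …, R) are:

```latex
% Relative bias, with |B| > 0.05 flagged (Hoogland & Boomsma, 1998)
B(\hat{\theta}) = \frac{\frac{1}{R}\sum_{r=1}^{R}\hat{\theta}_r - \theta}{\theta}

% Root mean squared error around the true parameter
\mathrm{RMSE}(\hat{\theta}) = \sqrt{\frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\theta}_r - \theta\bigr)^{2}}

% Standard error ratio: average estimated SE over the empirical SD of the estimates
\mathrm{SE\ ratio} = \frac{\frac{1}{R}\sum_{r=1}^{R}\mathrm{SE}_r}{\mathrm{SD}\bigl(\hat{\theta}_r\bigr)}
```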

Slide12

Results: Bias

Slide13

Results: Bias

Slide14

Label switching check

(Tueller, Drotar, & Lubke, 2011)

Slide15

Results: RMSE

Slide16

Results: Standard Error Ratio

Slide17

Discussion and Recommendations (and issues)

MI not recommended for finite mixture models

Other solutions?

Different sample sizes?

Larger differences in parameters?

Label switching?

Does it happen at the imputation level or analysis level?