Structural equation models and confirmatory factor analysis in small samples: theory and applications

Uploaded On 2020-08-26


Andrej Srakar, PhD, Institute for Economic Research, Ljubljana, and Faculty of Economics, University of Ljubljana, Slovenia



Presentation Transcript

Slide1

Structural equation models and confirmatory factor analysis in small samples: theory and applications

Andrej Srakar, PhD
Institute for Economic Research, Ljubljana and Faculty of Economics, University of Ljubljana, Slovenia

Slide2

Structure of the presentation
- Research problem and literature review
- Data and variables
- Methodology
- Results – factors, indexes, clustering
- Results – dynamic SEM
- Conclusion and discussion

Slide3

Main objective and research questions
In recent years, the construction of composite indicators has been widely discussed. Yet, to the best of our knowledge, there have been very few (if any) attempts to provide a method of construction when the number of units is limited (e.g. macroeconomic data).
Two main research questions:
1) Can we use small-sample (high-dimensionality) corrections to construct an index of sustainable development (and is the method applicable to similar indexes as well)?
2) Can similar small-sample corrections be implemented for validating the index using CFA (and for SEM models on such data in general)?

Slide4

Short literature review
Literature on composite indicators:
- Brancato and Simeoni (2008): capacity of standard quality indicators to reflect quality components and overall quality, using structural equation models.
- Cecconi, Polidoro and Ricci (2004): methodological approach to synthesizing basic indicators in order to compare territorial data collection quality, for the Italian consumer price survey.
- Munda and Nardo (2006): consistency between the mathematical aggregation rule used to construct composite indicators and the meaning of weights.
- Nardo, Saisana, Saltelli, Tarantola, Hoffman and Giovannini (2008): OECD handbook, i.e. a guide on constructing and using composite indicators, with a focus on composite indicators which compare and rank countries' performances.
- Polidoro, Ricci and Sgamba (2006): novel methodology that expands on the methods detailed in Cecconi, Polidoro and Ricci (2004).
- Smith and Weir (2000): how to obtain some overall measure of quality by considering quality as a multivariate measure for any dataset, where each quality indicator represents one dimension of quality.
- Cherchye and colleagues (2008; 2009): propose developments of composite indicators with imprecise data and using DEA analysis.

Slide5

Steps in the construction of composite indicators (OECD, 2008)
1. Theoretical framework.
2. Data selection.
3. Imputation of missing data.
4. Multivariate analysis. An exploratory analysis should investigate the overall structure of the indicators, assess the suitability of the data set and explain the methodological choices, e.g. weighting, aggregation.
5. Normalisation. Indicators should be normalised to render them comparable. Attention needs to be paid to extreme values as they may influence subsequent steps in the process of building a composite indicator. Skewed data should also be identified and accounted for.
6. Weighting and aggregation. Indicators should be aggregated and weighted according to the underlying theoretical framework. Correlation and compensability issues among indicators need to be considered and either corrected for or treated as features of the phenomenon that need to be retained in the analysis.
7. Robustness and sensitivity. Analysis should be undertaken to assess the robustness of the composite indicator in terms of, e.g., the mechanism for including or excluding single indicators, the normalisation scheme, the imputation of missing data, the choice of weights and the aggregation method.
8. Back to the real data. Composite indicators should be transparent and fit to be decomposed into their underlying indicators or values.
9. Links to other variables. Attempts should be made to correlate the composite indicator with other published indicators, as well as to identify linkages through regressions.
10. Presentation and visualisation. Composite indicators can be visualised or presented in a number of different ways, which can influence their interpretation.
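The normalisation and weighting/aggregation steps can be illustrated with a minimal sketch. It assumes min-max normalisation and a weighted arithmetic mean, which are only one of several options in the OECD handbook and not necessarily the choice made in this presentation. The indicator names and values are hypothetical.

```python
def min_max(values):
    """Rescale a list of numbers to the 0-1 range (min-max normalisation)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant indicator carries no information
    return [(v - lo) / (hi - lo) for v in values]


def composite(indicators, weights):
    """Weighted arithmetic mean of normalised indicators, one score per unit.

    indicators: dict name -> list of raw values (same unit order in every list)
    weights:    dict name -> non-negative weight (rescaled here to sum to 1)
    """
    total = sum(weights.values())
    normalised = {name: min_max(vals) for name, vals in indicators.items()}
    n_units = len(next(iter(indicators.values())))
    return [
        sum(weights[name] / total * normalised[name][i] for name in indicators)
        for i in range(n_units)
    ]


# Hypothetical data for three units (not taken from the presentation).
raw = {
    "employment_rate": [60.0, 70.0, 80.0],
    "life_expectancy": [75.0, 81.0, 78.0],
}
scores = composite(raw, {"employment_rate": 1.0, "life_expectancy": 1.0})
print(scores)
```

Indicators where a higher value means a worse outcome would need to be reversed before aggregation; that step is left out of the sketch.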

Slide6

Short literature review – Dynamic SEM
Allison (2014) and Moral-Benito (2013) claim that the dynamic SEM approach has several advantages over both GMM methods and previous ML methods:
- there is no "incidental parameters" problem
- initial conditions are treated as completely exogenous and do not need to be modeled
- no difficulties arise when the autoregressive parameter is at or near 1.0
- missing data are easily handled by full-information maximum likelihood
- coefficients can be estimated for time-invariant predictors (the Arellano-Bond (A-B) method cannot do this because it uses difference scores, which cause all time-invariant variables to drop out)
- many model constraints can be easily relaxed and/or tested
It is well known that likelihood-based approaches (ML) are preferred to method-of-moments (GMM) counterparts in terms of finite-sample performance (see Anderson, Kunitomo, and Sawa 1982), and that ML is more efficient than GMM under normality. Moral-Benito (2013) compares the widely used panel GMM estimator of Arellano and Bond (1991) with its likelihood-based counterpart and confirms these results in the case of dynamic panel models with predetermined regressors.
We use dynamic SEM as implemented in Stata.
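For orientation, the kind of dynamic panel model these estimators target can be written in a generic form. The notation is an assumption made here for illustration and does not come from the slides: $y_{it}$ is the outcome for unit $i$ in period $t$, $x_{it}$ are time-varying predictors, $z_i$ are time-invariant predictors, $\alpha_i$ is the unit effect and $\varepsilon_{it}$ is the error term.

```latex
\begin{align}
y_{it} &= \lambda\, y_{i,t-1} + \beta' x_{it} + \delta' z_i + \alpha_i + \varepsilon_{it} \\
\Delta y_{it} &= \lambda\, \Delta y_{i,t-1} + \beta' \Delta x_{it} + \Delta \varepsilon_{it}
\end{align}
```

The second line is the first-differenced equation on which difference-based estimators such as Arellano-Bond operate. Differencing removes $\alpha_i$, but it removes $\delta' z_i$ as well, which is why $\delta$ cannot be estimated there, whereas a likelihood-based approach works with the first line directly.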

Slide7

Data and method – SDI
Eurostat Sustainable Development Indicators (SDI): the indicator framework covers 10 thematic areas belonging to the economic, the social, the environmental, the global and the institutional dimensions. We include 123 indicators, listed on the next slide.
- Socioeconomic development
- Sustainable consumption and production
- Social inclusion
- Demographic changes
- Public health
- Climate change and energy
- Sustainable transport
- Natural resources
- Global partnership
- Good governance

Slide8


Slide9

Method - imputation
First phase: dealing with missing data
- Multiple imputation using the Markov chain method (fully conditional specification, FCS; see van Buuren et al. 2006), which allows simultaneous imputation of variables with missing values
- The method depends on the ordering of variables; our strategy was therefore to always put "substance" indicators before the general development indicators in the imputation equations
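The idea of imputing variables in turn, in a fixed order, can be sketched as follows. This is a deliberately simplified illustration and not the procedure used in the presentation: it assumes only two variables, a one-predictor linear regression for each, and fixed regression coefficients within a cycle (a full FCS implementation would also draw the coefficients from their posterior). Variable names and data are hypothetical.

```python
import random
from statistics import mean


def simple_ols(x, y):
    """Intercept, slope and residual SD of a one-predictor regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return intercept, slope, (rss / max(len(x) - 2, 1)) ** 0.5


def chained_imputation(data, order, n_iter=10, seed=0):
    """Impute two variables in turn, each regressed on the other.

    data:  dict with exactly two keys -> lists, None marks a missing value
    order: the order in which the variables are visited in every cycle
    """
    rng = random.Random(seed)
    missing = {v: [i for i, x in enumerate(data[v]) if x is None] for v in order}
    # Start from mean imputation, then refine cycle by cycle.
    filled = {}
    for v in order:
        m = mean(x for x in data[v] if x is not None)
        filled[v] = [m if x is None else x for x in data[v]]
    for _ in range(n_iter):
        for target in order:
            other = next(v for v in order if v != target)
            observed = [i for i in range(len(filled[target]))
                        if i not in missing[target]]
            a, b, sd = simple_ols([filled[other][i] for i in observed],
                                  [filled[target][i] for i in observed])
            for i in missing[target]:
                # Prediction plus random noise, so imputations keep some variability.
                filled[target][i] = a + b * filled[other][i] + rng.gauss(0.0, sd)
    return filled


# Hypothetical data: a "substance" indicator and a general development indicator.
data = {
    "substance": [1.0, 2.0, None, 4.0, 5.0, 6.0],
    "development": [2.1, 3.9, 6.2, None, 9.8, 12.1],
}
completed = chained_imputation(data, order=["substance", "development"])
```

Calling the function with several different `seed` values yields several completed data sets, which is the "multiple" part of multiple imputation; changing `order` changes which variable is updated first in each cycle.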

Slide10

Methodology – ordinary and MHRM factor analysis
Second phase: regular and rotated factor analysis on a standardised data set (transformation of the variables into quartiles)
- Results of EFA showed that we can speak about at most 5 factors
- Main statistical problem: small sample (far fewer units than variables)
- Some solutions in the literature: bootstrap correction (Fisher et al. 2014); corrected principal components estimator (Bai 2002) / maximum likelihood estimator (Bai & Li 2012); high-dimensional algorithms (Cai 2010a; Cai 2010b; Asparouhov & Muthén, 2012)
- We use the Metropolis-Hastings Robbins-Monro (MHRM) algorithm to adjust for the high dimensionality of the dataset (see e.g. Cai 2010a; Cai 2010b)
- Related algorithms: Bock and Aitkin's (1981) EM algorithm; Joint Maximum Likelihood (JML; see Baker & Kim, 2004); SAEM algorithm (see e.g. McCullagh & Nelder, 1989); Gu and Kong's (1998) stochastic approximation Newton–Raphson algorithm; Monte Carlo Newton–Raphson algorithm (MCNR; McCulloch & Searle, 2001); Lange's (1995) gradient algorithm; Titterington's (1984) algorithm for incomplete data estimation
Finally: hierarchical clustering, strengthened by the non-hierarchical K-means method
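The transformation of a variable into quartiles mentioned above can be sketched in a few lines. The cut-point convention (the default "exclusive" method of Python's `statistics.quantiles`) and the example values are assumptions; the slides do not say how ties or cut points were handled.

```python
from statistics import quantiles


def to_quartiles(values):
    """Map each value to its quartile group, coded 1 (lowest) to 4 (highest)."""
    q1, q2, q3 = quantiles(values, n=4)  # the three quartile cut points
    return [1 + (v > q1) + (v > q2) + (v > q3) for v in values]


# Hypothetical indicator values for eight units.
indicator = [12.0, 35.0, 18.0, 50.0, 27.0, 41.0, 22.0, 30.0]
codes = to_quartiles(indicator)
print(codes)
```

Coding every indicator on the same 1 to 4 scale puts variables with very different units on a common footing before the factor analysis, at the cost of discarding within-quartile variation.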

Slide11

Methodology – Metropolis-Hastings Robbins-Monro algorithm
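The content of this slide did not survive in the transcript. As general background, MHRM (Cai 2010a) alternates a Metropolis-Hastings step that imputes the latent variables with a Robbins-Monro stochastic approximation step that updates the parameters using a decreasing gain. The sketch below illustrates only the Robbins-Monro half on a toy problem, estimating a mean from noisy draws; the Metropolis-Hastings step is replaced by direct sampling from a known distribution, which is an assumption made purely to keep the example short.

```python
import random


def robbins_monro_mean(draws):
    """Robbins-Monro recursion theta_k = theta_{k-1} + gamma_k * (x_k - theta_{k-1}).

    With gain gamma_k = 1/k the iterate equals the running sample mean.
    """
    theta = 0.0
    for k, x in enumerate(draws, start=1):
        gamma = 1.0 / k               # decreasing gain sequence
        theta += gamma * (x - theta)  # move towards the noisy observation
    return theta


rng = random.Random(42)
draws = [rng.gauss(2.0, 1.0) for _ in range(5000)]  # toy data, true mean 2.0
estimate = robbins_monro_mean(draws)
print(estimate)
```

The decreasing gain is what averages out the simulation noise over iterations; in MHRM the same mechanism smooths the noise introduced by the imputed latent variables.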

Slide12

Indexes estimation
Calculation of indexes: exploiting the fact that factors are by definition standardised normal variables
Index = ((Factor + 3) / 6) × 100
- transform the factors by adding 3 to each value (making them positive in approximately 99.87% of cases)
- divide their values by 6 (which is the range of the factor in 99.73% of cases)
- multiply by 100 to get the conventional scale of the index values
Calculation of the final index: modification of Fernando, Samita & Abeynayake (2012) – calculation of PCA and taking the average of the first two principal components
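The index formula and the two normal-distribution shares it relies on can be checked with a short sketch. The function and variable names are illustrative and not taken from the presentation.

```python
from statistics import NormalDist


def factor_to_index(factor_score):
    """Index = ((Factor + 3) / 6) * 100 for a standardised factor score."""
    return (factor_score + 3.0) / 6.0 * 100.0


std_normal = NormalDist()  # mean 0, standard deviation 1
share_positive = 1.0 - std_normal.cdf(-3.0)                  # P(Factor + 3 > 0)
share_in_range = std_normal.cdf(3.0) - std_normal.cdf(-3.0)  # P(-3 < Factor < 3)

for score in (-3.0, 0.0, 1.5, 3.0):
    print(score, factor_to_index(score))
print(round(share_positive, 4), round(share_in_range, 4))
```

A factor score of 0 maps to the midpoint of the scale, and scores beyond ±3 fall outside the 0 to 100 range, which is why the transformation is only approximately bounded.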

Slide13

Results: Exploratory factor analysis vs. MHRM
5 general factors:
- General development and governance issues
- Employment
- Poverty and social in/exclusion
- Health care, environment and energy consumption

Slide14

Transformation into a single index

Slide15

Clustering analysis, final groupings

Slide16

Clustering analysis, final groupings
(1) Eastern European countries: Bulgaria, Hungary, Poland, Romania, Slovakia, Croatia
(2) Baltic countries: Estonia, Latvia, Lithuania
(3) "Mediterranean" countries: Greece, Portugal
(4) "Medium developed" Western countries: Belgium, Ireland, Austria
(5) Large Western countries: Germany, Spain, France, Italy, United Kingdom
(6) Best achievers: Denmark, Luxembourg, Netherlands, Finland, Sweden
(7) Outliers: Czech Republic, Cyprus, Malta, Slovenia
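The presentation describes the groupings as coming from hierarchical clustering strengthened by the K-means method. A minimal K-means sketch on one-dimensional index values is given below. The data, the number of clusters and the starting centroids are hypothetical; in a two-stage procedure the starting centroids could plausibly be taken from the hierarchical solution, but that detail is an assumption and is not stated in the slides.

```python
def k_means_1d(values, centroids, n_iter=100):
    """Lloyd's algorithm on one-dimensional data with given starting centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every value.
        labels = [min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
                  for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else old)
        if updated == centroids:
            break  # centroids are stable, so assignments will not change
        centroids = updated
    return labels, centroids


# Hypothetical index values for eight units.
index_values = [31.0, 35.0, 33.0, 52.0, 55.0, 50.0, 71.0, 74.0]
labels, centres = k_means_1d(index_values, centroids=[30.0, 50.0, 70.0])
print(labels, centres)
```

With several indexes per country the absolute difference would be replaced by a multivariate distance such as the Euclidean one.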

Slide17

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
index2       |
      index1 |   .4714223   .0403507    11.68   0.000     .3923365    .5505082
      lnbdp1 |   .1979494   .0098281    20.14   0.000     .1786866    .2172122
          d1 |   .5274382   .3873893     1.36   0.173    -.2318308    1.286707
          d2 |   1.112265    .192206     5.79   0.000     .7355486    1.488982
------------------------------------------------------------------------------
Number of units = 252. Number of periods = 9.
LR test of model vs. saturated: chi2(71) = 110.23, Prob > chi2 = 0.0020
RMSEA = 0.065
SRMR = 0.071
CFI = 0.861
TLI = 0.819

A Dynamic SEM Stata model
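The z statistics, p-values and confidence intervals in the output follow from the coefficients and standard errors under a normal approximation. The sketch below recomputes them as a consistency check; the function name is illustrative and the numbers are copied from the output above.

```python
from statistics import NormalDist

# Coefficients and standard errors copied from the Stata output.
estimates = {
    "index1": (0.4714223, 0.0403507),
    "lnbdp1": (0.1979494, 0.0098281),
    "d1": (0.5274382, 0.3873893),
    "d2": (1.112265, 0.192206),
}


def wald_summary(coef, se, level=0.95):
    """Return the z statistic, two-sided p-value and confidence interval."""
    normal = NormalDist()
    z = coef / se
    p = 2.0 * (1.0 - normal.cdf(abs(z)))
    crit = normal.inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    return z, p, (coef - crit * se, coef + crit * se)


for name, (coef, se) in estimates.items():
    z, p, (lo, hi) = wald_summary(coef, se)
    print(f"{name:>7}  z={z:6.2f}  p={p:.3f}  CI=[{lo:.4f}, {hi:.4f}]")
```

Read this way, the lagged index (index1), lnbdp1 and d2 are clearly distinguishable from zero, while the interval for d1 includes zero.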

Slide18

Discussion of the main findings
Factor analysis and index construction:
- Clear set of 4-5 dimensions: economic development clusters into 2 dimensions (general economic development, and employment); social inclusion is a clear and separate component; the environmental dimension clusters separately, in 1-2 dimensions; good governance does not cluster as a separate dimension
- No systematic difference between EFA and MHRM scores; furthermore, the differences are not large
- At the level of indexes, the countries of the Social-democratic (Scandinavian) and Continental regimes clearly score best, while Eastern European countries score worst

Slide19

Discussion of the main findings
- Seven main clusters: (1) Eastern European countries; (2) Baltic countries; (3) "Mediterranean" countries; (4) "Medium developed" Western countries; (5) Large Western countries; (6) Best achievers; (7) Outliers
- The clusters do not differ significantly whether we take ranks or indexes
- Most Mediterranean countries (Spain, Italy, France) score closer to the Western countries
- Liberal and/or Social-democratic countries do not form a clearly separate cluster
- Eastern European countries have a subcluster of Baltic countries
- Several outliers (Malta, Cyprus, Slovenia, Czech Republic): they score better than Eastern and/or Mediterranean countries but slightly worse than the Western regime
- The dynamic SEM model confirms the validity of the construction and the path dependency of the index

Slide20

Conclusion and discussion
Scientific relevance:
- New index of sustainable development, using sophisticated statistical methodology.
- Solution to the small-sample problem, frequently present in similar (macro-data based) indexes.
- Empirical construction of clusters/groups of countries by level of sustainable development, supporting some of the claims in the literature; this has to be developed and tested further in the future.
Limitations and future research:
- Improvements in statistical methodology, particularly improving the fit of the model (CFA) and the distribution of the index values (MHRM might not be the best method due to the very limited sample, see Cai 2010a)
- Comparison with different weighting methods and different sets of SDI indicators
- Validation with different indexes of sustainable development and on the basis of worldwide data
- Testing the methodology for other indexes as well; for some applications see Srakar, Verbic & Copic (2015; 2016) and Srakar & Vecco (2016)

Slide21

Thank you for listening and comments!
andrej.srakar@ier.si
andrej_srakar@t-2.net