Slide1
Structural equation models and confirmatory factor analysis in small samples: theory and applications
Andrej Srakar, PhD
Institute for Economic Research, Ljubljana and Faculty of Economics, University of Ljubljana, Slovenia
Slide2
Structure of the presentation
Research problem and literature review
Data and variables
Methodology
Results – factors, indexes, clustering
Results – dynamic SEM
Conclusion and discussion
Slide3
Main objective and research questions
In recent years, the construction of composite indicators has been widely discussed.
Yet, to the best of our knowledge, there have been very few (if any) attempts to provide a method of construction in the presence of a limited number of units (e.g. macroeconomic data).
Two main research questions:
1) Can we use small-sample (high-dimensionality) corrections to construct an index of sustainable development (and is the method applicable to similar indexes as well)?
2) Can similar small-sample corrections be implemented for validation of the index using CFA (and SEM models on this data in general)?
Slide4
Short literature review
Literature on composite indicators:
Brancato and Simeoni (2008): capacity of standard quality indicators to reflect quality components and overall quality, using structural equation models.
Cecconi, Polidoro and Ricci (2004): methodological approach to synthesizing basic indicators in order to compare territorial data collection quality, for the Italian consumer price survey.
Munda and Nardo (2006): consistency between the mathematical aggregation rule used to construct composite indicators and the meaning of weights.
Nardo, Saisana, Saltelli, Tarantola, Hoffman and Giovannini (2008): OECD handbook, i.e. a guide on constructing and using composite indicators, with a focus on composite indicators which compare and rank countries’ performances.
Polidoro, Ricci and Sgamba (2006): novel methodology that expands on the methods detailed in Cecconi, Polidoro and Ricci (2004).
Smith and Weir (2000): how to obtain some overall measure of quality by considering quality as a multivariate measure for any dataset, where each quality indicator represents one dimension of quality.
Cherchye and colleagues (2008; 2009): propose developments of composite indicators with imprecise data and using DEA analysis.
Slide5
Steps in the construction of composite indicators (OECD, 2008)
Theoretical framework.
Data selection.
Imputation of missing data.
Multivariate analysis. An exploratory analysis should investigate the overall structure of the indicators, assess the suitability of the data set and explain the methodological choices, e.g. weighting, aggregation.
Normalisation. Indicators should be normalised to render them comparable. Attention needs to be paid to extreme values as they may influence subsequent steps in the process of building a composite indicator. Skewed data should also be identified and accounted for.
Weighting and aggregation. Indicators should be aggregated and weighted according to the underlying theoretical framework. Correlation and compensability issues among indicators need to be considered and either be corrected for or treated as features of the phenomenon that need to be retained in the analysis.
Robustness and sensitivity. Analysis should be undertaken to assess the robustness of the composite indicator in terms of, e.g., the mechanism for including or excluding single indicators, the normalisation scheme, the imputation of missing data, the choice of weights and the aggregation method.
Back to the real data. Composite indicators should be transparent and fit to be decomposed into their underlying indicators or values.
Links to other variables. Attempts should be made to correlate the composite indicator with other published indicators, as well as to identify linkages through regressions.
Presentation and visualisation. Composite indicators can be visualised or presented in a number of different ways, which can influence their interpretation.
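The normalisation and weighting/aggregation steps listed above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the OECD procedure itself: the indicator names, the values and the equal weights are hypothetical, and linear aggregation is just one of several possible aggregation rules.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise indicator values to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def min_max(values):
    """Rescale indicator values to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def aggregate(indicators, weights):
    """Weighted linear aggregation of already normalised indicators.

    indicators: dict mapping indicator name -> list of values (one per unit)
    weights:    dict mapping indicator name -> non-negative weight
    """
    total = sum(weights.values())
    n_units = len(next(iter(indicators.values())))
    return [
        sum(weights[name] * indicators[name][i] for name in indicators) / total
        for i in range(n_units)
    ]


# Hypothetical data: three units (e.g. countries), two indicators.
raw = {
    "gdp_growth": [1.0, 2.0, 3.0],
    "employment_rate": [60.0, 70.0, 65.0],
}
normalised = {name: min_max(vals) for name, vals in raw.items()}
composite = aggregate(normalised, {"gdp_growth": 0.5, "employment_rate": 0.5})
print(composite)
```

Swapping `min_max` for `z_scores`, or changing the weights, is a simple way to probe the robustness and sensitivity step mentioned in the list.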
Slide6
Short literature review – Dynamic SEM
Allison (2014) and Moral-Benito (2013) claim that the dynamic SEM approach has several advantages over both GMM methods and previous ML methods:
there is no “incidental parameters” problem
initial conditions are treated as completely exogenous and do not need to be modeled
no difficulties arise when the autoregressive parameter is at or near 1.0
missing data are easily handled by full-information maximum likelihood
coefficients can be estimated for time-invariant predictors (the Arellano-Bond (A-B) method cannot do this because it uses difference scores, which causes all time-invariant variables to drop out)
many model constraints can be easily relaxed and/or tested
It is well known that likelihood-based approaches (ML) are preferred to method-of-moments (GMM) counterparts in terms of finite-sample performance (see Anderson, Kunitomo, and Sawa 1982), and that ML is more efficient than GMM under normality. Moral-Benito (2013) compares the widely-used panel GMM estimator of Arellano-Bond (1991) with its likelihood-based counterpart and confirms these results in the case of dynamic panel models with predetermined regressors.
We use Dynamic SEM as implemented in Stata
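To make the point about time-invariant predictors concrete, a generic dynamic panel specification can be written as below. The notation is an assumption introduced here for illustration (y is the outcome, x the time-varying predictors, z the time-invariant predictors, alpha the unit effect); it is not taken from the slides or from the exact setup of the cited papers.

```latex
% Generic dynamic panel model (illustrative notation, unit i, period t)
\begin{align}
y_{it} &= \lambda\, y_{i,t-1} + \beta' x_{it} + \delta' z_i + \alpha_i + \varepsilon_{it} \\
% First-differencing, the transformation used by Arellano-Bond-type estimators
\Delta y_{it} &= \lambda\, \Delta y_{i,t-1} + \beta' \Delta x_{it} + \Delta \varepsilon_{it}
\end{align}
```

In the differenced equation both the unit effect and the term for the time-invariant predictors vanish, because neither changes over time, so delta cannot be estimated from it. An approach that works with the equation in levels keeps that term.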
Slide7
Data and method – SDI
Eurostat Sustainable Development Indicators (SDI): the indicator framework covers 10 thematic areas belonging to the economic, the social, the environmental, the global and the institutional dimensions. We include 123 indicators, listed on the next slide.
Socioeconomic development;
Sustainable consumption and production;
Social inclusion;
Demographic changes;
Public health;
Climate change and energy;
Sustainable transport;
Natural resources;
Global partnership;
Good governance.
Slide8
Slide9
Method - imputation
First phase: Dealing with missing data
Multiple imputation using the Markov chain method (FCS, see van Buuren et al. 2006), which allows simultaneous imputation of variables with missing values
The method depends on the ordering of variables; our strategy was therefore to always put “substance” indicators before the general development indicators in the imputation equations
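The idea of fully conditional specification (chained equations) and its dependence on variable ordering can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the toy data, and in particular the simplification that each incomplete variable is regressed on a single predictor (the row mean of the other variables) instead of a full multivariate model. It is not the routine used for the presentation.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / max(len(xs) - 2, 1)) ** 0.5
    return a, b, sd


def fcs_impute(data, order, n_iter=10, seed=0):
    """Return one completed copy of `data` (dict: column -> list, None = missing).

    Columns are visited in `order`; each is regressed on the other columns
    (here summarised by their row mean) and its missing cells are redrawn.
    """
    rng = random.Random(seed)
    missing = {c: [i for i, v in enumerate(col) if v is None] for c, col in data.items()}
    # Initialise missing cells with the observed column mean.
    filled = {}
    for c, col in data.items():
        col_mean = mean(v for v in col if v is not None)
        filled[c] = [col_mean if v is None else v for v in col]
    n = len(next(iter(filled.values())))
    for _ in range(n_iter):
        for c in order:
            if not missing[c]:
                continue
            others = [o for o in filled if o != c]
            pred = [mean(filled[o][i] for o in others) for i in range(n)]
            observed = [i for i in range(n) if i not in missing[c]]
            a, b, sd = fit_line([pred[i] for i in observed],
                                [filled[c][i] for i in observed])
            for i in missing[c]:
                # Prediction plus random noise, so repeated runs differ.
                filled[c][i] = a + b * pred[i] + rng.gauss(0.0, sd)
    return filled


# Toy data: y is exactly 2 * x, with one missing value.
data = {"x": [1.0, 2.0, 3.0, 4.0, 5.0], "y": [2.0, 4.0, None, 8.0, 10.0]}
completed = fcs_impute(data, order=["y"])
print(completed["y"])
```

Calling `fcs_impute` several times with different `seed` values would give the multiple completed data sets; the `order` argument is where an ordering strategy such as the one described above would enter.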
Slide10
Methodology – ordinary and MHRM factor analysis
Second phase: regular and rotated factor analysis on a standardised set of data: transformation of the variables into quartiles
Results of EFA showed that we can speak of at most 5 factors
Main statistical problem: small sample (far fewer units than variables)
Some solutions in the literature: bootstrap correction (Fisher et al. 2014); corrected principal components estimator (Bai 2002) / maximum likelihood estimator (Bai & Li 2012); high-dimensional algorithms (Cai 2010a; Cai 2010b; Asparouhov & Muthén, 2012)
We use the Metropolis-Hastings Robbins-Monro (MHRM) algorithm to adjust for the high dimensionality of the dataset – see e.g. Cai 2010a; Cai 2010b
Related algorithms: Bock and Aitkin’s (1981) EM algorithm; Joint Maximum Likelihood (JML; see Baker & Kim, 2004); SAEM algorithm (see e.g. McCullagh & Nelder, 1989); Gu and Kong’s (1998) stochastic approximation Newton–Raphson algorithm; Monte Carlo Newton–Raphson algorithm (MCNR; McCulloch & Searle, 2001); Lange’s (1995) gradient algorithm; Titterington’s (1984) algorithm for incomplete data estimation
Finally: hierarchical clustering, strengthened by the non-hierarchical K-means method
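The transformation of variables into quartiles mentioned in the second phase can be sketched as follows. The function name and the choice of the "inclusive" quantile method are assumptions made for this illustration; the slides do not say how ties or cut points were handled.

```python
from statistics import quantiles


def to_quartiles(values):
    """Map each value to its quartile group (1 = lowest, 4 = highest)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    # Each comparison contributes 1 when the value lies above that cut point.
    return [1 + (v > q1) + (v > q2) + (v > q3) for v in values]


# Hypothetical indicator values for eight units.
groups = to_quartiles([1, 2, 3, 4, 5, 6, 7, 8])
print(groups)
```

Recoding to quartiles puts all indicators on the same ordinal 1 to 4 scale and limits the influence of extreme values, at the cost of discarding within-quartile variation.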
Slide11
Methodology – Metropolis-Hastings Robbins-Monro algorithm
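The content of this slide is not available in the text, so the following is only a hedged summary of the general MH-RM recursion in the spirit of Cai (2010a), written in notation assumed here: Y is the observed data, Z the latent factor scores, theta the parameter vector, s the complete-data score, H the complete-data information matrix, m_k the number of draws and gamma_k a decreasing sequence of gain constants (for example 1/k).

```latex
% Iteration k+1 of a Metropolis-Hastings Robbins-Monro scheme (illustrative notation)
% Step 1 (stochastic imputation): draw Z_1^{(k+1)}, ..., Z_{m_k}^{(k+1)} with a
%   Metropolis-Hastings sampler targeting the distribution of Z given Y and \theta^{(k)}.
% Step 2 (stochastic approximation): average score and information over the draws.
% Step 3 (Robbins-Monro update): move the parameters with a decreasing gain.
\begin{align}
\tilde{s}_{k+1} &= \frac{1}{m_k} \sum_{j=1}^{m_k} s\bigl(\theta^{(k)} \mid Y, Z_j^{(k+1)}\bigr) \\
\Gamma_{k+1} &= \Gamma_k + \gamma_k \Bigl\{ \frac{1}{m_k} \sum_{j=1}^{m_k} H\bigl(\theta^{(k)} \mid Y, Z_j^{(k+1)}\bigr) - \Gamma_k \Bigr\} \\
\theta^{(k+1)} &= \theta^{(k)} + \gamma_k\, \Gamma_{k+1}^{-1}\, \tilde{s}_{k+1}
\end{align}
```

Because the gain constants shrink over iterations, the noise introduced by the sampled factor scores is averaged out, which is what makes the scheme usable when the number of latent dimensions is large.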
Slide12
Indexes estimation
Calculation of indexes: exploiting the fact that factors are by definition standardised normal variables
Index = ((Factor + 3) / 6) × 100
transform the factors by adding 3 to each value (making them positive in approximately 99.87% of cases)
divide their values by 6 (which is the range of the factor in 99.73% of cases)
multiply by 100 to get the conventional scale of the index values
Calculation of the final index: modification of Fernando, Samita & Abeynayake (2012) – calculation of PCA and taking the average of the first two principal components
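The index formula and the coverage figures quoted for it can be checked with a short sketch. The function name is hypothetical; the only assumption beyond the slide is that factor scores follow a standard normal distribution, as the slide itself states.

```python
from statistics import NormalDist


def factor_to_index(score):
    """Index = ((Factor + 3) / 6) * 100, mapping scores in [-3, 3] to [0, 100]."""
    return (score + 3.0) / 6.0 * 100.0


std_normal = NormalDist()  # mean 0, standard deviation 1

# Share of factor scores that stay positive after adding 3, i.e. P(F > -3).
share_positive = 1.0 - std_normal.cdf(-3.0)

# Share of factor scores inside the range of width 6, i.e. P(-3 < F < 3).
share_in_range = std_normal.cdf(3.0) - std_normal.cdf(-3.0)

print(factor_to_index(0.0), share_positive, share_in_range)
```

A factor score of 0 therefore maps to an index value of 50, and scores outside the interval from -3 to 3, which are rare under normality, would fall outside the 0 to 100 scale.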
Slide13
Results: Exploratory factor analysis vs. MHRM
5 general factors:
General development and governance issues
Employment
Poverty and social in/exclusion
Health care, environment and energy consumption
Slide14
Transformation into a single index
Slide15
Clustering analysis, final groupings
Slide16
Clustering analysis, final groupings
(1) Eastern European countries: Bulgaria, Hungary, Poland, Romania, Slovakia, Croatia
(2) Baltic countries: Estonia, Latvia, Lithuania
(3) “Mediterranean” countries: Greece, Portugal
(4) “Medium developed” Western countries: Belgium, Ireland, Austria
(5) Large Western countries: Germany, Spain, France, Italy, United Kingdom
(6) Best achievers: Denmark, Luxembourg, Netherlands, Finland, Sweden
(7) Outliers: Czech Republic, Cyprus, Malta, Slovenia
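Groupings of this kind come from a two-stage procedure: hierarchical clustering followed by a K-means refinement. The sketch below shows one way such a procedure could look. The centroid linkage, the function names and the toy points are assumptions made for illustration; the slides do not state which linkage or distance was used, and the data here are not the index values.

```python
from statistics import mean


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return [mean(col) for col in zip(*points)]


def hierarchical(points, k):
    """Agglomerative clustering (centroid linkage) down to k clusters of indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist2(centroid([points[i] for i in clusters[a]]),
                          centroid([points[i] for i in clusters[b]]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters


def kmeans_refine(points, clusters, n_iter=20):
    """K-means started from the hierarchical solution; returns one label per point."""
    centers = [centroid([points[i] for i in c]) for c in clusters]
    labels = []
    for _ in range(n_iter):
        labels = [min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
                  for p in points]
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels


# Toy data: two well-separated pairs of units in a two-dimensional index space.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
initial = hierarchical(points, k=2)
labels = kmeans_refine(points, initial)
print(initial, labels)
```

Starting K-means from the hierarchical solution removes the dependence on random initial centres, while the K-means pass can still reassign units that the irreversible hierarchical merges placed poorly.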
Slide17
A Dynamic SEM Stata model
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
index2       |
      index1 |   .4714223   .0403507    11.68   0.000     .3923365    .5505082
      lnbdp1 |   .1979494   .0098281    20.14   0.000     .1786866    .2172122
          d1 |   .5274382   .3873893     1.36   0.173    -.2318308    1.286707
          d2 |   1.112265   .1922060     5.79   0.000     .7355486    1.488982
------------------------------------------------------------------------------
Number of units = 252. Number of periods = 9.
LR test of model vs. saturated: chi2(71) = 110.23, Prob > chi2 = 0.0020
RMSEA = 0.065
SRMR = 0.071
CFI = 0.861
TLI = 0.819
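The columns of the Stata output are linked by simple arithmetic: z is the coefficient divided by its standard error, the p-value is the two-sided normal tail probability, and the 95% interval is the coefficient plus or minus about 1.96 standard errors. The sketch below recomputes them from the coefficients and standard errors reported on the slide; the function and variable names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(0.975)  # about 1.96 for a 95% interval

# (coefficient, standard error) as reported on the slide
reported = {
    "index1": (0.4714223, 0.0403507),
    "lnbdp1": (0.1979494, 0.0098281),
    "d1": (0.5274382, 0.3873893),
    "d2": (1.112265, 0.1922060),
}


def summarise(coef, se):
    """Return (z, two-sided p-value, lower bound, upper bound)."""
    z = coef / se
    p = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return z, p, coef - z_crit * se, coef + z_crit * se


results = {name: summarise(coef, se) for name, (coef, se) in reported.items()}
for name, (z, p, lo, hi) in results.items():
    print(f"{name:>7}  z = {z:6.2f}  p = {p:.3f}  95% CI = [{lo:.4f}, {hi:.4f}]")
```

Read this way, the lagged index (index1) and lnbdp1 are clearly different from zero, whereas the interval for d1 includes zero.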
Slide18
Discussion of the main findings
Factor analysis and indexes construction:
Clear set of 4-5 dimensions: economic development clusters into 2 dimensions (general economic development; and employment); social inclusion is a clear and separate component; the environmental dimension clusters separately and in 1-2 dimensions; good governance does not cluster as a separate dimension
Absence of a systematic difference between EFA and MHRM scores; furthermore, the difference is not large
At the level of indexes, the countries of the Social-democratic (Scandinavian) and Continental regimes clearly score best, while Eastern European countries score worst
Slide19
Discussion of the main findings
Seven main clusters: (1) Eastern European countries; (2) Baltic countries; (3) “Mediterranean” countries; (4) “Medium developed” Western countries; (5) Large Western countries; (6) Best achievers; (7) Outliers
The clusters do not differ significantly whether we take ranks or indexes
Most Mediterranean countries (Spain, Italy, France) score closer to the Western countries
Liberal and/or Social-democratic countries do not form a clearly separate cluster
Eastern European countries have a subcluster of Baltic countries
Several outliers – Malta, Cyprus, Slovenia, Czech Republic: they score better than Eastern and/or Mediterranean countries but slightly worse than the Western regime
The dynamic SEM model confirms the validity of the construction and the path dependency of the index
Slide20
Conclusion and discussion
Scientific relevance:
New index of sustainable development, using sophisticated statistical methodology.
Solution to the small-sample problem, frequently present in similar (macro-data based) indexes.
Empirical construction of clusters/groups of countries at the level of sustainable development, supporting some of the claims in the literature – has to be developed and tested in future.
Limitations and future research:
Improvements in statistical methodology, particularly improving the fit of the model (CFA) and the distribution of the index values (MHRM might not be the best method due to the very limited sample, see Cai 2010a)
Comparison with different weighting methods and different sets of SDI indicators used
Validation with different indexes of sustainable development and on the basis of worldwide data
Testing the methodology also for other indexes – for some applications see Srakar, Verbic & Copic (2015; 2016) and Srakar & Vecco (2016)
Slide21
Thank you for listening and comments!
andrej.srakar@ier.si
andrej_srakar@t-2.net