
Presentation Transcript

Slide1

1. InGRID-2 Data Forum - WageIndicator

31.08.2017

Slide2

Objective

Opportunities & challenges of using the global WageIndicator data for scientific and policy-driven research

First session: Introduction to the WageIndicator data and how users deal with the challenges

Second session: Two perspectives on the use of non-probability samples + roundtable discussion

Aim: new ideas on how to deal with the challenges of the WageIndicator data

Slide3

Some words about InGRID2


Slide4

InGRID2 - Objectives

Integrate and innovate existing European social sciences research infrastructures on
- Poverty and living conditions
- Working conditions and vulnerability
by improving:
- Transnational data access
- Mutual knowledge exchange through activities
- Methods and tools for comparative research
to create new & better opportunities for developing evidence-based European policies on Inclusive Growth

Slide5

About InGRID2 - Organization

19 partners in a consortium

Clustered in 2 pillars and 3 themes:
- Pillars: Poverty and living conditions & Working conditions and vulnerability
- Themes: Data integration & harmonization, Evaluation & analysis tools, Indicator building

4 types of activities:
- Summer schools, expert workshops & network activities
- Visiting grants to data infrastructures
- Joint research activities
- E-portal

Slide6

Session 1: The WageIndicator

Kea Tijdens, Martin Kahanec, Brian Fabo & Stephanie Steinmetz

Slide7

“How to deal with biases in volunteer web surveys?” Some explorations for the Netherlands

Stephanie Steinmetz

Steinmetz, S., Bianchi, A., Biffignandi, S. & Tijdens, K. (2014). Improving web survey quality - potentials and constraints of propensity score weighting (chapter 12, pp. 273-298). In Callegaro, M., Baker, R., Bethlehem, J., Göritz, A., Krosnick, J. & Lavrakas, P. (Eds.): Online Panel Research: A Data Quality Perspective. Series in Survey Methodology. Wiley.

Slide8

What is the problem?


AAPOR Report on Online Panels (2010)

“Researchers should avoid non-probability online panels when one of the research objectives is to accurately estimate population values. [...] Thus, claims of ‘representativeness’ should be avoided when using these sample sources.”

Slide9

Objectives

Wages: central for socio-economic research

Wage data collection is challenging (admin. or survey data)

Central questions: Are wages collected via a (volunteer) web survey representative of a selected target population? If not → how can representativeness be achieved?

Slide10

(Volunteer) web surveys

Advantages: time & cost reduction, interactivity, flexibility, ‘worldwide’ coverage, reduced interviewer influence

Disadvantages: people with web access volunteer/‘opt-in’ (respondents have an unknown selection probability)

Representativeness? → Can web survey estimates be generalised to the target population?

‘Representativeness’ has various meanings (Kruskal & Mosteller, 1979): sample data gain validity in relation to the target population they are meant to represent

Slide11

Sources of errors in a (web) survey

Coverage: number of people having internet access + differences between persons with & without internet access.

Sampling/self-selection: no comprehensive list of internet users from which to draw a probability-based sample + people with specific characteristics participate in a (volunteer) web survey.

Non-response: not all persons finish the questionnaire; people with specific characteristics might have higher non-response.

+ Measurement errors and processing errors

Slide12

Can weighting solve the problem?

Slide13

Reasons for weighting

In general:
- Adjusts for unequal probabilities of selection
- Adjusts for non-response
- Improves precision of the survey estimator (variance reduction) using auxiliary information (bias of the unweighted estimator = difference between sample & target population)

In particular for web surveys:
- Adjusts for undercoverage & self-selection
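As a minimal illustration of how such weights enter an estimator (not from the slides; all numbers invented), a weighted mean replaces each respondent's equal contribution with a weight-proportional one:

```python
import numpy as np

# Hypothetical toy data: observed monthly wages and survey weights,
# e.g. inverse selection probabilities. All values are invented.
wages = np.array([2100.0, 2800.0, 1900.0, 3500.0, 2400.0])
weights = np.array([1.4, 0.7, 1.6, 0.5, 1.0])

unweighted = wages.mean()
weighted = np.average(wages, weights=weights)  # weighted (Horvitz-Thompson-style) mean
print(f"unweighted: {unweighted:.0f}, weighted: {weighted:.0f}")
```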

Slide14

Calibration, post-stratification, raking etc.

Corrects mainly for socio-demographic differences between sample & target population (Loosveldt & Sonck 2008, Steinmetz et al. 2013)

Limited impact → corrects for proportionality but not necessarily for representativeness

Yeager et al. (2011): comparison of the average absolute error for 13 secondary demographics and non-demographics (weighted)
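Raking (iterative proportional fitting) is straightforward to sketch: weights are rescaled, one variable at a time, until the weighted category shares match known population margins. A minimal sketch, assuming invented data and margins (all names and numbers are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical toy sample with two categorical covariates.
sample = pd.DataFrame({
    "gender": ["f", "m", "m", "f", "m", "f", "m", "f"],
    "educ":   ["low", "low", "high", "high", "high", "low", "low", "high"],
})
# Assumed known population margins (shares), e.g. from official statistics.
pop_margins = {
    "gender": {"f": 0.50, "m": 0.50},
    "educ":   {"low": 0.60, "high": 0.40},
}

w = np.ones(len(sample))
for _ in range(50):  # iterate until the weighted margins converge
    for var, margins in pop_margins.items():
        # current weighted share of each category of this variable
        current = pd.Series(w).groupby(sample[var]).sum() / w.sum()
        # rescale every unit's weight by target share / current share
        factors = sample[var].map(lambda c: margins[c] / current[c])
        w = w * factors.to_numpy()

print(sample.assign(weight=w))
```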

Slide15

Propensity score adjustment (PSA)

Origin: experimental studies (Rosenbaum & Rubin, 1983), use of the propensity score for group comparisons

Basic idea: achieve representativity through a representative reference survey & model the self-selection of respondents into the web survey (PS = likelihood that a respondent participates in the web rather than the reference survey)

BUT: relies on strong requirements

Findings: non-conclusive (e.g. Valliant & Dever, 2011)
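In its simplest (‘ungrouped’) form, PSA stacks the web and reference samples, models membership in the web sample with a logistic regression, and weights web respondents by the inverse odds of participation. A minimal sketch with invented data (covariates and values are hypothetical; the grouped and trimmed variants used in practice differ in detail):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical stacked data: in_web = 1 for web-survey respondents,
# 0 for reference-survey respondents. All values are invented.
df = pd.DataFrame({
    "in_web": [1, 1, 1, 1, 0, 0, 0, 0],
    "age":    [25, 31, 28, 45, 52, 38, 61, 47],
    "female": [0, 1, 1, 0, 1, 0, 1, 1],
})

X = df[["age", "female"]]
# Estimated propensity of being in the web survey given the covariates.
ps = LogisticRegression().fit(X, df["in_web"]).predict_proba(X)[:, 1]

# Weight only the web respondents, by the inverse odds of participation:
# a low participation propensity yields a high weight.
web = df["in_web"] == 1
df.loc[web, "psa_weight"] = (1 - ps[web]) / ps[web]
print(df)
```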

Slide16

Unique data

- Dutch LW (Loonwijzer, the Dutch WageIndicator; non-probability, Oct.-Dec. 2009, N = 1693)
- LISS panel (probability-based, Oct. 2009, N = 1063)
- Population information (PI; CBS, 2009)

Both data sets (LISS & LW):
- Identical questionnaires & same mode (internet)
- Variety of 8 ‘webographic’ questions

Application:
- Allows a better exploration of sample biases (PI, LISS, LW)
- Several calibration weights (using PI) → 4 weights
- Several PSA weights → 12 weights

How selective is the data?

Slide17

Average relative differences between CBS & LW and between CBS & LISS

Variable           CBS-LW   CBS-LISS
Working time       0.61     0.46
Sector             0.30     0.15
Age                0.28     0.32
Education          0.27     0.28
Occupation         0.17     0.30
Gender             0.16     0.07
Type of contract   0.06     0.31
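One plausible reading of this measure (the slide does not spell out the formula) is the mean absolute relative difference between the survey's category shares and the known population shares; a toy sketch with invented shares:

```python
import numpy as np

# Hypothetical category shares for one variable (e.g. sector); invented numbers.
population = np.array([0.30, 0.45, 0.25])   # CBS shares
survey     = np.array([0.40, 0.35, 0.25])   # LW or LISS shares

# Mean absolute relative deviation of survey shares from population shares.
avg_rel_diff = np.mean(np.abs(survey - population) / population)
print(f"average relative difference: {avg_rel_diff:.2f}")
```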

Slide18

Bias: mean monthly gross wage

In comparison to the population, both surveys show wage bias!

Slide19

Weighting strategy

Linear weighting & ratio raking
- IDEA: assign weights such that the weighted sample ‘resembles’ the population (for selected covariates); if there is a strong relationship between covariates & target variable → estimates will improve!
- Variables: working time, sector, gender, education, age

Propensity score adjustment (PSA)
- LISS serves as reference survey (adjusted)
- PS = likelihood that a unit participates in LW rather than LISS
- IDEA: give higher weights to those who are less likely to participate in the LW

4 ‘traditional’ & 12 PS weights

Slide20

Results – adjustment of mean wages



Slide21

In sum

Use of PSA can help to reduce the biases of a volunteer web survey (mean monthly wage in LW).

But:
- the most efficient PS type (ungrouped) shows the greatest variability → requires further adaptations (trimming etc.)!
- detailed by groups, there are improvements for all covariates, but they do not work homogeneously within the PS weights
- the set of webographics does not increase efficiency.
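Trimming itself is a one-liner: extreme weights are capped, trading a little bias for a variance reduction. A sketch, assuming a hypothetical percentile cap (the slides do not specify one):

```python
import numpy as np

# Hypothetical PSA weights with a few extreme values (invented).
w = np.array([0.3, 0.8, 1.1, 6.5, 0.9, 9.2, 1.4])

# Cap weights at a chosen percentile (here the 90th, an assumption):
# reduces the variance inflation caused by the largest weights.
cap = np.percentile(w, 90)
w_trimmed = np.minimum(w, cap)
print(w_trimmed)
```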

Slide22

What have we learned?

To weight or not to weight, that is the question

Slide23

PSA can work if…
- we have a proper reference survey
- we have meaningful covariates (also webographics)
- we can exclude mode effects
- we have the same questionnaire
- we have ignorable non-response

Very strong requirements!

Slide24

Challenges

Requirements are rarely fulfilled!
- Success depends on pre-selection conditions
- Population information is difficult to access (one country!)
- Definition of the target variable
- Missings on core variables (systematic?)
- How to deal with a biased reference survey?
- Reduction of estimation biases often causes higher variance
- Attitudinal questions are less reliable (might depend on current circumstances, vary over time) → measurement error

Slide25

Is there a future for volunteer web surveys?

Possible solutions for representativeness:
- Improving weights (imputation, better model specification, complex weighting adjustments)
- Only mixed-mode surveys (time & cost reduction disappears)
- Non-representative use of volunteer web survey data (only for explorative analysis), OR
- Discuss the meaning of representativeness: survey quality is not absolute → assess the quality of non-probability samples (see AAPOR report 2013); transparency is important

Slide26

Representativeness of surveys (figure; source: Fabo, B. (2017), p. 47)

However, at the same time, the AAPOR Report on Online Panels (2010) states:

“There are times when a non-probability online panel is an appropriate choice. […] there may be survey purposes and topics where the generally lower cost and unique properties of Web data collection is an acceptable alternative to traditional probability-based methods.”

Slide27

Thank you for your attention!

Contact: s.m.steinmetz@uva.nl

For more information on WageIndicator: www.wageindicator.org

Slide28

Coffee Break

Let’s take a group picture & start networking!

Slide29

Session 2: Roundtable discussion

Ulrich Kohler & Sander Stijn

Slide30

References

Bandilla, W., Bosnjak, M. & Altdorfer, P. (2003). Survey administration effects? A comparison of web-based and traditional written self-administered surveys using the ISSP environment module. Social Science Computer Review, 21, 235-243.

Bethlehem, J. (2010). Selection bias in web surveys. International Statistical Review, 78, 161-188.

Bethlehem, J. & Stoop, I. (2007). Online panels - a paradigm theft? In M. Trotman et al. (Eds.), The challenges of a changing world (pp. 113-131). Proceedings of the 5th International Conference of the Association for Survey Computing. Southampton: Association for Survey Computing.

Duffy, B., Smith, K., Terhanian, G. & Bremer, J. (2005). Comparing data from online and face-to-face surveys. International Journal of Market Research, 47, 615-639.

Kruskal, W. & Mosteller, F. (1979). Representative sampling. International Statistical Review, 47.

Loosveldt, G. & Sonck, N. (2008). An evaluation of the weighting procedures for online access panel surveys. Survey Research Methods, 2, 93-105.

Rosenbaum, P. & Rubin, D. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41-55.

Schonlau, M., van Soest, A., Kapteyn, A. & Couper, M. (2009). Selection bias in web surveys and the use of propensity scores. Sociological Methods & Research, 37, 291-318.

Steinmetz, S., Raess, D., de Pedraza, P. & Tijdens, K. (2013). Measuring wages worldwide - exploring the potentials and constraints of volunteer web surveys (chapter 6, pp. 100-119). In Sappleton, N. (Ed.): Advancing Social and Business Research Methods with New Media Technologies. Hershey, PA: IGI Global.

Steinmetz, S., Bianchi, A., Biffignandi, S. & Tijdens, K. (2014). Improving web survey quality - potentials and constraints of propensity score weighting (chapter 12, pp. 273-298). In Callegaro, M., Baker, R., Bethlehem, J., Göritz, A., Krosnick, J. & Lavrakas, P. (Eds.): Online Panel Research: A Data Quality Perspective. Series in Survey Methodology. Wiley.

Taylor, H. (2005). Does internet research ‘work’? Comparing online survey results with telephone surveys. International Journal of Market Research, 42, 51-63.

Valliant, R. & Dever, J. (2011). Estimating propensity adjustments for volunteer web surveys. Sociological Methods & Research, 40, 105-137.

Yeager, D.S., Krosnick, J.A., Chang, L., Javitz, H.S., Levendusky, M.S., Simpser, A. & Wang, R. (2011). Comparing the accuracy of RDD telephone surveys and internet surveys conducted with probability and non-probability samples. Public Opinion Quarterly, 75, 709-747.