Slide1
24 April 2007
QUEST2007: Statistics Canada, Ottawa, Canada
The Utility of Metadata for Questionnaire Design and Evaluation
Jim Esposito
Bureau of Labor Statistics
Washington, DC
Disclaimer:
The views and opinions expressed in this presentation are those of the presenter/author and not necessarily those of the Bureau of Labor Statistics or the Bureau of the Census.
Slide2
Objectives of Presentation
To draw attention to the concept of metadata and to its scope and relevance
To describe a case study involving the measurement of work/employment that illustrates the utility of metadata in evaluating and designing questionnaire items
Slide3
Metadata: An Informal Definition
Metadata can be defined as any information (verbal, numeric or code; qualitative or quantitative) that provides context for understanding survey-generated data:
Domain-specific/ethnographic information
Concepts and question objectives
Questionnaire items and administration modes
Instructional materials
Pre- and post-survey evaluation research
Classification algorithms
Slide4
The Measurement of Labor-Force Status via the CPS
Current Population Survey [CPS]
Official source of LF statistics in USA (e.g., monthly unemployment rate)
CPS measures work, not jobs
60,000 households a month
Principal LF categories: Employed [EMP], unemployed [UE], not-in-the-labor-force [NILF]
Employed: Work for pay, one hour or more; unpaid work in family business, 15 hours or more; job (but absent last week)
Data collected monthly via two modes [face-to-face and telephone CAPI; centralized CATI]
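The "Employed" criteria listed on this slide can be read as a simple classification rule. The following is a minimal sketch of that rule only, not the CPS classification algorithm itself (which draws on about 16 questionnaire items). The class name `LastWeek`, its field names and the function `is_employed` are hypothetical, introduced purely for illustration. The UE/NILF distinction is not modeled because the slides do not define it.

```python
from dataclasses import dataclass


@dataclass
class LastWeek:
    """Hypothetical summary of one person's activity during the reference week."""
    paid_hours: float = 0.0                    # hours of work for pay
    unpaid_family_business_hours: float = 0.0  # unpaid hours in a family business
    has_job_but_absent: bool = False           # holds a job but was absent last week


def is_employed(week: LastWeek) -> bool:
    """Apply the three 'Employed' conditions as summarized on the slide."""
    return (
        week.paid_hours >= 1                         # work for pay, one hour or more
        or week.unpaid_family_business_hours >= 15   # unpaid family work, 15 hours or more
        or week.has_job_but_absent                   # job, but absent last week
    )


if __name__ == "__main__":
    print(is_employed(LastWeek(paid_hours=1)))                     # marginal paid work
    print(is_employed(LastWeek(unpaid_family_business_hours=10)))  # below the 15-hour bar
```

The sketch makes one point visible: a single hour of paid work is enough to count as employed, which is why a question that misses marginal work activity matters for the estimates.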
Slide5
CPS: Some Relevant Details
The CPS was redesigned in the early 1990s; the redesign used a multiple-method evaluation plan (e.g., behavior coding, interviewer and respondent debriefings, split-ballot design) and generated a substantial amount of metadata
The CPS relies on about 16 questionnaire items to generate estimates for its three major labor force categories: EMP, UE and NILF (and various subcategories)
Again, CPS measures work, not jobs
Slide6
The Measurement of Employment Status via the ACS
American Community Survey [ACS]
Largest survey conducted in the USA; will replace the Decennial Census “long form”
250,000 households a month
Collects data on a broad range of demographic topics (e.g., population, housing, disability status, employment status, educational attainment, health insurance)
Adheres to BLS employment concept with the same three major categories: EMP, UE and NILF
Data collected continuously via three modes [SAQ (66%), CATI and face-to-face CAPI]
Slide7
ACS: Some Relevant Details
The ACS was developed over a series of stages (starting in the early 1990s) and achieved full implementation in 2005; there is a substantial amount of metadata documenting this process
At present, the ACS relies on the content of six CPS items (modified for use in the ACS) to generate its estimates for three employment status categories: EMP, UE and NILF
Because of methodological/procedural differences, the CPS and the ACS cannot be expected to produce identical estimates
Slide8
CPS: Work Item and DQ Issues [1]
CPS Work Question
[No-business-in-household wording.]
LAST WEEK, did you do ANY work for pay?
Data Quality [DQ] Issues, CPS Redesign
Final evaluation phase (1992-93): Interviewers rated this item as one of the more problematic questions on the redesigned CPS, reporting that respondents asked for clarification (e.g., “Just my job?”; “Do you mean my regular job?”)
On the basis of other evaluation data (e.g., behavior-coding and response-distribution analyses), these “reports” by respondents were determined not to represent a serious data-quality issue because of the likelihood of interviewer mediation and “repair work”
Slide9
CPS: Work Item and DQ Issues [2]
Data Quality Issues (continued)
Respondent debriefing data indicated that this question did miss some marginal/paid work activity (1.6%): “In addition to people who have regular jobs, we are also interested in people who may only work a few hours per week. Last week, did [name] do any work at all, even for as little as one hour?”
The evaluation work conducted during the redesign was documented extensively by Census Bureau and BLS researchers in the 1990s (e.g., conferences; papers; book chapter); however, much of this work is not cited in ACS evaluation documents
Slide10
ACS: Work Item and DQ Issues [1]
Current ACS
LAST WEEK, did this person do ANY work for either pay or profit?
Mark (X) the “Yes” box even if the person worked only 1 hour, or helped without pay in the family business or farm for 15 hours or more, or was on active duty in the Armed Forces.
Data Quality Issues
ACS underestimates employment (which compromises estimates in the other two categories, UE and NILF); see next slide
Slide11
CPS vs. C2000/ACS Estimates
CPS/Census-2000 Match Study
“Combined-Month Sample”: February through May 2000, specific rotations; ~86,000 addresses; weighted N: 207,875,749
CPS vs. ACS-like employment status items
EMP: 64.1% vs. 62.3% (underestimate)
UE: 2.7% vs. 3.4% (overestimate)
NILF: 32.8% vs. 34.0% (overestimate)
Note: The employment status items from the Census-2000 long form are identical to those used in the current ACS.
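The under/overestimate labels on this slide follow from simple differences between the two columns. As a rough illustration, the sketch below computes each gap in percentage points from the figures quoted on the slide, treating the CPS figure as the benchmark. The names `match_study` and `gaps` are hypothetical, and the calculation says nothing about sampling error or statistical significance.

```python
# Percentages as quoted on the slide: (CPS, ACS-like items), CPS/Census-2000 Match Study.
match_study = {
    "EMP": (64.1, 62.3),
    "UE": (2.7, 3.4),
    "NILF": (32.8, 34.0),
}


def gaps(estimates):
    """Return the ACS-like minus CPS difference, in percentage points, per category."""
    return {
        category: round(acs_like - cps, 1)
        for category, (cps, acs_like) in estimates.items()
    }


if __name__ == "__main__":
    for category, gap in gaps(match_study).items():
        direction = "underestimate" if gap < 0 else "overestimate"
        print(f"{category}: {gap:+.1f} points ({direction} relative to CPS)")
```

Read this way, the shortfall in EMP reappears as excess in UE and NILF, which is the sense in which an employment undercount compromises the other two categories.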
Slide12
ACS: Work Item and DQ Issues [2]
Data Quality Issues
Small-scale evaluation [2004]: Expert reviews; behavior coding; focus groups with ACS interviewers
Behavior coding [CATI site; 51 HHs; 104 persons]:
INT codes: exact (78%); major changes (10%); data due in part to prior context [disability questions]
RSP codes: adequate answers (98%); other than simple yes or no (21%); examples (e.g., “For pay, yes.”; Just his “regular job.”; “No, currently unemployed.”)
Read-if-Necessary Statement: Never read
Focus groups: “pay or profit” confusing; multiple-job holders and self-employed (e.g., “Did you mean, other than my regular job?”); read-if-necessary statement rarely read; some interviewers ask about job directly
Slide13
ACS: Revised Work Items
Revisions to ACS Work Question
(1A): LAST WEEK, did this person work for pay at a job (or business)?
[If “no” to 1A, ask (1B).]
(1B): LAST WEEK, did this person do ANY work for pay, even for as little as one hour?
Rationale
Current ACS work question confuses some respondents: Why?
Using a two-part question appears to clarify the response task for some respondents and, in so doing, better achieves the objective of gathering accurate data on work activity and employment status
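The skip pattern in the revised items (ask 1B only after a “no” to 1A) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the ACS instrument: the function `administer_work_items` and the `ask` callback are hypothetical, and only the two question wordings are taken from the slide.

```python
from typing import Callable

# Question wordings as given on the slide.
Q1A = "LAST WEEK, did this person work for pay at a job (or business)?"
Q1B = "LAST WEEK, did this person do ANY work for pay, even for as little as one hour?"


def administer_work_items(ask: Callable[[str], bool]) -> bool:
    """Return True if any paid work last week is reported.

    `ask` is a hypothetical callback that presents a question and
    returns True for "yes" and False for "no".
    """
    if ask(Q1A):      # a "yes" to 1A settles the matter; 1B is skipped
        return True
    return ask(Q1B)   # 1B probes for marginal work only after a "no" to 1A


if __name__ == "__main__":
    # A respondent with no regular job who did a few hours of paid work.
    scripted_answers = {Q1A: False, Q1B: True}
    print(administer_work_items(lambda question: scripted_answers[question]))
```

The design choice is that respondents with a regular job answer a question that names a “job” directly, while the one-hour probe is reserved for those who would otherwise be missed.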
Slide14
Estimates of Labor-Force/Employment Status
2006 ACS Content Test
January-March 2006; ~63,000 addresses, equally split between control/current vs. test/revised groups
Current vs. revised ACS items
EMP: 62.8% vs. 65.7% (plus 2.9%)*
UE: 4.1% vs. 3.6% (minus 0.5%)
NILF: 33.1% vs. 30.7% (minus 2.4%)*
Revised items manifest less bias and variability, as well
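Because EMP, UE and NILF partition the population in this table, each column should total 100% and the three shifts should cancel out. The sketch below checks that arithmetic using only the figures quoted on the slide. The names `content_test`, `column_total` and `shifts` are hypothetical, and the check does not address the asterisks or any significance testing.

```python
# Percentages as quoted on the slide: (current items, revised items), 2006 ACS Content Test.
content_test = {
    "EMP": (62.8, 65.7),
    "UE": (4.1, 3.6),
    "NILF": (33.1, 30.7),
}


def column_total(estimates, index):
    """Sum one column (0 = current, 1 = revised), rounded to one decimal place."""
    return round(sum(pair[index] for pair in estimates.values()), 1)


def shifts(estimates):
    """Return the revised minus current difference, in percentage points, per category."""
    return {
        category: round(revised - current, 1)
        for category, (current, revised) in estimates.items()
    }


if __name__ == "__main__":
    print("current total:", column_total(content_test, 0))
    print("revised total:", column_total(content_test, 1))
    for category, shift in shifts(content_test).items():
        print(f"{category}: {shift:+.1f} points")
    print("net shift is zero:", abs(sum(shifts(content_test).values())) < 1e-9)
```

The gain in EMP under the revised items is offset by the combined declines in UE and NILF, so the revision reallocates respondents across categories.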
Slide15
The CPS Work Item: Why might it be problematic for some respondents?
Grice (1975): Maxims on Quantity
1. Make your contribution as informative as is required (for the current purposes of the exchange).
2. Do not make your contribution more informative than is required.
Fowler (1995): Principles 3 and 3d.
Principle 3: A survey question should be worded so that every respondent is answering the same question.
Principle 3d: If what is to be covered is too complex to be included in a single question, ask multiple questions.
Slide16
Invoking Grice on Quantity: Hypothetical Example [ACS/SAQ]
LAST WEEK, did you do ANY work for pay?
Respondent [full-time job]:
How should I answer this [#!&?@] question? It doesn’t mention a “job” and probably would if that’s what they wanted to know. And it specifically says “work for pay”, so it must mean doing work on the side. OK, just check the “no” box.
Reference to a “job” is missing. [Maxim 1]
“Work for pay” is specified, which would seem superfluous (especially for someone with a full-time job): Who works all those hours for free? [Maxim 2]
Slide17
Resolution for ACS: Two-Part Work Item
Revisions to ACS Work Question
(1A): LAST WEEK, did this person work for pay at a job (or business)?
[If “no” to 1A, ask (1B).]
(1B): LAST WEEK, did this person do ANY work for pay, even for as little as one hour?
Part (1A) specifically mentions “job”, “work for pay” and “business”.
Part (1B) captures work for “as little as one hour”.
Not perfect, but better than current ACS item.
Slide18
Closing Remarks
Even survey questions that appear simple and straightforward may not be for some respondents. [Key issues: Why, and how many respondents are affected?]
It is risky to import questions from one survey to another, especially when the surveys differ in terms of mode of administration (and in various other ways, too).
In evaluating and “fixing” questionnaire items, quantitative research, alone, is not sufficient.
Summary:
Our best hope for optimizing data quality (i.e., minimizing measurement error) is a thorough and critical review of relevant metadata, followed by prudent design and evaluation decisions that are informed by such reviews.
Slide19
Thank You
Questions or comments?
Post-workshop: Esposito.Jim@bls.gov