Presentation Transcript

Slide1

Matching of administrative data to validate the 2011 Census in England and Wales

NRS & RSS Edinburgh, October 2012

Slide2

AGENDA

Context: 2011 Census quality assurance and the role of administrative data

Data matching challenges and solutions

Data to be matched

Matching methods and interpretation

Substantive results so far . . .

Slide3

An overview of the methods

[Flow diagram; labels as extracted]

Product: 5 yr age/sex CCS areas; 5 yr age/sex EA/LA level; 1 yr age/sex OA level

Method: DSE; Bias adj; Overcount; Ratio estimator; Nat adj; Coverage imputation

Quality assurance: Supplementary analysis; Core checks; Main QA Panel; High Level QA Panel; First Release; QA Review and sign-off

Slide4

Challenges and solutions

Issue: Matching limited to small QA ‘window’
Solution: Match selected LAs ahead of QA

Issue: Some data not available in advance
Solution: Flexible data architecture so new sources can be added

Issue: Research questions only emerge during QA
Solution: Stratified approach to matching so the methods were tailored to the questions

Issue: Scale of matching task potentially huge
Solution: Initially restrict matching to CCS postcode clusters

Issue: One-to-many address matches
Solution: Revised address data architecture

Slide5

Data to be matched

Census

Post-out Address Register

Address Register History File

Census returns

‘Associated Address’ data

Census Management Information System

Non-Census

NHS Patient Register

Higher Education Statistics Agency (HESA) data

English and Welsh School Censuses

Electoral Registers

Valuation Office Agency data

Slide6

Methods

Data cleaning, de-duplication, standardisation, quality analysis

Definitional alignment with Census enumeration base

Exact matching (dwelling: address; person: name, DoB, gender and postcode)

Score-based address matching

Probabilistic person matching

Clerical resolution of candidate pairs from automatch

Clerical search for unmatched residuals

Resolution of unmatched residuals against the Address Register History File and Census ‘associated addresses’

Evidence-based assessment of residuals

Slide7
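The matching steps listed on the Methods slide can be illustrated with a minimal sketch. It is not the method used for the 2011 Census: the records, field names, agreement probabilities and thresholds below are all hypothetical, and the probabilistic step is a generic Fellegi-Sunter-style agreement weight chosen only to show how exact matching, automatic scoring and clerical review fit together.

```python
from math import log2

# Hypothetical records; field names and values are illustrative only.
census = [
    {"id": "C1", "name": "ANN SMITH", "dob": "1990-03-01", "sex": "F", "postcode": "AB1 2CD"},
    {"id": "C2", "name": "JON JONES", "dob": "1985-07-12", "sex": "M", "postcode": "AB1 2CD"},
]
patient_register = [
    {"id": "P1", "name": "ANN SMITH", "dob": "1990-03-01", "sex": "F", "postcode": "AB1 2CD"},
    {"id": "P2", "name": "JOHN JONES", "dob": "1985-07-12", "sex": "M", "postcode": "AB1 2CD"},
    {"id": "P3", "name": "ZOE BROWN", "dob": "1971-11-30", "sex": "F", "postcode": "ZZ9 9ZZ"},
]

KEY = ("name", "dob", "sex", "postcode")


def exact_match(left, right):
    """Pair records whose full key agrees, accepting only unique key matches."""
    index = {}
    for rec in right:
        index.setdefault(tuple(rec[f] for f in KEY), []).append(rec["id"])
    pairs = []
    for rec in left:
        ids = index.get(tuple(rec[f] for f in KEY), [])
        if len(ids) == 1:
            pairs.append((rec["id"], ids[0]))
    return pairs


# Assumed per-field probabilities: m = P(agree | same person),
# u = P(agree | different people). These values are invented for the sketch.
M_U = {"name": (0.9, 0.01), "dob": (0.95, 0.001), "sex": (0.98, 0.5), "postcode": (0.9, 0.001)}
UPPER, LOWER = 20.0, 0.0  # assumed automatch / non-match thresholds


def match_weight(a, b):
    """Sum of log2 agreement or disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if a[field] == b[field]:
            total += log2(m / u)
        else:
            total += log2((1 - m) / (1 - u))
    return total


def classify(a, b):
    """Route a candidate pair to automatch, clerical review, or non-match."""
    w = match_weight(a, b)
    if w >= UPPER:
        return "automatch"
    if w > LOWER:
        return "clerical"
    return "non-match"


# Step 1: exact matching; step 2: score the residuals left unmatched.
exact = exact_match(census, patient_register)
matched_census = {c for c, _ in exact}
matched_pr = {p for _, p in exact}
candidates = {
    (c["id"], p["id"]): classify(c, p)
    for c in census if c["id"] not in matched_census
    for p in patient_register if p["id"] not in matched_pr
}
```

In this toy data the first pair matches exactly, the misspelt name leaves the second pair between the two thresholds so it is routed to clerical review, and the remaining comparison falls below the lower threshold.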

Interpretation: Who is actually present?

Non-URs

Census non-usual residents (matched and unmatched to PR)

PR records unmatched to Census respondents and assessed as not present

Matched to address deactivated in the field

Matched to unoccupied or vacant/absent/2nd residence dummy

Matched to ARHF invalid address

UR elsewhere, this is Usual Address 1 Year Ago

Matched to Census UR elsewhere

Unaccounted

Unmatched and unaccounted for

PR records unmatched to Census respondents and assessed present

PR matched to Census missed/unaccounted-for address

PR matched to address with ‘occupied’ dummy

PR validated through other administrative sources

PR/Census confirmed URs

PR/Census matched records

Census URs unmatched to PR

Slide8
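The outcome groups on the "Interpretation" slide amount to a rule-based classification of Patient Register (PR) records. The sketch below shows one way such rules could be expressed. The flag names and the order in which the rules are applied are assumptions made for illustration; the slides do not specify how the assessment was implemented.

```python
from collections import Counter

# Hypothetical evidence flags, loosely named after the slide's categories.
NOT_PRESENT_FLAGS = (
    "address_deactivated_in_field",
    "unoccupied_or_second_residence_dummy",
    "arhf_invalid_address",
    "matched_census_ur_elsewhere",
)
PRESENT_FLAGS = (
    "census_missed_address",
    "occupied_dummy",
    "validated_by_other_admin_source",
)


def classify_pr_record(rec):
    """Assign one PR record to an outcome group (rule order is an assumption)."""
    if rec.get("matched_to_census_ur"):
        return "PR/Census confirmed UR"
    if any(rec.get(flag) for flag in NOT_PRESENT_FLAGS):
        return "assessed as not present"
    if any(rec.get(flag) for flag in PRESENT_FLAGS):
        return "assessed present"
    return "unaccounted"


# Illustrative records only; no real data.
pr_records = [
    {"id": "P1", "matched_to_census_ur": True},
    {"id": "P2", "arhf_invalid_address": True},
    {"id": "P3", "occupied_dummy": True},
    {"id": "P4"},
]
outcomes = Counter(classify_pr_record(r) for r in pr_records)
```

Tallying the outcomes by sex and age group would give the kind of breakdown that the later "female outcomes" and "male outcomes" slides present as charts.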

Match rates in a ‘control’ LA

Slide9

Female outcomes in a ‘control’ LA

Slide10

Male outcomes in a ‘control’ LA

Slide11

Match results in university towns

Slide12

University town: female outcomes

Slide13

University town: male outcomes

Slide14

London: population churn

Slide15

London churn: female outcomes

Slide16

London churn: male outcomes

Slide17

London LA: implied sex ratios

Slide18

Data mining to address specific Census/PR anomalies

University

Hall of Residence

GP registrations/Hall capacity

Slide19

Female students living in halls in April 2011, by NHS Authority acceptance date

Slide20

Male students living in halls in April 2011, by NHS Authority acceptance date

Slide21

LA summary: proportion of F4s and proportion unresolved, within CCS postcode clusters

Slide22

LA summary: concentration of Flag 4s in the PR residual

Slide23

LA summary: LA types, residual size and Flag 4s

Slide24

Further investigations

Planned analysis of the PR residuals’ addresses and households to identify ‘ghost’ records

Longitudinal matching of the 2012 Patient Register to 2011 data to identify registrations that have been cancelled by GP practices in the year following Census

Cluster analysis of all E&W LAs to see whether the typology of LAs identified through matching is mirrored in list inflation patterns nationally

Multi-level modelling to summarise results, with individual and area level explanatory variables
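The planned longitudinal matching could, in its simplest form, compare the 2011 PR residual with the 2012 register and flag records that no longer appear. The sketch below assumes a stable person identifier (here a hypothetical `nhs_id` field) and invented records. Absence from the later register is only a candidate indicator of a cancelled registration, since a record can also leave the register through death or emigration.

```python
from collections import Counter

# Hypothetical 2011 PR residual (records unmatched to Census) and 2012 register IDs.
residual_2011 = [
    {"nhs_id": "N1", "la": "LA-A"},
    {"nhs_id": "N2", "la": "LA-A"},
    {"nhs_id": "N3", "la": "LA-B"},
    {"nhs_id": "N4", "la": "LA-B"},
]
register_2012_ids = {"N1", "N3", "N4"}


def cancelled_candidates(residual, later_ids):
    """Residual records absent from the later register: candidate cancellations."""
    return [rec for rec in residual if rec["nhs_id"] not in later_ids]


candidates_2012 = cancelled_candidates(residual_2011, register_2012_ids)

# Share of each LA's residual that has disappeared a year later.
residual_by_la = Counter(rec["la"] for rec in residual_2011)
cancelled_by_la = Counter(rec["la"] for rec in candidates_2012)
cancelled_share = {la: cancelled_by_la[la] / n for la, n in residual_by_la.items()}
```

Per-LA shares of this kind could then serve as one of the area-level variables in the cluster analysis or multi-level modelling mentioned above.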