Uploaded On 2016-05-25

Slide1

Using Administrative Data to Enhance Longitudinal Research

Lorraine Dearden

Director, ADMIN

Institute of Education

Email: l.dearden@ioe.ac.uk

NILS Research Forum

Belfast

22 October 2010

Slide2

Introduction

In the current economic climate, using and linking administrative data is very important for policy analysis

Scope for well-funded longitudinal surveys is going to be put under pressure

Also, for countries like NI, sample sizes in survey data are not always satisfactory

NILS is a very welcome addition for researchers

Indeed, colleagues at ADMIN are using it to look at issues to do with health and migration

But it is limited in the range of issues it can address, and could be significantly enhanced with other administrative data

Slide3

Why so important to make better use of Administrative Data?

Administrative data has already been collected for administrative purposes, so the money has been spent

But the potential it offers those interested in giving sound policy advice is immense if used correctly

It potentially allows one to follow multiple cohorts over time (longitudinal data), which is something survey data can rarely do

Sample size issues generally disappear, which is very important when doing within-country analysis

Slide4

So why hasn’t it happened?

Fears over data protection...

But this is always an issue with any individual-level data, and instances of researchers using data inappropriately are virtually unheard of

The individual-level data is highly disclosive, but researchers never look at or report anything that is disclosive

But it is essential that this information is in their data at the individual level

Major issues around disclosure and data protection have centred on the agencies holding the administrative data

Slide5

So how far have we got on this?

There are various LS, with Scotland the most advanced in terms of linkage (including linkage to schools data)

Serious discussions in government about whether Censuses could be replaced by linking administrative data

So politicians and policy makers are talking about it

Certain departments in Whitehall have started linking administrative data sets for internal use (ONS), whereas others have linked data for research projects carried out for them (e.g. DWP), and yet others for general research purposes (DfE and BIS)

Slide6

Another important development

There is increasing linkage of survey data to Administrative data where consent has been obtained from the individuals in the survey

Longitudinal Survey of Young People in England (linked to NPD data)

MCS (and ALSPAC) linked to hospital registration data and NPD data, and now have permissions to link to Hospital Episodes Data, economic data held by DWP and HMRC (for both parents), as well as NPD data for all siblings of CM

ELSA has linked to health and economic data, and NCDS and MCS are about to do this as well

Innovation Panel of Understanding Society will do this in a few years, with the hope of rolling it out to the full sample

Slide7

Why is this important?

New linked admin/longitudinal data has potential to:

Get a better understanding of the implications of missing covariates in administrative data, which is crucial if we are going to rely more on administrative data linkage

Get a better understanding of the implications of attrition and non-response in survey data

Allow us to understand the implications and extent of recall bias in surveys

Reduce the costs of longitudinal survey data

Slide8

So what administrative data is there?

Some, like data on school children, is country specific

Others, like HESA (Higher Education), DWP and HMRC data, cover all of Great Britain

Now going to talk a bit about what is out there in terms of administrative data...

Slide9

New longitudinal HE admin data

Linked individual-level administrative data

School (NPD), FE (ILR/NISVQ) and HE (HESA) records

Data on participants AND non-participants in HE

Four cohorts:

In Year 11 in 2001-02, 2002-03, 2003-04 and 2004-05

Potential age 18/19 HE entry in 2004-05, 2005-06, 2006-07, 2007-08 (or age 19/20 entry in 2005-06, 2006-07 and 2007-08)

State and private school students

Slide10
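Note on the cohort definitions in "New longitudinal HE admin data": the arithmetic linking Year 11 cohorts to potential HE entry years can be sketched as below. This is only an illustration. It assumes that age 18/19 entry falls three academic years after Year 11 and age 19/20 entry four years after (consistent with 2001-02 mapping to 2004-05), that the fourth cohort is 2004-05, and that 2007-08 is the last HE year observed. The function and constant names are made up for the example.

```python
def academic_year(start: int) -> str:
    """Format an academic year from its starting calendar year, e.g. 2001 -> '2001-02'."""
    return f"{start}-{(start + 1) % 100:02d}"


# Assumed: the four Year 11 cohorts, identified by the year the academic year starts.
YEAR11_STARTS = [2001, 2002, 2003, 2004]
# Assumed: last HE entry year covered by the linked data (2007-08).
LAST_HE_START = 2007


def he_entry_years(year11_start: int) -> dict:
    """Return the potential HE entry years observable for one Year 11 cohort."""
    entry = {}
    for label, lag in (("age 18/19", 3), ("age 19/20", 4)):
        start = year11_start + lag
        if start <= LAST_HE_START:  # later entry years are not in the data
            entry[label] = academic_year(start)
    return entry


for cohort in YEAR11_STARTS:
    print(academic_year(cohort), he_entry_years(cohort))
```

Under these assumptions the youngest cohort can only be observed entering HE at age 18/19, which matches the three age 19/20 entry years listed on the slide.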

Data

Socio-economic background

Free school meals status from PLASC

IMD quintiles based on home postcode (age 16)

Gender, MOB and school ID available for all

Ethnicity, EAL, SEN from PLASC

Missing for private school kids

Neighbourhood measure of parental education based on 2001 Census

Based on home postcode for state school analysis

Based on school postcode when private school kids are included

Slide11

Data

Prior attainment

State school:

Average point score at Key Stage 2, 3, 4 and 5 (plus indicators of reaching expected level at Key Stage 4 and 5)

Private school:

Key Stage 4 and 5 results only

Slide12

Integrated administrative data set

School data

Census of school children with individual characteristics of all pupils, e.g. gender, ethnicity

Prior achievement from age 11 through to 18

Individual Learner Record

FE college attended

Participation and qualifications achieved

Higher Education data

Detailed information on degree subject, institution and degree class awarded for all those participating in HE

Slide13

Destinations of Leavers from Higher Education survey (DLHE)

Early DLHE Survey (surveys graduates 6 months out of university): only a preliminary snapshot of graduate success

In 2006, HESA carried out a follow-up to the Early DLHE Survey: the Longitudinal DLHE, 3 years after graduation

Contains full details of HE plus wages/occupation 3 years after graduation

Slide14

Longitudinal DLHE

Can tell us early value of degrees

By subject

By institution

Possibly by subject and institution (subject to sample size)

Data essentially owned by universities, so would need their permission to do this

Slide15

What data is included within NPD?

Key Stage 1 Results
Keys: PupilID, Academic Year, Lea/Estab

Key Stage 2 Results
Keys: PupilID, Academic Year, Lea/Estab

Key Stage 3 Results
Keys: PupilID, Academic Year, Lea/Estab

Key Stage 4 Candidate
Keys: PupilID, Academic Year, Lea/Estab

Key Stage 5 Candidate
Keys: PupilID, Academic Year, Lea/Estab

Foundation Stage Profile
Keys: PupilID, Academic Year, Lea/Estab

Schools census (formerly PLASC)
Keys: PupilID, Academic Year, Lea/Estab, Pupil postcode

Key Stage 4 Results

Key Stage 4 Indicators

Key Stage 5 Indicators

Key Stage 5 Results

Information Learner Record - Aims
Keys: PupilID, Academic Year, Lea/Estab

Year 7 Progress Test Results
Keys: PupilID, Academic Year, Lea/Estab

Core Pupil
Keys: PupilID, Academic Year, Lea/Estab, Pupil postcode

Slide16

Slide17
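Note on the NPD tables listed under "What data is included within NPD?": because most tables share the keys PupilID, Academic Year and Lea/Estab, they can in principle be linked record by record. The sketch below shows one way to do a left join on those keys using plain Python dictionaries. The field names (`AcademicYear`, `LeaEstab`, `ks4_points`, `fsm`) and the records are invented for illustration and are not the real NPD variable names or data.

```python
from collections import defaultdict

# Assumed spellings of the shared key fields.
KEYS = ("PupilID", "AcademicYear", "LeaEstab")


def link_on_keys(left, right, keys=KEYS):
    """Left-join two lists of dict records on the shared key fields.

    Every record in `left` is kept; `_matched` flags whether a partner
    record was found in `right`.
    """
    index = defaultdict(list)
    for row in right:
        index[tuple(row[k] for k in keys)].append(row)

    linked = []
    for row in left:
        matches = index.get(tuple(row[k] for k in keys), [None])
        for match in matches:
            merged = dict(row)
            if match is not None:
                merged.update({k: v for k, v in match.items() if k not in keys})
            merged["_matched"] = match is not None
            linked.append(merged)
    return linked


# Fictional records, for illustration only.
ks4 = [
    {"PupilID": "P1", "AcademicYear": "2003-04", "LeaEstab": "000/0001", "ks4_points": 46.0},
    {"PupilID": "P2", "AcademicYear": "2003-04", "LeaEstab": "000/0001", "ks4_points": 38.5},
]
census = [
    {"PupilID": "P1", "AcademicYear": "2003-04", "LeaEstab": "000/0001", "fsm": True},
]

linked = link_on_keys(ks4, census)
for record in linked:
    print(record)
```

Keeping unmatched records with a flag, rather than dropping them, makes it easy to count how many pupils fail to link before any analysis is run.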

Main fixed pupil characteristics from School Census

Main indicators:

Sex of child

Age (month of birth is standard release)

Ethnic group

English as an additional language

Are they time-invariant?

We might collect several measures of each, e.g. one from each of the KS4, KS2 and KS1 sweeps, plus up to nine years of Pupil Census reports from schools

We think of these characteristics as fairly time-invariant, yet they vary for a tiny minority of children

You can place greatest weight on the most recent report, or alternatively on the modal report of the characteristic

Slide18
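Note on "Main fixed pupil characteristics from School Census": the two rules mentioned there for reconciling conflicting reports (most recent report versus modal report) can be sketched as follows. The data layout, a list of (year, value) pairs per pupil, and the tie-breaking rule for the mode are assumptions made for this example only.

```python
from collections import Counter


def most_recent(reports):
    """reports: list of (year, value) pairs; return the value from the latest year."""
    return max(reports, key=lambda r: r[0])[1]


def modal(reports):
    """Return the most frequently reported value.

    Ties are broken in favour of the most recently reported value
    (an assumption of this sketch, not a rule from the slides).
    """
    counts = Counter(value for _, value in reports)
    top = max(counts.values())
    tied = {value for value, n in counts.items() if n == top}
    latest_first = sorted(reports, key=lambda r: r[0], reverse=True)
    return next(value for _, value in latest_first if value in tied)


# Fictional reports of one characteristic for one pupil, coded "A" or "B".
reports = [(2001, "A"), (2002, "A"), (2003, "B")]
print(most_recent(reports))  # B
print(modal(reports))        # A
```

The two rules disagree exactly for the small minority of pupils whose recorded characteristic changes, so it is worth checking how many pupils are affected before choosing one.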

Time-variant pupil characteristics

FSM eligible

SEN

Postcode, LLSOA, IDACI rank

Connexions, gifted and talented (variable school recording of this)

Mode of travel (new)

Part-time, boarder

Slide19

Obtaining geo-classifications for home addresses

Standard release:

DCSF will release a lower level super output area to indicate where the child lives

LLSOA: geographical area with a minimum population of 1,000, nested within census ward boundaries

Secure release:

DCSF will release child’s home postcode to researchers who make a case for it and can show data will be held securely

Home postcode: geographical area with an average of 11 households, giving a relatively precise (within 100m) geo-location

WILL NOT release if you just want to attach geo-data to the postcode (they will do this for you)

WILL NOT release if you just want to calculate home-school distances, find the nearest school etc. (they will do this for you)

Slide20

Access to NPD data

Most researchers can access this data

Have to outline their research question and the data they need, make a case for any special additional variables that are thought to be disclosive (e.g. date of birth, postcode), and provide evidence that data will be held securely (never on laptop or desktop etc.)

Data is transferred via an encrypted electronic transfer

If you want to use the data for a new research project, you need to approach DfE again before using it

Slide21

NI Schools Data

Have similar data, though results data is not as detailed: basic outcomes at KS2, KS4 and KS5

Census data is comparable and in some cases richer

But have potential to link this to HESA data and graduate destinations survey as well

Slide22

Access to linked HESA/NPD data

This access occurs through BIS, who have done the linkage

Again need to outline research question and make case for data

Again transfer is via encrypted electronic transfer (FTP site), and host organisation has to demonstrate it has secure facilities where data will be kept

Slide23

DWP and HMRC data: WPLS

The DWP has linked all DWP benefit and program participants to HMRC employment and earnings data (from P14 returns) since 1998

This is called the WPLS (Work and Pensions Longitudinal Study)

Permission to link this to the FRS, NCDS, MCS and ELSA surveys as well (consent obtained from individuals in these surveys)

A summary of its uses can be found here: http://statistics.dwp.gov.uk/asd/longitudinal_study/WPLS_Uses.pdf

Slide24

WPLS

Researchers have had access to this data when carrying out work/evaluations for DWP

What the data does not include is HMRC records for individuals who have not been on a DWP program or benefit, so not as good as it could be...

But surveys that have sought permission to link to DWP and HMRC data can link to this additional HMRC data (e.g. FRS, ELSA, NCDS and MCS)

Collecting data on benefit receipt is typically difficult to do in surveys, so this linkage is extremely valuable and saves on survey time costs

This data covers the whole of Great Britain, not just England

Slide25

HMRC NIC data

HMRC has records on individual NI contributions since NI was introduced in 1948

Originally only a 1% sample was held electronically, but now all of these records are held electronically by HMRC

The English Longitudinal Study of Ageing (ELSA) has linked all individuals in its survey who gave consent for linkage to this NIC data, which means they have earnings and employment history for their sample from 1948

Up until recent changes in NI for those above the UEL, earnings above the UEL are not known, but this is a reasonably small proportion for most time periods and is no longer an issue

This data is going to be linked to NCDS and MCS (where consent rates were in excess of 80%)

Slide26

Other data

GP registration data (NILS at forefront here)

Hospital Episodes Data

Home Office data on crimes (have individual level information)

Birth, marriages and death registration data (NILS again at forefront here)

Slide27

How has this linked ADMIN data been used by researchers?

Going to shamelessly focus on some of the work I have done with this data

Not always successful, as I will demonstrate, and this linked administrative data is not always up to the research task

But has great potential to answer lots of policy-relevant questions

Slide28

Widening participation in HE

Joint work with Chowdry, Crawford, Goodman and Vignoles

Shows that prior school attainment is the main reason for the large gap between rich and poor in:

HE participation

Participation in a ‘high status’ university

Suggests HE funding reforms are not the best tool for addressing social mobility/‘access’ issues

Focus instead must be on improving school attainment amongst poor children

Uses linked school, FE and HE administrative data to assess schooling roots of large SEP gap

Slide29

Widening participation in HE

Slide30

Month of birth effects

Joint work with Crawford and Meghir

Children born in September start school aged 5, whereas those born in August are almost a year younger

Does this impact on longer-term educational outcomes?

Used same linked data to look at this question

Found being born in August has a prolonged impact on educational outcomes and even reduces the probability of entering HE

Slide31

Raw differences (proportion getting expected level)

Slide32

Summary of findings

August-born children experience significantly poorer education outcomes than September-born children

Almost entirely due to differences in the age at which they sit the tests

Starting school earlier/having more terms of school is marginally better for August-born children at younger ages

Slide33

Ethnic Parity in JCP services in UK?

Joint work with Crawford, Mesnard, Shaw and Sianesi at IFS

Ethnic parity:

No difference on average between Ethnic Minority and “otherwise identical” White entering the same JCP office and accessing same program/benefit

Our aim:

Get as close as possible to “otherwise identical” White and see what difference remains

Calculate results for a range of JCP benefits and programs

Slide34

Programs and Benefits

Incapacity benefit (IB): paid to individuals who are assessed as being incapable of work and who meet certain National Insurance contributions conditions.

Income support (IS): a benefit for individuals on low income; usually claimants are lone parents, sick or disabled, or carers.

Jobseeker’s allowance (JSA): a benefit paid to individuals of working age who are unemployed, or who work fewer than 16 hours per week and are looking for full-time work.

New Deal for Lone Parents (NDLP): a voluntary programme whose aim is to encourage lone parents to improve their work prospects and help them into work.

New Deal for individuals aged 25 plus (ND25plus): a programme to help unemployed individuals aged 25 and over to find and keep a job. Participation is compulsory for individuals who have been claiming JSA for at least 18 of the previous 21 months.

New Deal for Young People (NDYP): similar to ND25plus except that it is targeted on individuals aged 18-24. Participation is compulsory for those who have been claiming JSA for at least six months.

Slide35

Controlling for selection

Control for differences in observed characteristics between ethnic groups that may affect outcomes

Data:

Detailed labour market histories

Individual background characteristics

Methods:

Primarily propensity score matching (PSM)

Also regression-based methods and conditional difference in differences (DID)

Previous LM history may have been affected by discrimination, but there is nothing we can do about this

Slide36
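Note on "Controlling for selection": the idea behind propensity score matching can be illustrated with a very small nearest-neighbour sketch. This is not the study's actual estimator. It assumes the propensity scores (here, the estimated probability of being in the Ethnic Minority group given the observed characteristics) have already been estimated elsewhere, uses one-to-one matching with replacement, and applies an arbitrary caliper of 0.05. All numbers are invented.

```python
def match_nearest(minority, white, caliper=0.05):
    """1-nearest-neighbour matching with replacement on the propensity score.

    minority, white: lists of (propensity_score, outcome) pairs.
    Returns (mean outcome gap over matched minority units, number unmatched).
    """
    gaps, unmatched = [], 0
    for score, outcome in minority:
        # Closest White comparison unit in terms of propensity score.
        nearest = min(white, key=lambda c: abs(c[0] - score))
        if abs(nearest[0] - score) <= caliper:
            gaps.append(outcome - nearest[1])
        else:
            unmatched += 1  # no comparable unit within the caliper
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    return mean_gap, unmatched


# Fictional (score, in-employment indicator) pairs, for illustration only.
minority = [(0.30, 1), (0.55, 0), (0.95, 1)]
white = [(0.28, 1), (0.52, 1), (0.60, 0)]

mean_gap, unmatched = match_nearest(minority, white)
print(mean_gap, unmatched)
```

Reporting the number of unmatched units alongside the estimate matters: an estimate based on only a small matched subset says little about the group as a whole.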

Sampling frame

Sample selected on inflow into programme

Addresses differential selection off programme

Slide37

Sampling frame

Sample selected on inflow into programme

Addresses differential selection off programme

Inflow window is 2003, allowing:

3-year pre-inflow labour market history

1-year follow-up

[Timeline: previous labour market history from Jan 2000; inflow window 2003; outcomes to Dec 2004]

Slide38
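Note on the "Sampling frame" timeline: the choice of a 2003 inflow window can be checked with simple date arithmetic. The sketch below assumes the data run from 1 January 2000 to 31 December 2004 (read off the timeline) and that the history and follow-up windows are measured in whole years from the individual's inflow date. The function name and return layout are made up for the example.

```python
from datetime import date

DATA_START = date(2000, 1, 1)   # assumed start of labour market histories
DATA_END = date(2004, 12, 31)   # assumed end of outcome observation


def windows(inflow: date) -> dict:
    """Return the 3-year history and 1-year follow-up windows for an inflow date."""
    if inflow.year != 2003:
        raise ValueError("inflow date must fall in the 2003 inflow window")
    # 2003 has no 29 February, so shifting the year is always valid here.
    history_start = inflow.replace(year=inflow.year - 3)
    followup_end = inflow.replace(year=inflow.year + 1)
    # Both windows must lie inside the period covered by the data.
    assert DATA_START <= history_start and followup_end <= DATA_END
    return {"history": (history_start, inflow), "followup": (inflow, followup_end)}


print(windows(date(2003, 1, 1)))
print(windows(date(2003, 12, 31)))
```

Under these assumptions, 2003 is the only calendar year for which every inflow date has both a full three years of history and a full year of follow-up.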

Outcomes of interest

Two dimensions of labour market status

In employment (15+ days in the month)

On benefit (15+ days in the month)

Benefit definition includes:

IS, IB, JSA, New Deal options, Basic Skills and Work-Based Learning for Adults

Measured monthly

Slide39
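Note on "Outcomes of interest": the 15+ days rule can be applied to spell data as sketched below. The sketch assumes spells are stored as inclusive (start, end) date pairs and that overlapping spells should not be double counted. Neither detail is stated on the slides, and the spells shown are invented.

```python
import calendar
from datetime import date


def days_in_month_covered(spells, year, month):
    """Count distinct days of the given month covered by any inclusive (start, end) spell."""
    first = date(year, month, 1)
    last = date(year, month, calendar.monthrange(year, month)[1])
    covered = set()
    for start, end in spells:
        lo, hi = max(start, first), min(end, last)
        if lo <= hi:  # the spell overlaps this month
            covered.update(range(lo.toordinal(), hi.toordinal() + 1))
    return len(covered)


def has_status(spells, year, month, threshold=15):
    """True if the spells cover at least `threshold` days of the month."""
    return days_in_month_covered(spells, year, month) >= threshold


# Fictional employment spells for one individual; the two spells overlap in February.
employment = [
    (date(2004, 1, 20), date(2004, 2, 10)),
    (date(2004, 2, 8), date(2004, 2, 16)),
]

print(has_status(employment, 2004, 1))  # False (12 days)
print(has_status(employment, 2004, 2))  # True (16 days)
```

The same function can be run on benefit spells to produce the second monthly indicator.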

Data

Primarily Work and Pensions Longitudinal Study (WPLS)

Benefit and employment spells for anyone on a DWP benefit since mid-1999

Also contains limited demographics including sex, DOB, ethnicity and postcode

Also used National Benefit Database (NBD) and census information

Slide40

X variables

Employment and benefit history

Past participation in voluntary programmes

Past participation in Basic Skills

Individual characteristics

Gender, age, month of inflow

Proxies for education and wealth (from census)

Local area characteristics (region, travel-to-work-area unemployment)

Other programme-related information

Slide41

What did we find?

For most programs and benefits (with the exception of IS and IB), Minorities and Whites are simply too different for satisfactory estimates to be calculated, and results are sensitive to the methodology used. MASSIVE COMMON SUPPORT PROBLEMS

This calls into question previous results based on simple regression techniques, which may hide the fact that observationally different ethnic groups are being compared by parametric extrapolation.

In some cases, e.g. NDLP, depending on the method used we could find significant ethnic penalties in employment (raw and DID), no ethnic penalty (regression methods) or a significant ethnic premium (PSM)

Slide42
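Note on the common support problem raised in "What did we find?": one simple diagnostic is to compare the ranges of the estimated propensity scores in the two groups and see how many units fall outside the overlap. The sketch below uses a basic min-max rule, which is only one of several possible definitions and is not necessarily the criterion used in the study. The scores are invented.

```python
def common_support(minority_scores, white_scores):
    """Min-max common support: the overlap of the two groups' score ranges."""
    low = max(min(minority_scores), min(white_scores))
    high = min(max(minority_scores), max(white_scores))
    return low, high


def share_off_support(scores, low, high):
    """Share of units whose score lies outside [low, high]."""
    off = [s for s in scores if s < low or s > high]
    return len(off) / len(scores)


# Fictional propensity scores showing very little overlap between groups.
minority_scores = [0.40, 0.70, 0.85, 0.95]
white_scores = [0.05, 0.10, 0.30, 0.45]

low, high = common_support(minority_scores, white_scores)
print(low, high)
print(share_off_support(minority_scores, low, high))
```

When most of one group lies off support, as in this made-up example, a regression would still return an estimate, but only by extrapolating across groups that are not observationally comparable.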

IB: raw labour market status

Slide43

IB: overall employment result

Reliability of matching: CS(0), UC(28) (i.e. reliable according to our criteria)

Slide44

IB: overall benefit result

Reliability of matching: CS(0), UC(28) (i.e. reliable according to our criteria)

Slide45

Need other methods to do this properly

Using administrative data to analyse this question is very problematic

Problem is due to the fact that the Ethnic Minority and White clients accessing the same JCP office are very different in the UK, with the exception of IS and IB recipients

Might not be a problem in other countries, but could be...

Not a problem with ADMIN data as such; it just can’t be used for this question