Evaluation design for Achieve Together
Ellen Greaves and Luke Sibieta
Institute for Fiscal Studies
© Institute for Fiscal Studies
Evaluation design for Achieve Together
Ellen Greaves and Luke Sibieta
Achieve Together
Bring together three programmes in a school:
Teach First
Teaching Leaders
Future Leaders
Intensive human capital investment
Original motivation was also to encourage schools to work together and to engage the local community and organisations in school improvement
Cluster design
Difficult to evaluate quantitatively
Evaluation and pilot funded by the Education Endowment Foundation (EEF)
Outline
The original design of the evaluation
What went wrong
Design of the pilot
Recruitment (round 1)
Recruitment (round 2)
Final design of the evaluation
Lessons for evaluators
Achieve Together
Two pilots:
Area-based design:
One cluster in Bournemouth
4 primary schools and 6 secondary schools
Involvement of local community/organisations
Process evaluation
School-level human capital investment:
School-level intervention
No co-ordination within clusters or involvement of external organisations
Quantitative evaluation and process evaluation
Original evaluation design
Randomised controlled trial
Number of schools fixed by EEF: 24 treatment and 24 control
Primary outcomes
Attainment at KS4
Attainment at Year 7 (focus of Achieve Together impact project)
Secondary outcomes
Number of persistent absentees
Overall absence rate
Subgroups
Pupils eligible for free school meals
Pupils with low prior attainment
“Business as usual” in control schools
Able to access one programme element of Achieve Together
Power calculations
          0      0.1    0.2    0.3    0.4    0.5
Model 1   0.048  0.203  0.283  0.345  0.398  0.444
Model 2   0.052  0.220  0.306  0.373  0.430  0.480
Model 3   0.044  0.186  0.259  0.315  0.363  0.406
Note: These calculations give the minimum effect size that can be detected using a two-sided hypothesis test at the 5% significance level with 80% power against the alternative hypothesis. Model 1 reports the minimum detectable effect size when 60% of the variance of the outcome is unexplained by attributes of the pupils (including prior attainment). Model 2 reports a less optimistic scenario (70% unexplained), whilst Model 3 is more optimistic (50% unexplained).
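The figures in the table can be approximated with a standard minimum detectable effect size (MDES) formula for a trial randomised at school level. The sketch below is an illustration under stated assumptions, not the authors' own calculation: it assumes the column headings (0 to 0.5) are intra-cluster correlations, that the 48 schools are split equally between treatment and control, and that each school contributes about 170 pupils (a hypothetical figure, chosen because it gives values close to the table). The function name `mdes` and its parameter names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def mdes(icc, unexplained, n_schools=48, pupils_per_school=170,
         treated_share=0.5, alpha=0.05, power=0.80):
    """Approximate MDES (in standard deviations) for a school-randomised trial.

    icc               assumed intra-cluster correlation
    unexplained       share of outcome variance not explained by pupil attributes
    pupils_per_school hypothetical cluster size, not taken from the slides
    """
    z = NormalDist()
    # Two-sided test at level alpha, with the stated power (about 1.96 + 0.84).
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Between-school variance plus within-school variance averaged over pupils.
    design = icc + (1 - icc) / pupils_per_school
    variance = unexplained * design / (
        treated_share * (1 - treated_share) * n_schools
    )
    return multiplier * sqrt(variance)


# Print one row per model, using the unexplained shares given in the note.
for label, share in (("Model 1", 0.6), ("Model 2", 0.7), ("Model 3", 0.5)):
    row = [round(mdes(icc, share), 3) for icc in (0, 0.1, 0.2, 0.3, 0.4, 0.5)]
    print(label, row)
```

Under these assumptions the required effect rises steeply as soon as the intra-cluster correlation moves away from zero, because the number of schools, not the number of pupils, then limits precision.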
What went wrong: design of the pilot
School-level RCT began to look clustered...
Cluster-based recruitment
Co-ordination between schools
Complicates and creates risks for evaluation:
What can we learn from the evaluation?
Is positive impact due to the human capital approach?
Or better co-ordination between schools?
Our findings would be inconclusive
How will the power calculations be affected?
At the extreme, we can think of the unit of treatment as the cluster
Uncertain risk for the minimum detectable effect size
Required treatment effect from power calculations with clustering at the school level already looked ambitious...
Clustering may increase the intra-cluster correlation and increase the challenge of detecting a significant effect
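One way to see why a larger unit of treatment is a risk is the standard design effect, 1 + (m - 1) x ICC, where m is the number of pupils sharing a cluster. The sketch below is a minimal illustration with hypothetical inputs: the cluster sizes, pupil total and intra-cluster correlation are not taken from the evaluation, and the same correlation is assumed at both levels purely for simplicity. The names `design_effect` and `effective_sample_size` are illustrative.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from sampling whole clusters rather than individuals."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_pupils, cluster_size, icc):
    """Number of independent pupils that carry the same information."""
    return n_pupils / design_effect(cluster_size, icc)


# Hypothetical comparison: the same 8,160 pupils treated as 170-pupil schools
# or as 680-pupil clusters of four schools, with an assumed ICC of 0.1.
n_pupils = 8160
for label, size in (("school as unit", 170), ("cluster as unit", 680)):
    print(label, round(effective_sample_size(n_pupils, size, 0.1), 1))
```

Moving from the school to the cluster as the unit cuts the effective sample size sharply in this example, which is the sense in which the minimum detectable effect size becomes more demanding.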
What went wrong: recruitment (round 1)
Target: 48
Recruited: 13
Problems for recruitment:
Time available
Uncertainty about staff availability
Uncertainty about school budget (for costly programme)
Risk of being allocated to control group
Clarity about the pilot
The recruited schools began Achieve Together in September 2013
What went wrong: recruitment (round 2)
Target: 48
Recruited: 15
Problems for recruitment:
Time available
Uncertainty about staff availability
Uncertainty about school budget (for costly programme)
Risk of being allocated to control group
Clarity about the pilot
The recruited schools will begin Achieve Together in September 2014
Final evaluation design
Non-experimental
Matching (“well-matched comparison group”)
Similar in terms of observable characteristics
Expressed a strong interest in Achieve Together
How credible are the non-experimental estimates?
Depends on the factors that determine take-up and growth in pupil attainment - observable or unobservable?
Assess the credibility of the non-experimental matching estimates
Achieve Together round 1 schools: compare matching estimates to a “gold standard” comparison group - schools that are similar in both observable and unobservable characteristics
Achieve Together round 2 schools
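The matching idea on this slide can be sketched as nearest-neighbour matching on observable school characteristics. This is only an illustration of the general technique, not the evaluators' estimator: the covariates (share of pupils eligible for free school meals, mean prior attainment), the outcome values and the function names are all hypothetical, and the numbers are made up for the example.

```python
from math import sqrt
from statistics import mean, pstdev


def standardise(rows):
    """Rescale each covariate column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def match_estimate(treated, pool):
    """Each school is (covariates, outcome). Returns the mean outcome gap
    between each treated school and its nearest comparison school."""
    z = standardise([covs for covs, _ in treated + pool])
    z_treated, z_pool = z[:len(treated)], z[len(treated):]
    gaps = []
    for zt, (_, y_treated) in zip(z_treated, treated):
        dists = [sqrt(sum((a - b) ** 2 for a, b in zip(zt, zp))) for zp in z_pool]
        nearest = dists.index(min(dists))  # matching with replacement
        gaps.append(y_treated - pool[nearest][1])
    return mean(gaps)


# Made-up schools: ((FSM share, mean prior attainment), outcome).
treated = [((0.30, 26.0), 0.10), ((0.50, 24.0), -0.05)]
pool = [((0.31, 26.2), 0.02), ((0.49, 24.1), -0.10), ((0.10, 30.0), 0.40)]
estimate = match_estimate(treated, pool)
print(estimate)
```

A sketch like this only balances the characteristics it is given. If take-up also depends on unobservable factors, the matched gap can differ from the estimate obtained with a comparison group that is similar on both, which is the check described for the round 1 schools.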
Final evaluation design
Matching likely to be credible
Matching unlikely to be credible
Lessons for evaluators (1)
Evaluators must have good communication with the project team
How are plans for the pilot developing?
What are the implications for the evaluation design?
Why is the evaluation important?
Evaluators should be clear about the necessary requirements for the evaluation
What is expected of control schools?
Restrictions on “business as usual”
What is expected of treatment schools?
Additional testing
Involvement with process evaluation
What are the non-negotiable elements of the evaluation?
Lessons for evaluators (2)
Recruitment can be difficult!
What barriers does the evaluation impose and can these be reduced?
Be creative
What evaluation design is feasible as circumstances change?
Be selective!
What is the potential for a robust and informative evaluation?
What are the risks to the evaluation?