Measuring Impact: Impact Evaluation Methods for Policy Makers
Uploaded On 2018-03-11


Paul Gertler, UC Berkeley. Note: slides by Sebastian Martinez, Christel Vermeersch and Paul Gertler. The content of this presentation reflects the views of the authors and not necessarily those of the World Bank. This version: November 2009.



Presentation Transcript

Slide1

Measuring Impact: Impact Evaluation Methods for Policy Makers

Paul Gertler, UC Berkeley

Note: slides by Sebastian Martinez, Christel Vermeersch and Paul Gertler. The content of this presentation reflects the views of the authors and not necessarily those of the World Bank. This version: November 2009.

Slide2

Impact Evaluation

Logical Framework
How the program works “in theory”

Measuring Impact
Identification Strategy

Data

Operational Plan

Resources

Slide3

Measuring Impact

Causal Inference
Counterfactuals
False Counterfactuals:
Before & After (pre & post)
Enrolled & Not Enrolled (apples & oranges)

IE Methods Toolbox:
Random Assignment
Random Promotion
Discontinuity Design
Difference in Differences (Diff-in-diff)
Matching (P-score matching)

Slide4

Our Objective

Estimate the CAUSAL effect (impact) of intervention P (program or treatment) on outcome Y (indicator, measure of success).

Example: what is the effect of a Health Insurance Subsidy Program (P) on Out of Pocket Health Expenditures (Y)?

Slide5

Causal Inference

What is the impact of P on Y?

Answer: α = (Y | P=1) - (Y | P=0)

Can we all go home?

Slide6

Problem of missing data

For a program beneficiary:
we observe (Y | P=1): health expenditures (Y) with health insurance subsidy (P=1),
but we do not observe (Y | P=0): health expenditures (Y) without health insurance subsidy (P=0).

α = (Y | P=1) - (Y | P=0)

Slide7

Solution

Estimate what would have happened to Y in the absence of P.

We call this the COUNTERFACTUAL.

The key to a good impact evaluation is a valid counterfactual!

Slide8

Estimating Impact of P on Y

OBSERVE (Y | P=1): outcome with treatment
ESTIMATE (Y | P=0): counterfactual

α = (Y | P=1) - (Y | P=0)

IMPACT = outcome with treatment - counterfactual

Intention to Treat (ITT): those offered treatment
Treatment on the Treated (TOT): those receiving treatment

Use a comparison or control group.

Slide9

Example: What is the impact of giving Fulanito additional pocket money (P) on Fulanito’s consumption of candies (Y)?

Slide10

The perfect “Clone”

Fulanito: 6 Candies
Fulanito’s Clone: 4 Candies

Impact = 6 - 4 = 2 Candies

Slide11

In reality, use statistics

Treatment: Average Y = 6 Candies
Comparison: Average Y = 4 Candies

Impact = 6 - 4 = 2 Candies

Slide12
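The "use statistics" step in the candy example can be sketched in a few lines of Python. This is only an illustration: the individual candy counts are made up so that the group averages match the slide's 6 and 4, and the function name `estimate_impact` is an assumption, not something from the presentation.

```python
from statistics import mean

# Hypothetical candy counts per child; only the group averages
# (6 for treatment, 4 for comparison) come from the slides.
treatment = [5, 6, 7, 6]
comparison = [3, 4, 5, 4]


def estimate_impact(treated, compared):
    """Impact = average outcome with treatment - average outcome in the comparison group."""
    return mean(treated) - mean(compared)


impact = estimate_impact(treatment, comparison)
print(f"Estimated impact: {impact} candies")
```

The estimate is only as good as the comparison group: the subtraction is trivial, while finding a group whose average is a believable counterfactual is the hard part.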

Finding Good Comparison Groups

We want to find “clones” for the Fulanitos in our programs.

The treatment and comparison groups should have identical characteristics, except for benefiting from the intervention.

In practice, use program eligibility & assignment rules to construct valid counterfactuals.

With a good comparison group, the only reason for different outcomes between treatments and controls is the intervention (P).

Slide13

Case Study: HISP

National Health System Reform
Closing gap in access and quality of services between rural and urban areas
Large expansion in supply of health services
Reduction of health care costs for rural poor

Health Insurance Subsidy Program (HISP)
Pilot program
Covers costs for primary health care and drugs
Targeted to poor: eligibility based on poverty index

Rigorous impact evaluation with rich data
200 communities, 10,000 households
Baseline and follow-up data two years later
Many outcomes of interest
Yearly out of pocket health expenditures per capita

What is the effect of HISP (P) on health expenditures (Y)?

If impact is a reduction of $9 or more, then scale up nationally.

Slide14

Case Study: HISP

Eligibility and Enrollment

[Figure: households grouped as Ineligibles (Non-Poor) and Eligibles (Poor), and as Enrolled and Not Enrolled.]

Slide15

Measuring Impact

Causal Inference
Counterfactuals
False Counterfactuals:
Before & After (pre & post)
Enrolled & Not Enrolled (apples & oranges)

IE Methods Toolbox:
Random Assignment
Random Promotion
Discontinuity Design
Difference in Differences (Diff-in-diff)
Matching (P-score matching)

Slide16

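Counterfeit Counterfactual #1: Before & After

[Figure: outcome Y plotted against time, from Baseline (T=0) to Endline (T=1). B is the observed outcome at baseline, A the observed outcome at endline, and C the counterfactual at endline. IMPACT?]

Slide17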

Case 1: Before & After

What is the effect of HISP (P) on health expenditures (Y)?

Observe only beneficiaries (P=1)
2 observations in time:
expenditures at T=0
expenditures at T=1

[Figure: Y against time; B = 14.4 at T=0, A = 7.8 at T=1.]

“Impact” = A - B = 7.8 - 14.4 = -6.6

Slide18

Case 1: Before & After

                          Outcome with Treatment   Counterfactual   Impact
                          (After)                  (Before)         (Y | P=1) - (Y | P=0)
health expenditures (Y)   7.8                      14.4             -6.6**

                                              Linear Regression   Multivariate Linear Regression
estimated impact on health expenditures (Y)   -6.59**             -6.65**

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).

Slide19

Case 1: What’s the Problem?

[Figure: Y against time; B = 14.4 at T=0, A = 7.8 at T=1, α = A - B = -$6.6. C and D mark two possible counterfactuals at T=1, each implying a different impact.]

Economic Boom: Real Impact = A - C; A - B is an underestimate
Economic Recession: Real Impact = A - D; A - B is an overestimate

Problem with before & after: doesn’t control for other time-varying factors!

Slide20
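To make the problem with Case 1's before-and-after estimate concrete, here is a small sketch. The baseline (14.4) and follow-up (7.8) values are from the slides. The trend values of +2 and -2 dollars are purely hypothetical stand-ins for what a boom or recession might have done to expenditures without HISP, and `true_impact` is an illustrative name.

```python
BEFORE = 14.4  # B: beneficiaries' expenditures at T=0 (from the slides)
AFTER = 7.8    # A: beneficiaries' expenditures at T=1 (from the slides)

# Before & after implicitly assumes nothing else changed over time,
# so it uses the baseline itself as the counterfactual.
naive_impact = round(AFTER - BEFORE, 2)


def true_impact(after, before, trend):
    """Impact when expenditures would have drifted by `trend` without the program."""
    counterfactual = before + trend
    return round(after - counterfactual, 2)


# Hypothetical trends (assumptions, not from the slides).
boom = true_impact(AFTER, BEFORE, trend=2.0)        # counterfactual C above B
recession = true_impact(AFTER, BEFORE, trend=-2.0)  # counterfactual D below B

print(naive_impact, boom, recession)
```

With a positive trend the true reduction is larger than A - B suggests, and with a negative trend it is smaller, which matches the boom and recession cases on the slide.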

Measuring Impact

Causal Inference
Counterfactuals
False Counterfactuals:
Before & After (pre & post)
Enrolled & Not Enrolled (apples & oranges)

IE Methods Toolbox:
Random Assignment
Random Promotion
Discontinuity Design
Difference in Differences (Diff-in-diff)
Matching (P-score matching)

Slide21

False Counterfactual #2: Enrolled & Not Enrolled

If we have post-treatment data on:
Enrolled: treatment group
Not enrolled: “control” group (counterfactual)
Those ineligible to participate
Those that choose NOT to participate

Selection Bias
Reason for not enrolling may be correlated with outcome (Y)
Control for observables
But not unobservables!!
Estimated impact is confounded with other things

Slide22

Case 2: Enrolled & Not Enrolled

Measure outcomes in post-treatment (T=1)

[Figure: Ineligibles (Non-Poor) and Eligibles (Poor). Enrolled: Y = 7.8. Not Enrolled: Y = 21.8.]

In what ways might enrolled & not enrolled be different, other than their enrollment in the program?

Slide23

Case 2: Enrolled & Not Enrolled

                          Outcome with Treatment   Counterfactual   Impact
                          (Enrolled)               (Not Enrolled)   (Y | P=1) - (Y | P=0)
health expenditures (Y)   7.8                      21.8             -14**

                                              Linear Regression   Multivariate Linear Regression
estimated impact on health expenditures (Y)   -13.9**             -9.4**

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).

Slide24
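The selection bias behind the enrolled versus not-enrolled comparison in Case 2 can be illustrated with a toy simulation. Everything in it is an assumption made for the sketch: the income scale, the rule that only poorer households enroll, the spending equation, and the true effect of -5. None of these numbers come from the HISP data.

```python
import random
from statistics import mean

random.seed(0)        # fixed seed so the sketch is reproducible
TRUE_EFFECT = -5.0    # hypothetical true program effect on expenditures

households = []
for _ in range(10_000):
    income = random.uniform(0, 100)   # hypothetical income index
    enrolled = income < 50            # only poorer households enroll
    # Richer households spend more on health regardless of the program.
    spend_without_program = 15 + 0.2 * income + random.gauss(0, 1)
    spend = spend_without_program + (TRUE_EFFECT if enrolled else 0.0)
    households.append((enrolled, spend))

# Naive comparison: enrolled average minus not-enrolled average.
naive_estimate = (
    mean(s for e, s in households if e)
    - mean(s for e, s in households if not e)
)
print(f"true effect: {TRUE_EFFECT}, naive estimate: {naive_estimate:.1f}")
```

The naive estimate mixes the program effect with the pre-existing income gap between the two groups, so it overstates the reduction. In real data the reasons for enrolling are often unobservable and cannot simply be controlled for.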

Policy Recommendation?

Will you recommend scaling up HISP?

Before-After: Are there other time-varying factors that also influence health expenditures?
Enrolled-Not Enrolled: Are reasons for enrolling correlated with health expenditures? Selection Bias

Impact on health expenditures (Y):

Method                            Linear Regression   Multivariate Linear Regression
Case 1: Before and After          -6.59**             -6.65**
Case 2: Enrolled & Not-Enrolled   -13.9**             -9.4**

Slide25

Keep in mind

Two common comparisons to be avoided!!

Before & After (pre & post)
Compare: same individuals before and after they receive P
Problem: other things may have happened over time

Enrolled & Not Enrolled (apples & oranges)
Compare: a group of individuals that enrolled in a program with a group that chooses not to enroll
Problem: Selection Bias; we don’t know why they are not enrolled

Both counterfactuals may lead to biased estimates of the impact.

Slide26

Measuring Impact

Causal Inference
Counterfactuals
False Counterfactuals:
Before & After (pre & post)
Enrolled & Not Enrolled (apples & oranges)

IE Methods Toolbox:
Random Assignment
Random Promotion
Discontinuity Design
Difference in Differences (Diff-in-diff)
Matching (P-score matching)

Slide27

Choosing your IE method(s)

Key information you will need for identifying the right method for your program:

Prospective/retrospective evaluation?
Eligibility rules and criteria?
Poverty targeting?
Geographic targeting?
Roll-out plan (pipeline)?
Is the number of eligible units larger than available resources at a given point in time?
Budget and capacity constraints?
Excess demand for program?
Etc.

Slide28

Choosing your IE method(s)

Best design = best comparison group you can find + least operational risk

Have we controlled for “everything”?
Internal validity
Good comparison group

Is the result valid for “everyone”?
External validity
Local versus global treatment effect
Evaluation results apply to population we’re interested in

Choose the “best” possible design given the operational context.

Slide29

Measuring Impact

Causal Inference
Counterfactuals
False Counterfactuals:
Before & After (pre & post)
Enrolled & Not Enrolled (apples & oranges)

IE Methods Toolbox:
Random Assignment
Random Promotion
Discontinuity Design
Difference in Differences (Diff-in-diff)
Matching (P-score matching)

Slide30

Randomized Treatments and Controls

When universe of eligibles > # benefits: Randomize!
Lottery for who is offered benefits
Fair, transparent and ethical way to assign benefits to equally deserving populations

Oversubscription:
Give each eligible unit the same chance of receiving treatment
Compare those offered treatment with those not offered treatment (controls)

Randomized phase-in:
Give each eligible unit the same chance of receiving treatment first, second, third...
Compare those offered treatment first with those offered treatment later (controls)

Slide31
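A lottery for oversubscription or randomized phase-in can be run with a simple shuffle. The sketch below is one possible way to do it, not the procedure used by the authors: the `lottery` function, the community labels and the seed are all illustrative assumptions. The 200 units and 100 benefits mirror the HISP case study numbers.

```python
import random


def lottery(eligible_units, n_benefits, seed=None):
    """Give every eligible unit the same chance of being offered treatment.

    Returns (treatment, control). For a randomized phase-in, the control
    list is simply the group that is offered treatment later.
    """
    rng = random.Random(seed)      # a recorded seed makes the draw auditable
    shuffled = list(eligible_units)
    rng.shuffle(shuffled)
    return shuffled[:n_benefits], shuffled[n_benefits:]


# Hypothetical identifiers for 200 eligible communities.
communities = [f"community_{i:03d}" for i in range(200)]
treatment, control = lottery(communities, n_benefits=100, seed=2009)

print(len(treatment), len(control))
```

Recording the seed and the input list lets anyone reproduce the draw, which supports the claim that the assignment was fair and transparent.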

Randomized treatments and controls

[Figure: three steps. 1. Universe, made up of eligible and ineligible units. 2. Random Sample of Eligibles (External Validity). 3. Randomize Treatment, splitting the sample into treatment and control (Internal Validity).]

Slide32

Unit of Randomization

Choose according to type of program:
Individual/Household
School/Health Clinic/catchment area
Block/Village/Community
Ward/District/Region

Keep in mind:
Need “sufficiently large” number of units to detect minimum desired impact: power
Spillovers/contamination
Operational and survey costs

As a rule of thumb, randomize at the smallest viable unit of implementation.

Slide33
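The "sufficiently large number of units" point about power can be made concrete with a textbook approximation. The slides give no formula, so the following is an assumption brought in for illustration: the standard two-group normal approximation for comparing means, inflated by a design effect when randomizing clusters such as communities. The standard deviation (20) and intra-cluster correlation (0.1) are hypothetical; the $9 minimum impact and 50 households per community follow from the case study figures.

```python
from math import ceil
from statistics import NormalDist


def units_per_arm(min_effect, sd, alpha=0.05, power=0.80):
    """Units per arm needed to detect `min_effect` (two-sided test, equal arms)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / min_effect) ** 2)


def design_effect(cluster_size, icc):
    """Inflation factor when whole clusters, not individuals, are randomized."""
    return 1 + (cluster_size - 1) * icc


# $9 is the scale-up threshold from the case study; sd and icc are hypothetical.
n_individual = units_per_arm(min_effect=9, sd=20)
deff = design_effect(cluster_size=50, icc=0.1)
households_per_arm = ceil(n_individual * deff)
communities_per_arm = ceil(households_per_arm / 50)

print(n_individual, round(deff, 2), households_per_arm, communities_per_arm)
```

The design effect shows the trade-off behind the rule of thumb: randomizing at a larger unit reduces spillovers but requires more households for the same power.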

Case 3: Random Assignment

Health Insurance Subsidy Program (HISP)
Unit of randomization: Community
200 communities in the sample

Randomized phase-in:
100 treatment communities (5,000 households): started receiving transfers at baseline (T=0)
100 control communities (5,000 households): receive transfers after follow-up (T=1) if program is scaled up

Slide34

Case 3: Random Assignment

[Figure: timeline from T=0 to T=1 (comparison period) showing 100 Treatment Communities (5,000 HH) and 100 Control Communities (5,000 HH).]

Slide35

Case 3: Random Assignment

How do we know we have good clones?

Slide36

Case 3: Random Assignment
Case 3: Balance at Baseline

                                            Control   Treatment   T-stat
Health Expenditures ($ yearly per capita)   14.57     14.48       -0.39
Head’s age (years)                          42.3      41.6        1.2
Spouse’s age (years)                        36.8      36.8        -0.38
Head’s education (years)                    2.8       2.9         -2.16**
Spouse’s education (years)                  2.6       2.7         -0.006

**= significant at 1%

Slide37

Case 3: Random Assignment
Case 3: Balance at Baseline

                                Control   Treatment   T-stat
Head is female = 1              0.07      0.07        0.66
Indigenous = 1                  0.42      0.42        0.21
Number of household members     5.7       5.7         -1.21
Bathroom = 1                    0.56      0.57        -1.04
Hectares of Land                1.71      1.67        1.35
Distance to hospital (km)       106       109         -1.02

**= significant at 1%

Slide38
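The T-stat column in the baseline balance tables comes from comparing group means for each characteristic. The slides do not say which test was used, so the sketch below assumes a Welch two-sample t-statistic and a large-sample 1% cutoff of about 2.58. The household ages are invented for illustration, and `t_stat` and `looks_balanced` are hypothetical helper names.

```python
from math import sqrt
from statistics import mean, variance


def t_stat(control, treatment):
    """Welch t-statistic for the difference in means (control - treatment)."""
    se = sqrt(variance(control) / len(control) + variance(treatment) / len(treatment))
    return (mean(control) - mean(treatment)) / se


def looks_balanced(control, treatment, critical=2.58):
    """True if the difference is not significant at roughly the 1% level."""
    return abs(t_stat(control, treatment)) < critical


# Hypothetical head-of-household ages in each group (not the HISP data).
control_age = [42, 45, 39, 41, 44, 43, 40, 42]
treatment_age = [41, 44, 40, 42, 43, 41, 39, 43]

print(round(t_stat(control_age, treatment_age), 2),
      looks_balanced(control_age, treatment_age))
```

With many characteristics tested, an occasional significant difference can arise by chance even under proper randomization, so balance is judged across the whole table rather than from a single row.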

Case 3: Random Assignment

                                          Treatment Group             Counterfactual               Impact
                                          (Randomized to treatment)   (Randomized to comparison)   (Y | P=1) - (Y | P=0)
Baseline (T=0) health expenditures (Y)    14.48                       14.57                        -0.09
Follow-up (T=1) health expenditures (Y)   7.8                         17.9                         -10.1**

                                              Linear Regression   Multivariate Linear Regression
estimated impact on health expenditures (Y)   -10.1**             -10**

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).

Slide39
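The Case 3 arithmetic and the case study's scale-up rule (a reduction of $9 or more) can be checked directly. The group means and the regression estimates are taken from the slides. The function name `recommend_scale_up` and the way the rule is coded are illustrative assumptions.

```python
SCALE_UP_THRESHOLD = 9.0  # scale up if expenditures fall by $9 or more

# Group means from the Case 3 table.
baseline_diff = round(14.48 - 14.57, 2)   # treatment - comparison at T=0
followup_impact = round(7.8 - 17.9, 2)    # treatment - comparison at T=1


def recommend_scale_up(impact):
    """True if the estimated reduction in expenditures meets the threshold."""
    return -impact >= SCALE_UP_THRESHOLD


# Multivariate regression estimates reported for each case in the slides.
estimates = {
    "Case 1: Before and After": -6.65,
    "Case 2: Enrolled & Not-Enrolled": -9.4,
    "Case 3: Random Assignment": -10.0,
}
decisions = {case: recommend_scale_up(value) for case, value in estimates.items()}

print(baseline_diff, followup_impact, decisions)
```

The three methods lead to different recommendations on the same program, which is why the validity of the counterfactual matters more than the arithmetic. Only the randomized estimate rests on a comparison group that was balanced at baseline.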

HISP Policy Recommendation?

Impact of HISP on health expenditures (Y):

Method                            Specification                    Impact
Case 1: Before and After          Multivariate Linear Regression   -6.65**
Case 2: Enrolled & Not-Enrolled   Linear Regression                -13.9**
Case 2: Enrolled & Not-Enrolled   Multivariate Linear Regression   -9.4**
Case 3: Random Assignment         Multivariate Linear Regression   -10**

**= significant at 1%

Slide40

Keep in mind

Random Assignment:
With large enough samples, produces two groups that are statistically equivalent
We have identified the perfect “clone”: randomized beneficiary and randomized comparison

Feasible for prospective evaluations with over-subscription/excess demand
Most pilots and new programs fall into this category!

Slide41

Remember

Objective of impact evaluation is to estimate the CAUSAL effect or IMPACT of a program on outcomes of interest.

To estimate impact, we need to estimate the counterfactual:
What would have happened in the absence of the program
Use comparison or control groups

We have a toolbox with 5 methods to identify good comparison groups.

Choose the best evaluation method that is feasible in the program’s operational context.

Slide42


THANK YOU!