Impact evaluation

An impact evaluation can help you establish whether your program is achieving the outcomes it is aiming for. Impact evaluations can also help you understand what makes your program work: Which elements of the program drive outcomes? Are there specific groups of people for whom the program works (better)? Does the program work in new contexts?

Impact evaluations determine program effectiveness by asking the question: what would have happened in the absence of my program? They allow you to conclude with reasonable confidence that your program caused the outcomes you are seeing. Although we can never observe what happens for the same set of people both with and without a program, we can use different techniques of generating a comparison group to construct what likely would have happened – and this process allows us to attribute causality to your program.

There are many ways of conducting an impact evaluation, which differ in how they estimate what would likely have happened in the absence of your program. See our guidance on types of impact evaluations to learn more about the different approaches, from randomized controlled trials to quasi-experimental approaches.

Example

Say you conduct an impact evaluation of a vocational training program. Six months after finishing the program, a majority of participants were employed. When you compare this group to a similar group of people who did not receive the program, however, you realize that the increase in employment was similar in both groups. The impact evaluation thus revealed that your program did not cause improved employment outcomes. As you look for other explanatory factors, you realize your program coincided with economic growth and job creation – which could have caused the positive employment outcomes you are seeing.
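The comparison in this example can be sketched in a few lines of Python. This is only an illustration: the employment rates below are invented, and subtracting the comparison group's change from the participants' change (often called a difference-in-differences) is just one of several ways to estimate what would have happened without the program.

```python
# Hypothetical employment rates (share employed), invented for illustration only.
program_before, program_after = 0.30, 0.62
comparison_before, comparison_after = 0.31, 0.61


def difference_in_differences(treat_before, treat_after, comp_before, comp_after):
    """Change in the program group minus change in the comparison group."""
    return (treat_after - treat_before) - (comp_after - comp_before)


# Looking only at participants suggests a large improvement...
naive_change = program_after - program_before

# ...but most of it also happened in the comparison group.
estimated_impact = difference_in_differences(
    program_before, program_after, comparison_before, comparison_after
)

print(f"Change among participants: {naive_change:.2f}")
print(f"Estimated impact net of the comparison group: {estimated_impact:.2f}")
```

With these made-up numbers, nearly all of the improvement among participants also appears in the comparison group, so very little of it can be attributed to the program.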

Which questions can an impact evaluation answer?

Is my program having an impact?

An impact evaluation reveals whether your program is achieving the desired outcomes. Evidence of high impact usually indicates that your program should be scaled up. If your impact evaluation shows that your program does not have any impact or has lower impact than expected, you can use this information to refine your program. For example, if you find that your vocational training program does not cause improved job outcomes, you would want to consider redesigning and retesting the program, or possibly terminating the program altogether.

Impact evaluations can be used in two ways: to create knowledge about what works, and to inform specific decisions about your program. For example, you may conduct an impact evaluation of your vocational training program to understand whether the program leads to long-term full-time employment. You could do this to generally understand the impact of vocational programs, or specifically to inform your decision of whether to scale the training program to another state.

You can also use an impact evaluation to provide accountability to donors and secure additional funding. Showing that your program works would help you demonstrate to your donors that the funding they provided led to positive impact. It would also provide credibility for you to secure additional funding to expand your program.

Sometimes you need evidence of your program's impact to secure funding, but your program may not be ready for an impact evaluation. Using another method can give you evidence that your program is moving in the right direction. Use the decision guide to find the right method for your program.

Which version of my program is more effective?

Impact evaluations can be used to compare the impact of different programs and determine which is the most effective. This can be useful in situations where you need to figure out if a more intensive version of a program is worth the extra expense, or when you’ve had several successful pilot programs and need to choose between them. You can vary a particular component of your program and compare outcomes across versions to determine which one is most effective at achieving your goal (this method is often referred to as A/B testing). Such impact evaluations can help you make decisions about program elements in a short timeframe.

Is my program targeting the right population?

Conducting a needs assessment when your program is already up and running can help you assess whether you are reaching the people you need to. Continuing from a previous example on parental perceptions affecting school attendance, let’s say you observe after a couple of years that attendance in early grades has improved dramatically, but many older children are dropping out between primary and secondary school. You realize that the needs of your target population have shifted, and you shift the focus of your program to retention and remedial education, rather than convincing parents to send their children to school in the first place.

Next steps - Conducting an impact evaluation

Conducting an impact evaluation involves the following steps:

Developing an evaluation design that is feasible given the context of your program. You need to identify a comparison group that doesn’t receive the program and is unlikely to benefit from it accidentally
Offering the program to a set of people distinct from the comparison group
Allowing enough time to pass for the outcomes you expect to manifest
Collecting data on outcomes. This will most likely be through a survey, although it is possible to use administrative data
Comparing outcomes between the comparison group and the group that received the program to reveal the effect of the program
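The final comparison step can be illustrated with a minimal Python sketch. It assumes two independent groups and a simple yes/no outcome; the function name `estimate_effect` and the data are hypothetical, and a real evaluation would use a much larger sample and an analysis matched to its design.

```python
import math
import statistics


def estimate_effect(program_outcomes, comparison_outcomes):
    """Return (difference in means, standard error) for two independent groups."""
    diff = statistics.mean(program_outcomes) - statistics.mean(comparison_outcomes)
    # Standard error of a difference between two independent sample means.
    se = math.sqrt(
        statistics.variance(program_outcomes) / len(program_outcomes)
        + statistics.variance(comparison_outcomes) / len(comparison_outcomes)
    )
    return diff, se


# Hypothetical outcome data: 1 = employed at follow-up, 0 = not employed.
program_group = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
comparison_group = [1, 0, 0, 1, 0, 1, 0, 1, 0, 0]

effect, std_error = estimate_effect(program_group, comparison_group)

# Rough 95% interval using the normal approximation (crude for a sample this small).
low, high = effect - 1.96 * std_error, effect + 1.96 * std_error
print(f"Estimated effect: {effect:.2f} (approx. 95% interval {low:.2f} to {high:.2f})")
```

Reporting an interval alongside the point estimate shows how much uncertainty surrounds the estimated effect. With only ten people per group, as here, the interval is wide enough to include zero.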

Should you do the impact evaluation yourself or hire an expert?

Under most circumstances, organizations opt to work with an external evaluator for an impact evaluation. The main considerations for deciding whether to do the impact evaluation yourself or work with an expert are:

1. Technical capacity

Conducting an impact evaluation requires technical capacity to develop a research design and data collection tools, and the ability to perform statistical analyses. It is also critical to ensure unbiased data collection. Most organizations cannot guarantee unbiased data collection or don’t have the capacity to conduct such a complicated analysis in-house, and usually opt to work with an external organization to conduct all or part of the evaluation. Should you choose to collect data internally, you should attempt to minimize response bias by hiring external enumerators, especially if the evaluation is asking sensitive personal questions.

2. Purpose of the impact evaluation

If the purpose of your impact evaluation is accountability to external constituents like funders or taxpayers, you almost certainly want an independent evaluator (regardless of your internal capacity) so that the evaluation has sufficient credibility.

If you are working with an external evaluator, it is a good idea to familiarize yourself with key concepts and terms in impact evaluations, as listed in the Guide to Impact Evaluation on page 16 of this Impact Evaluability Toolkit from J-PAL and CLEAR South Asia.

How to conduct your impact evaluation

The other methods covered in the Impact Measurement Guide can help you identify problems, check whether you’re on the right path, and build a base of evidence that your program is likely having the positive impact you set out to achieve. Only an impact evaluation, however, can provide rigorous evidence of the size of that impact, and how much your program is responsible for it. While data on community needs and program performance is important and useful, there are situations when an impact evaluation is absolutely the right tool for the job. For example, evidence of impact may help you feel confident enough to expand your program’s reach, decide between different versions of a program, or unlock more funding.

These are all high-stakes decisions – and impact evaluations often feel like high-stakes activities! They are certainly a bigger investment of time and resources than many of the other methods covered here. Therefore, it is critical to be clear on the purpose of the impact evaluation before forging ahead. Which decisions will you make on the basis of the impact evaluation?

Then, you need to make sure your program is well-implemented. If the program has significant implementation flaws, an impact evaluation is likely to provide disappointing, confusing results. You won’t be sure whether the program’s theory of change just doesn’t hold up, or whether the program could have been successful with better execution.

Once you are confident you need an impact evaluation and that your program is well-run, you should also consider what to measure and how you will measure it. All these decisions will affect the evaluation design. Even if you are working with an external evaluator, you should plan to be involved in these conversations as they are driven by your program’s context.

Before you conduct an impact evaluation

We suggest you work through the following checklist with your team and the evaluator. This will help you design an impact evaluation that answers your questions while also keeping your constraints in mind:

1. What is the purpose of the impact evaluation?

Which decisions will your organization make on the basis of an impact evaluation? How will you use the results? Which people do the results need to convince?

The answers to these questions will help you define the research question for the impact evaluation. This exercise will also help you identify the level of rigor you need and the timeline you are bound to. You may also learn that an impact evaluation is not suited to answering your question – for example, it is difficult to evaluate systems-wide change, or a program that is very small.

We also recommend going through the decision guide to make sure an impact evaluation is what you truly need.

2. Is the program well-implemented?

Knowing your program is run well will make your impact evaluation more credible. For example, if the impact evaluation reveals that your program has low impact, you can have greater confidence that you need to update the theory of change, rather than make superficial changes to improve implementation.

You can check the implementation of your program through a monitoring system or a process evaluation:

A monitoring system collects real-time data about your program’s performance. You can use data from your monitoring system to know whether your program is performing as expected.
A process evaluation is a one-time exercise comparing implementation to the steps laid out in your theory of change and implementation plan. You can conduct a process evaluation before or alongside an impact evaluation.

Conducting a process evaluation beforehand enables you to identify and fix issues before your impact evaluation. This way, you will know that your impact evaluation is measuring the impact of a well-run program.
Conducting a process evaluation alongside an impact evaluation allows you to identify explanations for the results you find – whether the results you see are because of implementation processes (process evaluation) or because assumptions for success do or do not hold (impact evaluation).

3. Should we measure intermediate outcomes or final outcomes?

When designing an impact evaluation, it’s important to be thoughtful about whether to measure outcomes or outputs in your theory of change. In most cases, the outputs delivered by your program, such as number of trainings delivered or bed nets distributed, don’t represent real changes in outcomes, such as increased earnings – you would likely need to measure the outcomes.

However, it may be sufficient to measure outputs in cases where it is well-established that the output causes the outcome (e.g. vaccination programs), or where the program goal is related to outputs and not outcomes. For example, if your program aims to get more children to attend preschool, the impact evaluation would measure the effectiveness of the program at increasing attendance in preschool (output), and not the impact of preschool on children’s learning levels (final outcome).

4. What should the unit of randomization be?

The “unit” is the level at which you assign groups for treatment or comparison. This can be individuals, schools, villages, etc. The unit you choose will affect the level at which you can measure outcomes – if you would like to measure the impact of a teacher training program on school rankings, you would need to choose schools as your unit, not individuals.

The choice of a unit depends on the indicator you want to measure, the possibility of people from your comparison group being exposed to your program (“spillovers”), the likelihood of participants dropping out of the sample, statistical power, and operational feasibility. For more information, please see Running Randomized Evaluations, Module 4.2: Choosing the Level of Randomization.
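To make the idea of a unit concrete, here is a minimal Python sketch of randomizing at the school level. The function `assign_clusters`, the school and student names, and the even split are all assumptions made for illustration; actual designs often stratify or use unequal group sizes.

```python
import random


def assign_clusters(cluster_ids, seed=0):
    """Randomly split clusters (e.g. schools) into program and comparison halves."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        cluster: ("program" if position < half else "comparison")
        for position, cluster in enumerate(shuffled)
    }


# Hypothetical schools and students, invented for illustration.
schools = ["school_A", "school_B", "school_C", "school_D", "school_E", "school_F"]
students = [
    ("student_1", "school_A"),
    ("student_2", "school_A"),
    ("student_3", "school_D"),
]

assignment = assign_clusters(schools, seed=42)

# Every student inherits the assignment of their school (the unit of randomization).
student_groups = {student: assignment[school] for student, school in students}

print(assignment)
print(student_groups)
```

Because assignment happens at the school level, students in the same school always end up in the same group. This limits spillovers within a school, but it also means the number of schools, not the number of students, largely determines statistical power.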

5. Can we randomize?

While randomized controlled trials are the gold standard for impact evaluations, it is not always possible to randomize, for logistical, budgetary, or ethical reasons. You can consider a nonrandomized, “quasi-experimental” impact evaluation in such cases. Have a look at our guidance on types of impact evaluations to decide whether you can randomize access to your program.

6. How many participants should our study have?

The unit at which your program is delivered (individuals, schools or hospitals, or entire regions at a time) is one of the determinants of the number of participants (“sample size”) for your study. Impact evaluations generally require a large number of units, and the exact number required will depend on the evaluation method you’re using and the size of the impact you expect to find. Picking up subtle changes (such as a small boost in test scores) requires a bigger sample than detecting large changes (such as mortality rates falling by half). The sample size will have major implications for the cost of the impact evaluation.

For more information on sample size calculations, please refer to J-PAL’s guidance on conducting power calculations.
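As a rough illustration of why smaller effects need bigger samples, the sketch below uses a standard textbook approximation for comparing two group means. It assumes individual-level randomization, equal group sizes, a two-sided test at the 5% level and 80% power. The function name and the example effect sizes are hypothetical, and designs that randomize groups such as schools or villages generally need larger samples than this formula suggests.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, std_dev, alpha=0.05, power=0.80):
    """Approximate units needed per group to detect a difference in two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_power = NormalDist().inv_cdf(power)          # value matching the desired power
    n = 2 * ((z_alpha + z_power) * std_dev / effect_size) ** 2
    return math.ceil(n)


# Hypothetical effects, expressed in standard deviations of the outcome.
small_effect = sample_size_per_group(effect_size=0.2, std_dev=1.0)
large_effect = sample_size_per_group(effect_size=0.5, std_dev=1.0)

print(f"Units per group for a 0.2 SD effect: {small_effect}")
print(f"Units per group for a 0.5 SD effect: {large_effect}")
```

The required sample grows with the inverse square of the effect size, so halving the effect you want to detect roughly quadruples the number of units you need.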

7. Which data sources are available for the impact evaluation?

You will either collect data yourself or use data that’s already being collected for operational reasons. We call the latter “administrative data.” Generally, using administrative data will save you time and money. However, when using this source, it is important to be mindful of the quality and availability of the data and of the information it captures. If administrative data is available for your study, have a look at J-PAL’s guidance on using administrative data.

8. What is the timeline we are working with?

The timeline of the impact evaluation should leave enough time for outcomes to manifest. For example, if your program works to increase learning levels, it’s unlikely you’ll see a change in test scores until a few months later – and your data collection timelines should account for this.

You can potentially shorten the evaluation timeframe by measuring intermediate outcomes, in cases where there’s a well-established link between the intermediate and final outcome. For example, if you are evaluating the impact of polio vaccinations, you can measure polio vaccine take-up rather than incidence of disease because it is already well-established that polio vaccinations prevent polio.

The timeline of the impact evaluation will also depend on the purpose of the impact evaluation – you may need to shorten the evaluation timeframe if you need the results soon in order to make a major program decision.