Impact Measurement Guide Beta

Evidence reviews

An evidence review is a critical review of secondary sources related to your program. It helps you understand what information is available about the problems you’re trying to address and your program design: What interventions have already been tried? In what context? Were they successful? Why or why not?

An evidence review helps you build on top of programs with a proven track record of effectiveness, leverage existing resources, and avoid strategies that have already been proven ineffective – saving you time and resources in the long run. It can also strengthen your theory of change by helping you understand what assumptions are reasonable, given the existing evidence base.


Say you want to start a malaria prevention program. You are wondering whether to focus on spraying pesticide to kill mosquitoes or distributing insecticide-treated bed nets. You conduct an evidence review, and learn that multiple studies have confirmed that sleeping under bed nets leads to decreased rates of malaria. You also note that the contexts in which these studies took place are similar to the context where you plan to implement the program. Based on this evidence, you decide that your program will distribute bed nets. 

Which questions can an evidence review answer?

An evidence review can save you significant time and effort by helping you build on previous successes and avoid past failures. You can model your program after what has worked in similar contexts, or avoid programs that have not been impactful.

Even if your program includes proven interventions, you’ll still need to ensure they are implemented well and that they are appropriate for your context:

  • If you cannot deliver the program well, it will not matter that the program has a track record of effectiveness. For example, if, based on an evidence review, you make low-interest loans available to women, but program staff forge records and pocket the money themselves, you will not see the expected impact.
  • A program that was successful elsewhere may fail in your context if key conditions are different. For example, a program that built schools in rural Afghanistan was found to be highly effective in raising student test scores.[1] But the same program delivered in Peru or India, where most rural communities already have schools, would be unlikely to effect change.  

An evidence review can tell you which components of a program drive its impact. You can focus your efforts to ensure that these parts are working well in your program. For example, say you are providing treatment for tuberculosis. Your evidence review shows that while antibiotics are a proven treatment, patients not adhering to the medication regimen is a big problem. Based on this, you may decide to monitor patients’ treatment adherence.

 [1] Burde, D., & Linden, L. L. (2013). Bringing education to Afghan girls: A randomized controlled trial of village-based schools. American Economic Journal: Applied Economics, 5(3), 27-40.

Next steps - Conducting an evidence review

There are two steps for conducting an evidence review:

  1. Searching for relevant sources
  2. Reading through, understanding, and summarizing the research.

Should you do the evidence review yourself or hire an expert?

Most organizations can do an evidence review on their own. You may want to hire an external party to do an evidence review for the following reasons:

  • Your team does not have the time to search for and read through the relevant sources
  • Your team does not have prior experience consuming research
  • You would like a more objective summary from someone who is not heavily involved in program implementation

How to conduct your evidence review

Step-by-step instructions and template assessment

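There are three major steps for conducting an evidence review: compiling an annotated bibliography, reviewing each source critically, and writing up your findings.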

You first need to make a list of evidence that looks like it might be relevant. Later, you’ll go through this longlist in more depth to identify what is actually useful for you. This “longlist of evidence” is called an annotated bibliography.

In this step, we’re going to build an annotated bibliography. You’ll end up with a list of all sources you considered and skimmed, along with your quick takeaways for each source.

We suggest making the annotated bibliography in a spreadsheet using Google Sheets or Excel – have a look at this template to see which fields you should include. You may also use bibliography generators like Zotero or Mendeley for this purpose.
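If you would rather start the spreadsheet from a script, the following is a minimal sketch of an annotated bibliography written as a CSV file that opens in Google Sheets or Excel. The column names are assumptions drawn from the considerations listed later in this guide (topic, geography, type of study, results, limitations, source); the template may use different fields.

```python
import csv
import io

# Hypothetical column names; adjust them to match the template you use.
FIELDS = [
    "citation", "topic", "geography", "study_type",
    "results", "limitations", "source_type", "quick_takeaway",
]

def write_bibliography(entries, handle):
    """Write annotated-bibliography entries (a list of dicts) as CSV rows."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for entry in entries:
        # Fields not filled in yet are left blank, so a source you have
        # only skimmed can still be logged.
        writer.writerow({field: entry.get(field, "") for field in FIELDS})

# Example entry based on the study cited earlier in this guide.
entries = [
    {
        "citation": "Burde & Linden (2013)",
        "topic": "Village-based schools",
        "geography": "Rural Afghanistan",
        "study_type": "Randomized controlled trial",
        "results": "Raised student test scores",
        "source_type": "Peer-reviewed journal",
    }
]

buffer = io.StringIO()  # swap for open("bibliography.csv", "w", newline="") to save a file
write_bibliography(entries, buffer)
csv_text = buffer.getvalue()
```

Leaving unfilled fields blank rather than requiring them matches the advice to log every source you skim, even before you have formed a view on it.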

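Now, onto finding sources for your bibliography. You can use either (or both) of two methods to identify evidence to add to the annotated bibliography: Method 1 starts from existing systematic reviews, and Method 2 searches academic and non-academic sources directly.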

We suggest reading through both methods before deciding which one you plan to implement.

Now that you have your annotated bibliography, it’s time to start going through the sources listed. You’ll ultimately synthesize your findings across sources and analyze their implications for your program. We suggest starting a Word document for recording your takeaways and beginning the synthesis work.

For each source in your bibliography, add useful information to your Word document after thinking critically about:

1.     The quality of evidence

GRADE provides a standard framework for assessing whether the quality of the evidence is high, moderate, low, or very low.

 Other considerations can include:

  • Topic: What is the source about? How similar is the program(s) described in the source? Is it implemented at the same level – individual, household, village, school, etc.?
  • Geography: Region or country, developing or high-income countries 
  • Type of study: Randomized controlled trial, representative survey, observational study, meta-analysis
  • Results: What did the study find?
  • Limitations: Are there any issues or drawbacks in the study? For example, estimates may be biased due to non-experimental methods or non-random samples. Estimates could also be imprecise due to small sample size
  • Source: Peer-reviewed journal, policy brief, research report
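To see why a small sample makes an estimate imprecise, here is a rough sketch of how the margin of error around a sample mean shrinks as the sample grows. The standard deviation of 10 points is a made-up value, and the formula is the usual large-sample approximation for a 95% confidence interval, not something specific to any study in your review.

```python
import math

def ci_half_width(std_dev, n, z=1.96):
    """Approximate 95% confidence-interval half-width for a sample mean."""
    return z * std_dev / math.sqrt(n)

# Hypothetical outcome with a standard deviation of 10 points.
for n in (25, 100, 400):
    print(n, round(ci_half_width(10, n), 2))
# prints:
# 25 3.92
# 100 1.96
# 400 0.98
```

Quadrupling the sample size only halves the margin of error, which is why studies with a few dozen participants often cannot distinguish a modest effect from no effect.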

 2.     Whether the evidence is likely to apply to your context

The generalizability framework proposed by J-PAL is an excellent tool for understanding whether evidence from another context would apply well to your program.

Some general questions to consider: Did these studies measure similar outcomes? Were they conducted in a similar context (geography, target population, time period)? Were the programs similar to your program? What other contextual aspects make you confident that the study's findings are valid for your program's context?

3.     How the evidence helps you understand your program

Usually, useful evidence will do one or more of the following:

  • Provide the scope of the problem, through statistics about its prevalence and case studies that indicate why it needs to be addressed

For example: Learning outcomes in government schools in Bihar are poor and stagnant, resulting in students being inadequately prepared for the job market.  

  • Enumerate the major constraints to solving the problem

For example: Supply-side constraints (physical inputs, teachers, administration, rigid curriculum and automatic promotion to next grade) and demand-side constraints (beliefs about the returns to education, opportunity costs of keeping children in schools) hinder progress on learning outcomes.

  • Enumerate major attempts to solve this problem and how successful they have been

For example: Attempts to improve learning outcomes have either been school-focused (teacher motivation, facility improvement) or community-focused (information campaigns). While teacher motivation campaigns have been successful at improving learning outcomes, the frequency of the campaign and channels used to reach teachers greatly affect the success rate.

  • Illuminate some of the links and assumptions in your theory of change

For example: Student enrollment for this district is 98%, but attendance is only 43%, so our program targets attendance.

Now that you have reviewed the evidence, you need to write out your findings.

Your evidence review should lay out your proposed program, findings from previous studies, and gaps:

Your proposed program

  • Explain the theory of change for this approach to solving the policy problem. What are the major links and assumptions?
  • Within the context of the theory of change, illuminate some of the links and assumptions. If possible, use data and qualitative evidence to strengthen the theory of change (e.g. 98% of children are enrolled in school, but average attendance is only 43%, so the program targets attendance).

Findings from previous studies

Describe previous studies that have evaluated similar programs. How compelling is the evidence from these studies? Evaluate these studies keeping the following considerations in mind:

  • Relevance to your program: Did these studies measure similar outcomes and were they conducted in a similar context (geography, target population, time period)? How similar is the program? What other contextual aspects make you confident in the validity of the study to your program’s context?
  • Rigor: Are the estimates from other studies statistically significant? Precise? How robust was the methodology to potential bias? Was high-quality data analyzed?


Gaps

Describe the major gaps that remain in the theory of change, particularly any links or assumptions that lack evidence from a similar context to your program.




Types of impact evaluations

Impact evaluations compare the people who receive your program to a similar group of people who did not receive your program. Based on how this similar group is chosen, impact evaluations can be randomized controlled trials or quasi-experimental.

A randomized controlled trial (RCT) is considered the gold standard of impact evaluation. In an RCT, the program is randomly assigned among a target population. Those who receive the program are called the treatment group; those who do not receive the program are called the control group. We consider the outcomes from the control group to represent what would have happened to the treatment group in the absence of the program. By comparing outcomes among those who receive the program (the treatment group) to those who don’t (the control group), we can rigorously estimate the impact of the program. The random assignment of the program to the treatment and control group provides the rigor, as it ensures that the selection of people is not based on biased criteria that could affect the results.
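The logic of an RCT can be sketched in a few lines: assign the program at random, then compare average outcomes across the two groups. Everything below is illustrative. The function names, the 200 participants, and the simulated 5-point effect are assumptions, and a real evaluation would also report the uncertainty around the estimate.

```python
import random
from statistics import mean

def assign_groups(participant_ids, seed=0):
    """Randomly split participants into treatment and control groups of (near) equal size."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])

def estimate_impact(outcomes, treatment, control):
    """Difference in mean outcomes: treatment group minus control group."""
    treated_mean = mean(outcomes[i] for i in treatment)
    control_mean = mean(outcomes[i] for i in control)
    return treated_mean - control_mean

# Hypothetical data: 200 participants; the simulated program adds 5 points on average.
ids = range(200)
treatment, control = assign_groups(ids, seed=42)
noise = random.Random(1)
outcomes = {
    i: 50 + noise.gauss(0, 10) + (5 if i in treatment else 0)
    for i in ids
}
impact = estimate_impact(outcomes, treatment, control)
```

Because assignment depends only on the shuffle, nothing about a participant (motivation, income, location) can influence which group they land in, which is what lets the control group stand in for the treatment group.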

When a randomized design is not feasible, there are other, “quasi-experimental,” ways of constructing a valid comparison group. 

  • In a matched design we would match individuals who receive the program to individuals who don’t receive the program based on some observable characteristics (such as age, gender, number of years of schooling, etc.), and compare outcomes across these groups. 
  • Another common technique is regression discontinuity design, in which you create a cutoff based on which individuals are eligible to receive the program, and then compare outcomes from groups just below and just above the cutoff to estimate impact. 
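As a rough illustration of a matched design, the sketch below uses exact matching: people are grouped into cells that share the same observable characteristics, and outcomes are compared only within cells that contain both program recipients and non-recipients. The field names and data are hypothetical, and exact matching is just one of several matching approaches.

```python
from collections import defaultdict
from statistics import mean

def matched_impact(people, match_keys=("age_group", "gender")):
    """Average within-cell difference in outcomes, weighted by the number of recipients."""
    cells = defaultdict(lambda: {"treated": [], "comparison": []})
    for person in people:
        key = tuple(person[k] for k in match_keys)
        group = "treated" if person["received_program"] else "comparison"
        cells[key][group].append(person["outcome"])

    weighted_sum, total_weight = 0.0, 0
    for groups in cells.values():
        # Cells with only one kind of person have no valid comparison and are dropped.
        if groups["treated"] and groups["comparison"]:
            diff = mean(groups["treated"]) - mean(groups["comparison"])
            weighted_sum += diff * len(groups["treated"])
            total_weight += len(groups["treated"])
    if total_weight == 0:
        raise ValueError("no cell contains both recipients and non-recipients")
    return weighted_sum / total_weight

# Hypothetical records.
people = [
    {"age_group": "18-25", "gender": "F", "received_program": True, "outcome": 12},
    {"age_group": "18-25", "gender": "F", "received_program": False, "outcome": 10},
    {"age_group": "26-40", "gender": "M", "received_program": True, "outcome": 20},
    {"age_group": "26-40", "gender": "M", "received_program": True, "outcome": 22},
    {"age_group": "26-40", "gender": "M", "received_program": False, "outcome": 17},
    {"age_group": "41+", "gender": "F", "received_program": False, "outcome": 30},
]
impact = matched_impact(people)  # (2 * 1 + 4 * 2) / 3
```

Matching only balances the characteristics you can observe; if recipients differ from non-recipients in unmeasured ways, the estimate can still be biased.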

Matched designs and regression discontinuity designs are just two of many quasi-experimental techniques. J-PAL provides an overview of common methods of conducting an impact evaluation. All such methods seek to identify what would have happened to your target population if they had never received the program, and their success relies on the strength of the assumptions they make about whether the comparison group is a credible stand-in for your program’s target population. 
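The regression discontinuity idea can be sketched in a deliberately simplified form: compare average outcomes for people whose eligibility score falls in a narrow band on either side of the cutoff. The score, cutoff, bandwidth, and data below are hypothetical, and real analyses typically fit a regression on each side of the cutoff rather than comparing raw means.

```python
from statistics import mean

def rd_estimate(records, cutoff, bandwidth, eligible_below=True):
    """Naive discontinuity estimate: mean outcome of the eligible minus the
    ineligible, using only records within `bandwidth` of the cutoff."""
    below = [r["outcome"] for r in records
             if cutoff - bandwidth <= r["score"] < cutoff]
    above = [r["outcome"] for r in records
             if cutoff <= r["score"] <= cutoff + bandwidth]
    if not below or not above:
        raise ValueError("need observations on both sides of the cutoff")
    eligible, ineligible = (below, above) if eligible_below else (above, below)
    return mean(eligible) - mean(ineligible)

# Hypothetical: households with a poverty score below 50 receive the program.
records = [
    {"score": 20, "outcome": 10},   # far from the cutoff, ignored
    {"score": 46, "outcome": 30},
    {"score": 48, "outcome": 31},
    {"score": 49, "outcome": 32},
    {"score": 50, "outcome": 25},
    {"score": 51, "outcome": 26},
    {"score": 53, "outcome": 24},
    {"score": 80, "outcome": 60},   # far from the cutoff, ignored
]
impact = rd_estimate(records, cutoff=50, bandwidth=5)  # 31 - 25 = 6
```

The key assumption is that households just below and just above the cutoff are otherwise similar, so the estimate speaks most directly to people near the cutoff.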

Recommendation: Theory of Change

A theory of change is a narrative about how and why a program will lead to social impact. Every development program rests on a theory of change – it’s a crucial first step that helps you remain focused on impact and plan your program better.

You can use to create your Theory of Change.

Recommendation: Needs assessment

A needs assessment describes the context in which your program will operate (or is already operating). It can help you understand the scope and urgency of the problems you identified in the theory of change. It can also help you identify the specific communities that can benefit from your program and how you can reach them.

Once you’re satisfied that your program can be implemented as expected:

Your program's theory of change

You have a written plan for how your program will improve lives – great! Make sure to refer to it as you explore the different sections in the Impact Measurement Guide, as it is the foundation for any other method you’ll use.

Recommendation: Process Evaluation

A process evaluation can tell you whether your program is being implemented as expected, and if assumptions in your theory of change hold. It is an in-depth, one-time exercise that can help identify gaps in your program.

Once you are satisfied that your program can be implemented as expected:

Recommendation: Evidence Review

An evidence review summarizes findings from research related to your program. It can help you make informed decisions about what’s likely to work in your context, and can provide ideas for program features.

Once you are satisfied with your evidence review:

Recommendation: Monitoring

A monitoring system provides continuous real-time information about how your program is being implemented and how you’re progressing toward your goals. Once you set up a monitoring system, you would receive regular information on program implementation to track how your program is performing on specific indicators.

Once you are satisfied with your monitoring system:

How to compile evidence, method 2

You’re on this page because you want to search for evidence relevant to your program.

Here are some academic sources where you can search for relevant research:

  • Google Scholar is a search engine for academic papers – enter the keywords relevant to your program, and you’ll find useful papers in the top links
  • Once you identify some useful papers, you can consult their literature review and bibliography sections to find other papers that might be relevant
  • Speaking to a sector expert can guide you to useful literature

 However, don’t include only academic studies in your review! You should also consult:

  • Policy reports
  • Websites of organizations involved in this issue, such as think tanks, NGOs, or the World Bank
  • Public datasets
  • Your program archives – data and reports from earlier iterations of the program can be very valuable!

 The free Zotero plug-in provides an easy way to save, organize, and format citations collected during internet research. Note that Zotero can help you start your annotated bibliography, but it is not a substitute since it does not include any summary or interpretation of each study’s findings.

How to compile evidence, method 1

Start with the 3ie Development Evidence Portal, which has compiled over 3,700 evaluations and over 700 systematic evidence reviews. Steps 2-7 of this example are specific to locating evidence on the 3ie website, but you can also look for a review by J-PAL, the Campbell Collaboration, or Cochrane.

For example, suppose your goal is to increase immunization rates in India. Type “immunization vaccination” or other related terms into the search box, and click the magnifying glass icon to search.

The search results include individual studies, which are usually about a single program in a single location, as well as “systematic reviews”, which is what we are looking for because they are more comprehensive. To show only the systematic reviews, on the left of the screen under Filter Results, click on PRODUCTS and check the Systematic Reviews box. We’re now left with 17 evidence reviews related to immunization.

Now you might want to further narrow your search by region or country. In our example, suppose we want to see only those evidence reviews that contain at least one study from India. Click on COUNTRY and scroll down to click on India.

There are still 9 evidence reviews! Now read the titles of each review and start going through the ones that seem applicable to you.

  • Note that they are sorted with the most recent first, which is helpful as newer reviews tend to be more comprehensive.
  • 3ie has made the hardest part – assessing how strong the evidence is – easy for us. They use a 3-star scale to indicate the level of confidence in the systematic review.
  • In this example, the most recent review, Interventions For Improving Coverage Of Childhood Immunisation In Low- And Middle-Income Countries, is also the only one rated 3 stars (high quality). Click on its title.

The next page gives you an overview of the study. If it is “Open access”, this means you can read it for free – click “Go to source” below the star rating. If it isn’t open access, you can try some of the strategies in Step 9 to see if you can find the study for free elsewhere.

Clicking on “Go to source” opens a new tab with a PDF of the article. Don’t be intimidated by the length and technical terminology, and start with the summary – these articles usually include an “abstract” and sometimes a “plain language summary” and/or “summary of findings”.

The summary will likely be useful but too vague – dig into the review and look for details about which programs were tried and where, and how well they worked.

  • Keep track of which programs and studies seem particularly relevant, so that you can look them up later.
  • Consider the conclusions of the authors of the systematic review – are there trends that emerge across countries and contexts that are relevant for you? Overall, what are the main lessons from this review that you take away?
  • Be sure to read with a skeptical mindset – just because something worked in Japan doesn’t mean it will work in India – nor does it mean it won’t. And just because something worked in India before doesn’t mean it will continue to work – context is more complicated than country! Think about the evidence you find as well-documented ideas, but not the last word.
  • You can skip the parts about the methodology followed by the authors of the systematic review.
  • If the systematic review was helpful, add it to the bibliography.
  • Copy the citation from the references section and paste it into a search engine.
  • Usually, the first result will be the full paper on a journal’s website. If it is open access, you can read it directly. A lot of academic literature is not open access, unfortunately. Here are some tricks you can try if the article is not available:

a) Go back to your search and see if you can find a PDF posted on one of the authors’ websites – authors often share “working papers”, which might differ only slightly from the final paper, for free on their site.

b) Email the paper’s authors if you can’t find it elsewhere – many researchers are happy to share a copy with people looking to learn from their experience.

  • Read the paper. You probably don’t need to read all of it – the abstract, introduction, a description of the program, the results, and the conclusion are probably enough, whereas sections on technical methods can be skipped. This paper may also include references to other literature that could be relevant to your program. Keep track of these references so you can look into them later.
  • Add the paper to the annotated bibliography if it seems relevant.

For each piece of evidence that you find, there should be a clear justification for including it in the bibliography, such as: it is a landmark study in the topic (i.e. it has a large number of citations or is cited by many other studies in your review), it is relevant to specific aspects of this evaluation (such as measuring similar outcomes, being conducted in a similar context, or evaluating a similar intervention), etc. However, there are no absolute standards for inclusion, and since not all studies will be used in writing up the review, it is better to err on the side of including a study in the annotated bibliography.

Repeat step 9 for every paper from the systematic review that you found relevant!

  • If an organization ran the program that was evaluated, you might be able to find information on the organization’s website. If not, try emailing the organization.
  • Sometimes people share program details elsewhere – blogs, policy briefs, videos, etc. Try searching more about the program and you might find something.
  • In our example, we found a review from 2016, which is quite recent. However, keep in mind it can take 1-2 years to write and publish a review – and since the review is citing only published literature, the cited articles would be a year or two old as well. This means that it is likely that this review is missing anything done since 2014 or 2013, and hopefully the world has learned a lot about how to address your problem since then. While this shortcut helped you find relevant evidence, you might still want to use Method 2 so that you can see what has been learned since 2014.


  • This method used only academic sources. Non-academic sources are also a very useful source of information, and you should look into them. In Method 2, we have included a list of non-academic sources to consult.

Process evaluation vs monitoring: Which one do you need?

Process evaluations and monitoring both provide information on how your program is running and whether it is meeting expectations. The key difference is that process evaluations are a one-off activity, while monitoring is ongoing. That means that process evaluations are often more intensive exercises to collect more data and dive deeper into the theory of change. In contrast, ongoing monitoring must not overburden program staff and often tracks just a few high-priority indicators that are critical to program success.  

Consider the following questions to help you decide between conducting a process evaluation and building a monitoring system:  

1.     Are you interested in identifying specific problems or general problems? 

Process evaluations typically identify general or systemic problems along the theory of change, whereas monitoring typically identifies specific entities (e.g. service providers or locations) that need more attention.

2.     Are you looking to hold program staff accountable?   

Both process evaluations and monitoring are implemented for learning – is our program being implemented as planned? If not, at which steps is it breaking down? However, if you are seeking an accountability system, monitoring is better-suited as it is continuous, whereas a process evaluation is a one-time exercise.  

3.     Do you need ongoing data on how your program is performing?  

A process evaluation typically offers a snapshot in time, whereas monitoring involves ongoing data collection and analysis for the entire duration of the program. For example, a process evaluation may do in-depth interviews with program participants on their experiences, whereas a monitoring system might collect data on just a few questions related to beneficiary satisfaction.  

4.     Do you need comprehensive data?

A process evaluation is typically based on a sample, while monitoring is usually comprehensive. For example, in a teacher training program, you would monitor the training of all teachers (because it is useful to know exactly which teachers did not attend the training), whereas in a process evaluation, you would interview a subset of teachers to understand the reasons why they did not attend the training.
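The teacher training example can be sketched as follows: monitoring checks every teacher's attendance record, while a process evaluation draws a random subset of absent teachers to interview. The attendance data, function names, and sample size are hypothetical.

```python
import random

def teachers_to_follow_up(attendance):
    """Monitoring: check every teacher and list all who missed the training."""
    return sorted(name for name, attended in attendance.items() if not attended)

def interview_sample(absent_teachers, sample_size, seed=0):
    """Process evaluation: draw a random subset of absent teachers to interview."""
    rng = random.Random(seed)
    return rng.sample(absent_teachers, min(sample_size, len(absent_teachers)))

# Hypothetical attendance records for 20 teachers; every fourth teacher was absent.
attendance = {f"teacher_{i:02d}": (i % 4 != 0) for i in range(1, 21)}

absent = teachers_to_follow_up(attendance)              # complete list, from monitoring
to_interview = interview_sample(absent, sample_size=3)  # subset, for in-depth interviews
```

Drawing the interview sample at random, rather than picking whoever is easiest to reach, helps the reasons you hear reflect absent teachers as a whole.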