Impact Measurement Guide Beta

Process evaluation case study: CSF EdTech

Process evaluations of 12 EdTech solutions helped the nonprofit Central Square Foundation identify which products to test for scale-up across India.

Background

The vast majority of education technology (EdTech) in India caters to students from high-income households. For EdTech to have an impact on education at scale, more products must be developed that offer vernacular languages, use appropriate cultural references, target a range of learning levels, and are sold at lower price points.

EdTech is thus a core part of the strategy of the Central Square Foundation (CSF), a nonprofit working towards ensuring quality school education for all children in India. CSF is working to create a pipeline of contextualized EdTech products for low-income students, generate evidence around the efficacy of such solutions in India, and catalyze large-scale government adoption of impactful solutions.

Question

CSF sought to identify high-performing EdTech solutions, which it would adapt for implementation by the government. These solutions would be tested, and impactful products would ultimately be scaled up across the country.

To identify which products CSF should consider for potential scale-up, IDinsight conducted rapid process evaluations of 12 promising EdTech solutions. The goals of the process evaluations were twofold:  

  1. Evaluate how well each product was functioning currently
  2. Evaluate how appropriate each product would be for scale-up in government schools

Approach

Evaluation strategy

We evaluated products across five research categories: design, provision of support by product-maker, adoption of product, student engagement with product, and perception that using product would increase learning. For each research category, products were given a rating of ‘poor,’ ‘satisfactory,’ or ‘high’ performance.

Data collection

The data for the evaluation came from three sources: a) interviews and surveys with students and school staff, b) backend user data, and c) expert panel review.  

 1.  Interviews and surveys with students and school staff

In total, we collected information from over 1,600 school staff and students in 10 states.

For EdTech solutions meant to be used in schools, research teams visited 3-5 schools per product to collect data on user perceptions and experiences. Schools were sampled purposively, targeting variation such as rural or urban settings, varying use levels, and differences in implementation models. For each product, we:

  • Conducted semi-structured interviews with 10-15 teachers, 3-5 administrators, and 10-15 students
  • Administered paper-and-pencil surveys to 100-150 students
  • Observed 3-5 product use sessions in classrooms

For the three EdTech products meant to be used at home, research teams conducted phone interviews with 183, 154 and 64 students respectively and about 30 parents in total. The sampling process was different for each of the three products due to operational constraints:

  • For the first product, we sampled users purposively based on use behavior
  • For the second product, a random sample from the population of users was drawn
  • For the third product, the product company asked users to sign up for interviews

 2.  Backend user data

Where possible, we collected backend user data from product companies, covering over 17,000 users and 3,900 schools. The data focused on the behavior of students and teachers, such as login frequency, questions attempted, and resources accessed. We used this data to understand product use trends, such as frequency of engagement.

 3.  Expert review

Each product was reviewed by 4-6 education and technology experts. They reviewed the product based on content, pedagogy, instructional design, user experience, and backend technology.

Results

  1. CSF used these results to inform their EdTech strategy as they work to build a pipeline of promising, contextualized solutions to be tested and scaled in government schools.
  2. The published findings gave EdTech companies, foundations and governments a nuanced understanding of different aspects of products and how they interact with different use cases.
  3. The tools created for the rapid process evaluations have been used by state governments and other foundations to evaluate EdTech products. CSF continues to build on and enhance the framework created in this project in the subsequent versions of the EdTech Lab.
  4. The evaluations revealed gaps in the EdTech marketplace that could be met by funders and product companies, such as developing quality content in non-English languages.
  5. The process evaluations also revealed strengths and opportunities for improvement across products, such as:  
    • In many cases, EdTech content was at an appropriate level and the product was consistently used in the way intended.
    • Few products matched instruction to the learning level of students. Teacher and student use patterns showed that both user types looked for capability-aligned content when available.
    • Product companies rarely designed the products to work well for low-income students. For example, few products had quality non-English language navigation or content, a critical feature given limited English proficiency among this target group.
    • Practitioners often had difficulty effectively implementing products. Despite receiving training, teachers across products and implementation models needed regular, sometimes daily, help with simple tasks such as start-up and basic navigation.

Types of impact evaluations

Impact evaluations compare the people who receive your program to a similar group of people who did not receive your program. Based on how this similar group is chosen, impact evaluations can be randomized controlled trials or quasi-experimental.

A randomized controlled trial (RCT) is considered the gold standard of impact evaluation. In an RCT, the program is randomly assigned among a target population. Those who receive the program are called the treatment group; those who do not receive the program are called the control group. We consider the outcomes from the control group to represent what would have happened to the treatment group in the absence of the program. By comparing outcomes among those who receive the program (the treatment group) to those who don’t (the control group), we can rigorously estimate the impact of the program. The random assignment of the program to the treatment and control group provides the rigor, as it ensures that the selection of people is not based on biased criteria that could affect the results.
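The comparison described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a template for a real evaluation: the baseline scores, the sample size of 2,000, the simulated effect of 5 points, and the function name `run_rct` are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def run_rct(baseline_outcomes, true_effect, seed=0):
    """Randomly assign half of the units to treatment and estimate impact.

    `baseline_outcomes` is hypothetical data: what each unit's outcome
    would be without the program. `true_effect` is the simulated program
    effect added to every treated unit.
    """
    rng = random.Random(seed)
    ids = list(range(len(baseline_outcomes)))
    rng.shuffle(ids)  # random assignment: order no longer depends on any trait
    half = len(ids) // 2
    treatment_ids, control_ids = ids[:half], ids[half:]

    treatment_outcomes = [baseline_outcomes[i] + true_effect for i in treatment_ids]
    control_outcomes = [baseline_outcomes[i] for i in control_ids]

    # The control group stands in for what would have happened to the
    # treatment group without the program, so the difference in means
    # is the impact estimate.
    return statistics.mean(treatment_outcomes) - statistics.mean(control_outcomes)


# Hypothetical baseline test scores for 2,000 students
data_rng = random.Random(42)
scores = [data_rng.gauss(50, 10) for _ in range(2000)]

estimate = run_rct(scores, true_effect=5.0, seed=1)
print(f"Estimated impact: {estimate:.2f} (simulated true effect: 5.00)")
```

Because assignment is random, the estimate differs from the simulated effect only by chance, and that gap shrinks as the sample grows.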

When a randomized design is not feasible, there are other, “quasi-experimental,” ways of constructing a valid comparison group. 

  • In a matched design we would match individuals who receive the program to individuals who don’t receive the program based on some observable characteristics (such as age, gender, number of years of schooling, etc.), and compare outcomes across these groups. 
  • Another common technique is regression discontinuity design, in which a cutoff determines who is eligible to receive the program, and you compare outcomes for groups just below and just above the cutoff to estimate impact. 
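To make the regression discontinuity idea concrete, here is a rough sketch on simulated data. Everything in it is an assumption for illustration: the eligibility score, the cutoff of 40, the bandwidth of 10, the simulated effect of 4, and the helper names `fit_line` and `rd_estimate`. Rather than a raw comparison of means, it fits a straight line on each side of the cutoff and compares the two predictions at the cutoff, which is one common way to account for the outcome changing with the score itself.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Estimate impact at the cutoff; units scoring below it get the program."""
    pairs = list(zip(scores, outcomes))
    below = [(s, y) for s, y in pairs if cutoff - bandwidth <= s < cutoff]
    above = [(s, y) for s, y in pairs if cutoff <= s <= cutoff + bandwidth]

    a_lo, b_lo = fit_line(*zip(*below))  # eligible side (received program)
    a_hi, b_hi = fit_line(*zip(*above))  # ineligible side (comparison)

    # Difference between the two fitted lines evaluated at the cutoff
    return (a_lo + b_lo * cutoff) - (a_hi + b_hi * cutoff)


# Hypothetical data: eligibility score from 0 to 100, program given below 40
rng = random.Random(7)
scores = [rng.uniform(0, 100) for _ in range(20000)]
outcomes = [
    20 + 0.5 * s + (4.0 if s < 40 else 0.0) + rng.gauss(0, 5)
    for s in scores
]

estimate = rd_estimate(scores, outcomes, cutoff=40, bandwidth=10)
print(f"Estimated impact at the cutoff: {estimate:.2f} (simulated true effect: 4.00)")
```

The estimate applies to people near the cutoff. Whether it generalizes to those far from it is an assumption the evaluator has to defend.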

Matched designs and regression discontinuity designs are just two of many quasi-experimental techniques. J-PAL provides an overview of common methods of conducting an impact evaluation. All such methods seek to identify what would have happened to your target population if they had never received the program, and their success relies on the strength of the assumptions they make about whether the comparison group is a credible stand-in for your program’s target population. 

Recommendation: Theory of Change

A theory of change is a narrative about how and why a program will lead to social impact. Every development program rests on a theory of change – it’s a crucial first step that helps you remain focused on impact and plan your program better.

You can use diagrams.net to create your Theory of Change.

Recommendation: Needs assessment

A needs assessment describes the context in which your program will operate (or is already operating). It can help you understand the scope and urgency of the problems you identified in the theory of change. It can also help you identify the specific communities that can benefit from your program and how you can reach them.

Your program's theory of change

You have a written plan for how your program will improve lives – great! Make sure to refer to it as you explore the different sections in the Impact Measurement Guide, as it is the foundation for any other method you’ll use.

Recommendation: Process Evaluation

A process evaluation can tell you whether your program is being implemented as expected, and if assumptions in your theory of change hold. It is an in-depth, one-time exercise that can help identify gaps in your program.

Recommendation: Evidence Review

An evidence review summarizes findings from research related to your program. It can help you make informed decisions about what’s likely to work in your context, and can provide ideas for program features.

Recommendation: Monitoring

A monitoring system provides continuous real-time information about how your program is being implemented and how you’re progressing toward your goals. Once you set up a monitoring system, you would receive regular information on program implementation to track how your program is performing on specific indicators.

How to compile evidence, method 2

You’re on this page because you want to search for evidence relevant to your program.

Here are some academic sources where you can search for relevant research:

  • Google Scholar is a search engine for academic papers – enter the keywords relevant to your program, and you’ll find useful papers in the top links
  • Once you identify some useful papers, you can consult their literature review and bibliography sections to find other papers that might be relevant
  • Speaking to a sector expert can guide you to useful literature

 However, don’t include only academic studies in your review! You should also consult:

  • Policy reports
  • Websites of organizations involved in this issue, such as think tanks, NGOs, or the World Bank
  • Public datasets
  • Your program archives – data and reports from earlier iterations of the program can be very valuable!

 The free Zotero plug-in provides an easy way to save, organize, and format citations collected during internet research. Note that Zotero can help you start your annotated bibliography, but it is not a substitute since it does not include any summary or interpretation of each study’s findings.

How to compile evidence, method 1

Start with the 3ie Development Evidence Portal, which has compiled over 3,700 evaluations and over 700 systematic evidence reviews. Steps 2-7 of this example are specific to locating evidence on the 3ie website, but you can also look for a review by J-PAL, the Campbell Collaboration, or Cochrane.

For example, suppose your goal is to increase immunization rates in India. Type “immunization vaccination” or other related terms into the search box, and click the magnifying glass icon to search.

The search results include individual studies, which are usually about a single program in a single location, as well as “systematic reviews”, which is what we are looking for because they are more comprehensive. To show only the systematic reviews, on the left of the screen under Filter Results, click on PRODUCTS and check the Systematic Reviews box. We’re now left with 17 evidence reviews related to immunization.

Now you might want to further narrow your search by region or country. In our example, suppose we want to see only those evidence reviews that contain at least one study from India. Click on COUNTRY and scroll down to click on India.

There are still 9 evidence reviews! Now read the titles of each review and start going through the ones that seem applicable to you.

  • Note that they are sorted with the most recent first, which is helpful as newer reviews tend to be more comprehensive.
  • 3ie has made the hardest part – assessing how strong the evidence is – easy for us. They use a 3-star scale to indicate the level of confidence in the systematic review.
  • In this example, the most recent review, Interventions For Improving Coverage Of Childhood Immunisation In Low- And Middle-Income Countries, is also the only one rated 3 stars (high quality). Click on its title.

The next page gives you an overview of the study. If it is “Open access”, this means you can read it for free – click “Go to source” below the star rating. If it isn’t open access, you can try some of the strategies in Step 9 to see if you can find the study for free elsewhere.

Clicking on “Go to source” opens a new tab with a PDF of the article. Don’t be intimidated by the length and technical terminology, and start with the summary – these articles usually include an “abstract” and sometimes a “plain language summary” and/or “summary of findings”.

The summary will likely be useful but too vague – dig into the review and look for details about which programs were tried and where, and how well they worked.

  • Keep track of which programs and studies seem particularly relevant, so that you can look them up later.
  • Consider the conclusions of the authors of the systematic review – are there trends that emerge across countries and contexts that are relevant for you? Overall, what are the main lessons from this review that you take away?
  • Be sure to read with a skeptical mindset – just because something worked in Japan doesn’t mean it will work in India – nor does it mean it won’t. And just because something worked in India before doesn’t mean it will continue to work – context is more complicated than country! Think about the evidence you find as well-documented ideas, but not the last word.
  • You can skip the parts about the methodology followed by the authors of the systematic review.
  • If the systematic review was helpful, add it to the bibliography.
  • Copy the citation from the references section and paste it into a search engine.
  • Usually, the first result will be the full paper on a journal’s website. If it is open access, you can read it directly. A lot of academic literature is not open access, unfortunately. Here are some tricks you can try if the article is not available:

a) Go back to your search and see if you can find a PDF posted on one of the authors’ websites – authors often share “working papers”, which might differ only slightly from the final paper, for free on their site.

b) Email the paper’s authors if you can’t find it elsewhere – many researchers are happy to share a copy with people looking to learn from their experience.

  • Read the paper. You probably don’t need to read all of it – the abstract, introduction, a description of the program, the results, and the conclusion are probably enough, whereas sections on technical methods can be skipped. This paper may also include references to other literature that could be relevant to your program. Keep track of these references so you can look into them later.
  • Add the paper to the annotated bibliography if it seems relevant.

For each piece of evidence that you find, there should be a clear justification for including it in the bibliography, such as: it is a landmark study in the topic (i.e. it has a large number of citations or is cited by many other studies in your review), it is relevant to specific aspects of this evaluation (such as measuring similar outcomes, being conducted in a similar context, or evaluating a similar intervention), etc. However, there are no absolute standards for inclusion, and since not all studies will be used in writing up the review, it is better to err on the side of including a study in the annotated bibliography.

Repeat step 9 for every paper from the systematic review that you found relevant!

  • If an organization ran the program that was evaluated, you might be able to find information on the organization’s website. If not, try emailing the organization.
  • Sometimes people share program details elsewhere – blogs, policy briefs, videos, etc. Try searching more about the program and you might find something.
  • In our example, we found a review from 2016, which is quite recent. However, keep in mind that it can take 1-2 years to write and publish a review, and since the review cites only published literature, the cited articles would be a year or two old as well. This means the review is likely missing anything done since 2013 or 2014, and hopefully the world has learned a lot about how to address your problem since then. While this shortcut helped you find relevant evidence, you might still want to use Method 2 so that you can see what has been learned since then.

 

  • This method used only academic sources. Non-academic sources are also a very useful source of information, and you should look into them. In Method 2, we have included a list of non-academic sources to consult.

Process evaluation vs monitoring: Which one do you need?

Process evaluations and monitoring both provide information on how your program is running and whether it is meeting expectations. The key difference is that process evaluations are a one-off activity, while monitoring is ongoing. That means that process evaluations are often more intensive exercises to collect more data and dive deeper into the theory of change. In contrast, ongoing monitoring must not overburden program staff and often tracks just a few high-priority indicators that are critical to program success.  

Consider the following questions to help you decide between conducting a process evaluation and building a monitoring system:  

1.     Are you interested in identifying specific problems or general problems? 

Process evaluations typically identify general or systemic problems along the theory of change, whereas monitoring typically identifies specific entities (e.g. service providers or locations) that need more attention.

2.     Are you looking to hold program staff accountable?   

Both process evaluations and monitoring are implemented for learning: is our program being implemented as planned? If not, at which steps is it breaking down? However, if you are seeking an accountability system, monitoring is better suited because it is continuous, whereas a process evaluation is a one-time exercise.  

3.     Do you need ongoing data on how your program is performing?  

A process evaluation typically offers a snapshot in time, whereas monitoring involves ongoing data collection and analysis for the entire duration of the program. For example, a process evaluation may do in-depth interviews with program participants on their experiences, whereas a monitoring system might collect data on just a few questions related to beneficiary satisfaction.  

4.     Do you need comprehensive data?

A process evaluation is typically based on a sample, while monitoring is usually comprehensive. For example, in a teacher training program, you would monitor the training of all teachers (because it is useful to know exactly which teachers did not attend the training), whereas in a process evaluation, you would interview a subset of teachers to understand the reasons why they did not attend the training.