In almost any school in the country today, you can find an app or program that claims to change education as we know it. Yet schools are littered with products that have not changed anything beyond teachers’ desktop screens.
As researchers focusing on education technology, we see this often: interactive whiteboards covered in posters, desktop computers holding up plants, older devices that do not work with a newer assessment system. The list goes on.
Our work at the nonprofit Education Development Center’s Center for Children and Technology focuses on how education technology can be used to support learning. The truth is that edtech products that foster more learning than would happen in analog settings can be difficult to find. When we do see effective edtech products in practice, the view is exciting: We see kids engaged, teachers energized about the kinds of thinking their students are generating and strong learning outcomes that result from well-made tools matched to the students and educators using them.
So naturally, one of the big questions we face is, how can we help ensure effective edtech happens more often? The key lies in helping educators to look at the available evidence and make careful decisions. In many cases, that’s easier said than done.
Making Time for Research
The promise of media and technology to transform student learning has led schools in the U.S. to spend an estimated $8 billion annually on software and digital resources, not including hardware. At the same time, educators struggle to identify edtech products that meet their needs and improve student learning. One recent study of school spending estimated that up to two-thirds of software licenses are never activated.
When it comes to making decisions about edtech purchases, teachers and administrators are pretty much on their own. There are few systems in place to help identify products that have been effective in producing positive learning outcomes, yet there is a flourishing business in promoting programs based on little or no evidence at all.
We recognize that conducting detailed reviews of evidence requires substantial time and resources. Teachers should not have to ferret out findings from a stack of research studies just to figure out which programs will be a good fit for their students. But the alternative is even less palatable: spending money and time on products that don’t work and don’t support learning. To mitigate the risks inherent in trying out a potentially ineffective product, we suggest that the time and effort spent evaluating the evidence should be proportional to the amount of time students will spend using the product.
Reviewing research studies to determine effectiveness can be a complex process. Part of that complexity is that decision-making is not typically linear but rather iterative, unfolding over time. However, we have identified a set of steps that any educator can work through and that can help simplify and streamline a search for effective edtech products.
These guidelines are based on our years of experience designing and conducting studies to assess the impact of educational technology on learning, and they draw on what we have learned about sizing up products and their claims. Our hope is that by sharing these steps we can help teachers, school administrators and others tasked with selecting edtech products to make choices that match the needs and learning goals of students, and that allow the potential of well-designed and researched edtech to flourish.
Finding Existing Evidence
Your first step in this process is locating any available evidence of impact, including research, evaluation or impact studies done on the product you are considering. Research databases, including Google Scholar, ResearchGate and JSTOR, can serve as resources for peer-reviewed studies. A few websites offer summaries of rigorous research on educational interventions, including evidenceforessa.org, Best Evidence Encyclopedia and the What Works Clearinghouse. We also suggest emailing researchers for access to journal articles that are behind paywalls. (If no studies have been conducted, or if you are not provided access to them, then consider other edtech options that do have evidence.)
Here are five questions to ask when assessing the strength of the evidence. You should be able to find the answers to these questions in published studies—or you may have to ask the edtech vendor or research partner to provide the information you need. Once you have answered these questions, you will have a rough rubric for gauging your confidence that the claims made about different products will hold for your students or your school.
Does the study design provide a compelling comparison group?
It is essential to understand the comparison group (sometimes called the control group) because the strength of a research claim hinges on the extent to which there is a plausible comparison group.
A comparison group represents what would have happened without the intervention. Thus, the difference between the treatment and comparison groups is the impact. Below we include several ways in which studies may or may not provide a comparison group, leading to varying levels of strength in terms of study findings:
- Randomized controlled study: These are among the strongest research designs: a researcher randomly assigns students either to a treatment group that receives an intervention or to a comparison group that does not. Random assignment is intended to ensure the treatment and comparison groups are the same along both measurable characteristics, such as test scores, and unmeasured characteristics, such as motivation to learn. Strong randomized designs need to ensure the comparison and treatment groups are similar on baseline characteristics before the intervention begins. This could mean that the groups of students have similar scores on a state test or a pretest relevant to the study. But it also means that they are similar in other important characteristics, such as grade, class size, gender, English language ability and special education services.
- Comparison group studies: Comparison group and “quasi-experimental” studies attempt to identify a roughly equivalent comparison group without randomization. This type of study is weaker than a well-designed randomized controlled trial because we cannot be sure the comparison group is similar to the group receiving the intervention. For example, a quasi-experimental study might test a one-to-one laptop program in School A, which is excited about new technology, and then compare gains on state assessments before and after the program with those of School B. If School A’s teachers are more motivated and choose to participate in the intervention while School B’s teachers opt out, it is impossible to disentangle the effects of the intervention from the qualities of the teachers who elected to use it.
- Student data for one group of students: Some studies include only one group of students—for example, providing teachers with an online reading curriculum to try out with students and describing change between state English language arts tests administered before and after using the curriculum. These studies provide the weakest evidence as it is impossible to distinguish the effect of the edtech product from the myriad interventions schools typically use, such as other reading-based activities throughout the school day, during the summer or in other subjects.
How does the study measure success?
Study-specific measures designed by the research team provide weaker evidence than commonly used measures with independently documented reliability and validity, such as state test scores or other third-party assessments. However, some interventions may have shorter- or longer-term impacts not easily measured in a school year, making use of more standardized measures a challenge. In this case, looking for indicators of implementation might provide information about the value of an intervention, such as how many teachers attended the professional development, how many teachers and students used the intervention in their classrooms and for how long during the school year.
Who did the research?
It is important to follow the money. Studies paid for by the organization or company that produced the program or resources tested are typically less reliable than studies that are externally funded. A company or organization’s interests in positive findings can bias or lead to selective reporting about the findings.
Even external researchers can introduce unhelpful subjectivity, so mitigating bias matters here too. Look for studies where the individuals who assessed children or analyzed results were blind to whether the students were in the treatment or comparison condition. Likewise, place a higher premium on studies that have been pre-registered publicly (for example, with the Registry of Efficacy and Effectiveness Studies), which keeps researchers from being tempted to adjust analyses to find a positive effect.
How much learning happened during the study?
Even very small effects on learning can be “statistically significant” with a large sample size. A statistically significant finding is not enough to indicate a program or product “works.” The research should describe the size of the effect and explain whether it is meaningful, for example, by describing how much students typically grow on the measure over a year and how much greater the growth from this intervention is than would happen otherwise (e.g., students gained three months’ equivalent of targeted math skills), or how it compares with growth from similar kinds of interventions. Moreover, make sure the outcome (test scores, absence rates, etc.) is something you care about.
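To see why significance alone is not enough, consider a minimal Python sketch (the score values below are invented for illustration). The same tiny effect of 0.05 standard deviations is nowhere near significant with 100 students per group, but becomes “highly significant” with 10,000 per group, even though the practical impact is identical:

```python
import math

def cohens_d(mean_treat, mean_comp, pooled_sd):
    """Standardized effect size: difference in group means, in SD units."""
    return (mean_treat - mean_comp) / pooled_sd

def two_sample_p(mean_treat, mean_comp, sd, n_per_group):
    """Two-sided p-value for a difference in means, using a z-test
    approximation (reasonable for the large samples discussed here)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = (mean_treat - mean_comp) / standard_error
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

# Hypothetical test scores: treatment group averages half a point higher
# on a test with a standard deviation of 10 points.
effect = cohens_d(500.5, 500.0, 10.0)          # d = 0.05, a very small effect
p_small_study = two_sample_p(500.5, 500.0, 10.0, 100)    # p is large; not significant
p_huge_study = two_sample_p(500.5, 500.0, 10.0, 10000)   # p < 0.001; "significant"
```

The effect size is identical in both studies; only the sample size changed. That is why the write-up should report the magnitude of the effect, not just the p-value.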
Who participated in the study?
To have confidence that the findings of a study reflect something that would likely happen in another context, the sample needs to be relatively large. The larger the sample, the more confident you can be about the findings. As recommended under the federal Every Student Succeeds Act, a sufficient sample size is around 350 students or more, or 50-plus classrooms or schools that each contain more than 10 students.
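A back-of-envelope power calculation illustrates why a few hundred students matter. Under standard textbook assumptions for comparing two equal-sized groups (5 percent significance, 80 percent power, and ignoring the clustering of students within classrooms, which in practice demands even larger samples), the smallest effect a study can reliably detect shrinks as the sample grows:

```python
import math

def minimum_detectable_effect(n_per_group, z_alpha=1.96, z_power=0.84):
    """Smallest true effect (in SD units) a two-group study can reliably
    detect, assuming equal group sizes, two-sided 5% significance
    (z_alpha = 1.96) and 80% power (z_power = 0.84)."""
    return (z_alpha + z_power) * math.sqrt(2.0 / n_per_group)

# ~350 students split evenly into treatment and comparison groups:
mde_350 = minimum_detectable_effect(175)   # roughly 0.3 SD
# A small pilot of 50 students per group can only detect much larger effects:
mde_100 = minimum_detectable_effect(50)
```

With roughly 350 students, a simple two-group study can detect effects of about 0.3 standard deviations; smaller studies can only distinguish much larger, and therefore rarer, effects from chance.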
Also, the study sample should include a broad range of students, schools or districts that are demographically similar to yours and that face similar constraints, capabilities and resources. Edtech can exacerbate inequalities if it is not accessible or adaptable to students who have different learning needs, or if schools vary considerably in their ability to provide things like consistent tech support or reliable access to high-speed internet.
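For readers who like checklists, the five questions above can be folded into a rough scoring rubric. The criteria labels, the 0-to-2 scale and the example ratings below are illustrative assumptions on our part, not a standard instrument:

```python
# Each criterion maps to one of the five questions and is rated
# 0 (weak), 1 (moderate) or 2 (strong). Labels are illustrative.
CRITERIA = [
    "comparison_group",   # 2 = randomized, 1 = quasi-experimental, 0 = single group
    "outcome_measure",    # 2 = validated third-party measure, 1 = study-specific
    "independence",       # 2 = external and pre-registered, 1 = external, 0 = vendor-run
    "effect_size",        # 2 = meaningful and contextualized, 1 = reported, 0 = absent
    "sample",             # 2 = large and similar to your setting, 1 = one of the two
]

def evidence_score(ratings: dict) -> float:
    """Average rating across criteria: 0.0 (no evidence) to 2.0 (strong)."""
    return sum(ratings[criterion] for criterion in CRITERIA) / len(CRITERIA)

# Example: a randomized study with a validated measure, run externally but
# not pre-registered, a modest effect and a large, comparable sample.
example_study = {"comparison_group": 2, "outcome_measure": 2,
                 "independence": 1, "effect_size": 1, "sample": 2}
score = evidence_score(example_study)  # 1.6 out of 2.0
```

A single number cannot replace judgment, but tallying the questions this way makes it easy to compare several candidate products on the same footing.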
Finally, if a tech product has great evidence but does not meet student needs, the evidence is meaningless. Structure your edtech purchase decision process from the beginning by focusing on student and teacher needs, identifying the underlying problem or challenge you want to address and its root cause. Then assess whether educational technology can be part of the solution.