Help! This Edtech Company Says It Uses AI. (What Does That Mean? What Should I Ask?)

You know artificial intelligence has hit the mainstream when the elderly grocer at the local supermarket asks how he can get into AI.

It wasn’t always this way. When I started graduate-level work with machine learning tools at UC Berkeley in 2011, few people outside academia were discussing artificial intelligence. Commercially viable AI still seemed pretty far off. When I co-founded WriteLab in early 2014, a few prominent labs were visibly but cautiously putting together startups powered by that technology.

Come 2017, and AI—or at least the buzz about it—is everywhere. Whether you’ve encountered it as machine learning, natural language processing, or image recognition, there is more and more hype about the power of AI. It’s exciting, to be sure, but for the education community, it can be difficult to pinpoint the utility and efficacy of this technology. Several reasonable questions come to mind: What exactly is it? How can we trust it? And does it address real pain points for teachers and students?

Below are a few questions to help educators and investors make sense of it all, especially right after an entrepreneur mentions how well his or her convolutional neural net is working. (Say what?)

Questions Educators Should Ask

How is this going to help me do what I already do, but better and faster?

I’ve seen two major kinds of AI-based education tools: those that operate by themselves and are fully automated, and those that assist and gradually learn from teachers, students, and administrators as they use the products.

The first bucket of products typically boasts features like chatbot capability, plagiarism detection, and automated essay scoring. The AI powers through large datasets to produce its results, but there is often no person in the workflow to correct the output of the machine. These tools may be difficult to evaluate holistically, since you have to determine whether the output makes sense on a case-by-case basis.

The second category focuses on making certain workflows—such as grading, managing enrollments, and scheduling classes—more efficient. These tools will be easier for educators to test and assess, since it should be clear how well they fit in with existing processes (and whether they enhance them). It should also be easier to gauge whether these tools are automating the right kinds of grunt work—and how precisely they are doing so.

For AI-based education tools that don’t solve a specific workflow problem, entrepreneurs will struggle to explain how the product helps you get faster or better at your job. If you don’t receive a satisfactory answer, dig a little deeper and ask harder questions.

How reliable is the data that the AI is trained with?

Artificial intelligence only works when the system is properly trained to make decisions based on an existing data set. So it’s critical that the data is both consistent and accurate, and that it reflects the logic of your educational priorities and processes.

For most AI apps in education, engineers rely on a process called “supervised learning,” which requires the data to be annotated in some way, such as with a graded assignment. Consistency in this case means that graders are evaluating essays based on the same rubric, and that they award or subtract equal points for the same errors.
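To make “supervised learning” concrete, here is a toy sketch. The features and grades are invented for illustration: each training example pairs a couple of measurable properties of an essay with the grade a teacher assigned, and the model predicts a grade for a new essay by finding the most similar labeled one.

```python
# Toy supervised learning: each training example pairs hypothetical essay
# features (word count, spelling errors) with a teacher-assigned grade label.
from math import dist

# (word_count, spelling_errors) -> teacher-assigned grade
training_data = [
    ((450, 2), "A"),
    ((400, 8), "B"),
    ((250, 5), "C"),
    ((480, 1), "A"),
    ((300, 9), "C"),
]

def predict(features):
    """1-nearest-neighbour: return the grade of the most similar labeled essay."""
    nearest = min(training_data, key=lambda example: dist(example[0], features))
    return nearest[1]

print(predict((440, 3)))  # closest to the "A" essays, so it predicts "A"
```

Notice that the model is only as good as the labels: if the same essay could earn an “A” from one grader and a “C” from another, the predictions inherit that inconsistency.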

Unfortunately, most datasets are not this fine-grained, since it’s painstaking to consistently grade a large volume of assignments. There’s a very real human reason for this. If a teacher had to grade 200 essays in an 8-hour period, he or she would most likely experience some decision fatigue, which can lead to variations in the grades given.

To the best of their abilities, software engineers try to tweak the AI to account for the conditions under which a teacher was grading (and thus labeling) the data. But this is an imperfect science. There’s a difference between grading 30 essays overnight and doing the same over the course of a weekend. Accuracy is a complex criterion to satisfy.
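One simple version of this kind of tweak—a hypothetical sketch, not how any particular product works—is to normalize each grader’s scores before training, so that a generous grader and a strict grader who rank the same essays the same way produce identical labels:

```python
# Hypothetical adjustment for grader-to-grader variation: convert each
# grader's raw scores into z-scores (deviations from that grader's own
# mean, in units of that grader's own spread) before using them as labels.
from statistics import mean, stdev

scores_by_grader = {
    "grader_a": [70, 75, 80, 85],   # a generous grader
    "grader_b": [55, 60, 65, 70],   # a stricter grader, same essays ranked alike
}

def normalize(scores):
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]

normalized = {g: normalize(s) for g, s in scores_by_grader.items()}
# After normalization, both graders' score lists agree.
```

This only corrects for systematic strictness, not for noise like decision fatigue, which is part of why accuracy remains a complex criterion to satisfy.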

As hard as accuracy is to assess, it’s harder still to evaluate whether the data truly represents the activity the AI will automate (e.g., student writing tasks and the associated responses). This is where alignment with the curriculum is a key factor. Does the essay data that the AI is trained with reflect the kinds of writing assignments you give in class?

As with many other edtech tools, you will want to evaluate the product by having someone you trust (yourself or a colleague) test it, looking not so much for errors as for general utility and value. If you examine the product trying to find bugs, you will almost certainly find some.

How relevant is the AI to my specific teaching materials?

Certain AI-powered edtech tools only work with a narrow set of curriculum materials, prompts or questions, and teachers using these products inherit those limits. For teachers who have great confidence in their existing workflow and materials, such tools can feel constraining or distracting. Often they do not address teachers’ top pain points, and are more nice-to-haves than necessities.

Questions Investors Should Ask

Is all the work done by AI, or is there a human needed to verify the output?

This question isn’t (only) meant to make you sound smart; it will also help you add contour to an otherwise amorphous buzzword. If the entrepreneur answers that the tool is 100% automated, you might ask: would you consider adding a human to the workflow?

This might surprise the entrepreneur who wants to minimize costs. It will, however, ensure a higher quality of data and (hopefully) a competitive advantage. All the better if that human in the loop is also a user of the tool.

If the entrepreneur seems confused, you might want to note that the processes AI aims to automate are complex, involving many edge cases that are hard to represent in training and test data. (If they weren’t so complex, explicit programming—instead of probabilistic AI—would suffice.)

Since AI requires data, the more complex the tool, the higher the cost of maintaining it. Costs for data can range from a few hundred dollars to over $100,000 per year. Most of this goes to paying experts to label the data. If a few thousand dollars of labeled data is enough to fully automate the system, the data doesn’t offer a competitive advantage.

If, however, the AI tool is actively used to augment someone’s job, maintaining and refining the tool is far less risky and costly, since it is learning from the data generated by those using the product.

Including people in the loop will also reinforce the golden rule in education technology: that tools are here to help, not replace, teachers. If, by using an AI product, educators understand that they are also helping their peers, they may be more likely to feel gratitude than resentment, since their job is not directly at stake from this technology. The data is more relevant, customers are happier, and, piece by piece, the product provides more value to users.

What impact does AI have on your business model?

An answer to this question can help you de-risk the investment. Most successful AI companies leverage their users to label data, often without the users realizing it, through the user experience itself. The most inventive of them craft their entire apps around optimizing the labeling experience for their users. Facebook and Google excel at this with photo labeling and message tagging.

In a freemium business model, users can provide training material for the models that drive the product’s AI, provided that the product offers enough useful functionality for free. (If all of its value is behind a paywall, why would anyone use it and provide training data?)

This is of course only one way to guarantee good data flow. But if you’re optimizing for both a sound business model and a competitive AI advantage, you’ll need to see that both cash is flowing through the business and data is flowing through the app. That sounds obvious, but designing the proper inputs for users to give meaningful training data is easier said than done.