Will Teachers Listen to Feedback From AI? Researchers Are Betting on It

By Olina Banerji     Sep 25, 2023

Julie York, a computer science and media teacher at South Portland High School in Maine, was scouring the internet for discussion tools for her class when she found TeachFX. An AI tool that takes recorded audio from a classroom and turns it into data about who talked and for how long, it seemed like a cool way for York to discuss issues of data privacy, consent and bias with her students. But York soon realized that TeachFX was meant for much more.

York found that TeachFX listened to her very carefully, and generated a detailed feedback report on her specific teaching style. York was hooked, in part because she says her school administration simply doesn’t have the time to observe teachers while tending to several other pressing concerns.

“I rarely ever get feedback on my teaching style. This was giving me 100 percent quantifiable data on how many questions I asked and how often I asked them in a 90-minute class,” York says. “It’s not a rubric. It’s a reflection.”

TeachFX is easy to use, York says. It’s as simple as switching on a recording device.

“With other classroom tools, I have to collect the data myself. And the data usually boils down to student grades,” York explains. But TeachFX, she adds, is focused not on her students’ achievements, but instead on her performance as a teacher.

Generative AI has stormed into education. Most of its applications, though, are either geared toward students (better tutoring solutions, for instance), or aimed at making quick, on-the-spot lesson plans for teachers.

Bubbling right under the surface is a key question: Can AI help teachers teach better?

“Teaching is hard. Helping teachers be the best version of themselves takes a huge investment of time and energy, and schools just don't have the resources. So most teachers don’t get the support they deserve,” says Jamie Poskin, the teacher-turned-founder of TeachFX.

Poskin says most teachers know good teaching practices, but need a little revision (or reflection) from time to time. These practices are largely based on giving students more voice in the classroom, so the balance of “talk” between a teacher and their students isn’t heavily skewed toward the former. For instance, teachers may consider replacing one-sided lectures with more group discussion, or they may make sure to ask follow-up questions to students’ answers.

“For student outcomes to change, something has to change about what the teacher is doing in the classroom. That behavior change is very hard,” Poskin says.

Poskin cites anecdotal evidence about teachers who, after using TeachFX, realized they were inadvertently calling on some students to discuss answers more than others. These students often tended to be white and fluent in English.

Poskin, who started TeachFX while still a graduate student, says he wanted to figure out how to help teachers improve their instruction in a scalable way. “When teachers make two recordings, we can already see them asking more open-ended questions in the second one. We’ve been able to create an inexpensive observer effect,” Poskin claims.

These observations generated by AI can take quick effect. Keara Phipps, an elementary school teacher from Atlanta, says that TeachFX showed her she “talked too much” in her classes. With that feedback, Phipps brought down the ratio of teacher-to-student talk to 50:50. “Students should be equal participants in their learning,” says Phipps.
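As a rough illustration of the talk ratio Phipps describes, the sketch below totals speaking time per role from a list of timed segments. The segment format, the `talk_time_share` name and the sample numbers are assumptions made for this example; the article does not describe how TeachFX computes the ratio.

```python
from collections import defaultdict


def talk_time_share(segments):
    """Return each role's share of total talk time as a fraction between 0 and 1.

    `segments` is a list of (role, start_seconds, end_seconds) tuples.
    """
    totals = defaultdict(float)
    for role, start, end in segments:
        # Ignore malformed segments whose end precedes their start.
        totals[role] += max(0.0, end - start)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {role: seconds / grand_total for role, seconds in totals.items()}


# Hypothetical segments, invented for illustration only.
example_segments = [
    ("teacher", 0.0, 30.0),
    ("student", 30.0, 45.0),
    ("teacher", 45.0, 60.0),
    ("student", 60.0, 90.0),
]
shares = talk_time_share(example_segments)
print(shares)
```

In this made-up sample, teacher and student each account for 45 of the 90 seconds, which is the 50:50 split Phipps aimed for.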

Many teachers might be surprised to realize just how much they speak compared to their students.

“We did a study of 100,000 hours of audio of non-TeachFX users. You want to guess how much the average student spoke in one hour of class?” Poskin says. “Seven seconds, per hour.”

TeachFX is the visible front-end of a collective effort that’s using AI to scale effective, quick and completely personalized feedback to teachers. At the Institute of Cognitive Science at the University of Colorado Boulder, Jennifer Jacobs has put raw classroom audio through automated speech recognizers and then natural language processing to generate feedback that tells teachers how many times they followed a “good” classroom practice, like asking their students to give the evidence behind an answer. Her application is called TalkMoves, and a version of Jacobs’ research is now being used by the tutoring company Saga Education to train first-time tutors.

This kind of personalized feedback, made possible by AI, isn’t place- or time-bound, and that’s what makes it scalable, says Yasemin Copur-Gencturk. A researcher at the University of Southern California, she has been working on AI-based professional development for math teachers for several years.

Initially, she claims, there was pushback. “Many did not see the need for this kind of PD,” she says.

Copur-Gencturk persisted, supported in part by a federal grant, to create a tutoring-style platform for teachers, as yet unnamed. It features a talking digital avatar that helps teachers unpack common misconceptions that their students carry in mathematics. “If teachers know how students are going to respond to a learning activity, they can tailor their instruction,” says Copur-Gencturk.

AI-based professional development is gaining traction at a time when a record number of teachers are feeling burned out, underpaid and demoralized about their profession. The makers of these AI tools believe that technology can help stem the tide out of the profession. While tools can’t necessarily replace human coaches or in-depth professional development that districts conduct, they can help teachers take stock, and correct course.

Copur-Gencturk says the frequency and quality of the feedback shouldn’t depend on how rich or poor a school district is. All teachers should have equal access to tools that can improve their teaching. Yet for that to happen, these fledgling tech solutions need to find a way to pay for themselves, or convince early adopters to shell out.

“I wanted to get TeachFX for my entire school. But even for a small cohort of 10 teachers, they were going to charge the school $5,000 per year,” York says — the average cost for a pilot package. That’s much more than a department’s annual budget in her school, says York.

AI tools will also have to reckon with teacher concerns about where all that data about their instruction ends up.

Peeking Into a Black Box

Providing teachers with one-on-one, personal feedback is an ambitious goal. But it’s humanly impossible to bring that level of attention to every teacher’s class. It’s time- and cost-intensive, and potentially intrusive to teachers who don’t want to feel judged for their teaching styles.

“This is why the computational power we have now is exciting. Large language models can analyze classroom discussions at scale. To get more evidence out of a classroom is a precursor to explain everything else, like [understanding] student outcomes,” says Dora Demszky. Demszky is an assistant professor in education data science at the Graduate School of Education at Stanford University, and she’s part of an expanding group of academics feeding classroom audio to large language models to generate automated feedback for teachers.

The audio-to-AI tool works like this: Recordings from a classroom, which include both teacher and student voices, are fed to a large language model. This has been trained, generally, on what “good” teaching practices sound like. For instance, if a teacher asks follow-up questions, or asks students to argue their point, the model is going to pick it up, identify it as an action, and show the teacher how many times they did that action in class. Both Poskin and Demszky say that the data itself doesn’t qualify their instruction style as a good or bad one, but rather offers a neutral report.
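The counting step can be illustrated with a toy version. The sketch below is not how TeachFX or M-Powering Teachers works: it swaps the trained language model for a handful of hypothetical keyword patterns, and the move names, cue phrases and sample transcript are all invented for illustration. It shows only the shape of the output, a tally of how often a teacher used each move.

```python
import re
from collections import Counter

# Hypothetical cue phrases standing in for a trained model's judgments.
MOVE_PATTERNS = {
    "press_for_reasoning": re.compile(
        r"\b(?:why do you think|how do you know|can you explain)\b", re.IGNORECASE
    ),
    "follow_up": re.compile(
        r"\b(?:tell me more|say more|what do you mean)\b", re.IGNORECASE
    ),
}


def count_teacher_moves(utterances):
    """Tally how many teacher utterances match each move pattern.

    `utterances` is a list of (speaker, text) tuples from a transcript.
    """
    counts = Counter()
    for speaker, text in utterances:
        if speaker != "teacher":
            continue
        for move, pattern in MOVE_PATTERNS.items():
            if pattern.search(text):
                counts[move] += 1
    return counts


# Invented transcript, for illustration only.
transcript = [
    ("teacher", "What is three times four?"),
    ("student", "Twelve."),
    ("teacher", "How do you know?"),
    ("student", "I added four three times."),
    ("teacher", "Tell me more about that."),
    ("teacher", "Why do you think adding works here?"),
]
move_counts = count_teacher_moves(transcript)
print(dict(move_counts))
```

The result is a count per move with no score attached, which mirrors the neutral report Poskin and Demszky describe.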

In May, Demszky and her colleague released findings from a study they conducted on more than 1,100 tutors who were teaching a free introductory coding course to about 12,000 students online. The tool they developed, M-Powering Teachers, led the tutors to reduce their own talk time by 5 percent in mentoring conversations, and their “uptake of student contributions” was up by 13 percent. “Uptake” here refers to a teacher revoicing a student’s contribution, elaborating on it or asking a follow-up question, all teaching practices that give students more agency. These shifts, Demszky claims, offer good evidence that teachers can quickly respond to, and incorporate, objective feedback.

Evolving AI technology has made this feedback sharper. Poskin says the TeachFX application can pick out the richest teaching moments — like asking students follow-up questions, and affirming student responses — from classroom audio, and then show teachers how many times they employed these strategies. This feature wasn’t possible to add six months ago.

Jacobs, the researcher from the University of Colorado Boulder, conducted her own study in 2019 for an application that her team developed called TalkMoves. Jacobs has been working on a version of TalkMoves since 2017, thanks to a couple of grants she received from the National Science Foundation. Jacobs gave educators cameras to record videos in their classrooms, and then automated speech recognizers extracted audio, fed it to the natural language processing models and logged the teachers’ speech according to certain “discourse” markers that the model had been trained on. The TalkMoves application was one of the first apps of its kind to include a teacher interface that displays feedback in an accessible manner, claims Jacobs.

When COVID-19 hit during the study, in-person recordings had to stop, but Jacobs says some teachers continued to record their online classes. In the second year, when some of the instruction became hybrid, teachers recorded both online and offline instruction. The dataset shrank from 21 to 12 teachers between the two years, but Jacobs observed an increase in teacher activities, or “moves,” like getting students to relate to each other’s answers, an improvement that researchers attribute to teachers using feedback from TalkMoves. Interestingly, says Jacobs, there wasn’t a significant difference between online and offline recordings when it came to the uptake of “good” talk moves by teachers.

Mandi Macias has personal experience with this kind of evolution. She’s taught fifth grade for 25 years in the Aurora Public School system in Colorado. After teachers there asked for better professional development tools, the principal at Macias’ old school introduced TeachFX. Macias used TeachFX every week last year and claims that she has since changed her whole teaching style from “lecturing” to “asking questions.”

“Students are also doing the heavy lifting with me in class. I’m not satisfied when they just agree or disagree with each other. They can now bring the best evidence for their answers,” Macias says.

Being able to listen to her class recordings — coupled with the TeachFX data dashboard — meant Macias could create a new model of conversational learning for her class. Currently Macias says she doesn’t have access to TeachFX since she switched schools.

Getting Personal With Professional Development

Not all teachers may need or have time to sift through the transcripts generated by TeachFX and similar tools. York, the teacher from South Portland High School, and Macias, the teacher from the Aurora Public School system, both agree that teachers have to put in the work to change once they see the data.

“I’ve been in PD sessions where teachers fall asleep or walk out. Teachers often make the worst students,” says York.

But what’s undeniable about TeachFX’s feedback and Copur-Gencturk’s digital mentorship platform is that all this data is personal. This is why the one-on-one sessions work, says Copur-Gencturk.

Her solution involves a low-voiced AI mentor that pops up on one side of the screen (like a colleague in a Zoom call), and walks teachers through different problem sets. This kind of professional development looks most like what students might go through with an AI assistant. Teachers can either type or voice their responses.

Copur-Gencturk spent two years building the dataset that would eventually train the AI tutor. For this, she had to log every conceivable problem that students might encounter in a math lesson. For instance, students could have challenges moving from simple addition to the multiplicative reasoning that’s needed to study ratios. “Teachers need to know how students are approaching a math problem and what their responses indicate about their understanding. The program helps teachers ask the right questions to find out,” says Copur-Gencturk. The mentoring is punctuated with actual classroom videos that show teachers how these problems are solved.

The system has checks and balances, because the AI doesn’t let teachers move on to the next activity until their response meets the learning goals of the set activity, says Copur-Gencturk. This could feel limiting, except teachers have the option to pause and come back another time. This isn’t possible with in-person professional development.

A screenshot of Copur-Gencturk’s AI-tutoring platform.

Copur-Gencturk wants this AI program to become a part of pre-service teacher training, especially for math. What would be even better is to link student diagnostic tools with the kind of professional development she’s building. That way, says Copur-Gencturk, teachers will know what misconceptions to attack.

The Personal Is Also Private

Both TeachFX and the virtual assistant have common goals: make professional development personalized, safe and easily scalable. If it’s priced competitively — the AI mentor isn’t a commercial product right now — then personal professional development can also be accessible to every teacher.

Teachers, the target of all these innovations, have to be on board. York says she loved working with TeachFX, but when she sent it out to a group of 80 fellow teachers in her district, she got zero sign-ups. “There’s no judgment here. They may not have had the time. But some CS [computer science] teachers just didn’t want to know feedback about their instruction,” says York.

Teachers don’t always want to be recorded because, York claims, the data could become punitive in districts’ hands. Poskin, of TeachFX, asserts that the data the tool collects is only intended for the teachers’ personal use, unless they choose to share it with a mentor or observer.

The issue of data sharing is a sensitive one, says Demszky of Stanford, and rightfully so. Making sure that the classroom data is only shared with the right people is the first step.

Demszky admits there has been a mixed reception from school districts — some are more open to tech innovation than others. “Teachers are already using tons and tons of tools where their data is being shared. It’s happening in many contexts. This is a new context we are trying to share data in,” says Demszky.

Phipps, the teacher from Atlanta, says teachers may find it difficult to take constructive criticism from an app’s feedback. “This isn’t subjective. It’s taking a deeper look at your work. You’re going to have to change something when you look at this data,” Phipps says.

New personalized professional development tools will need their own champions and early adopters. Phipps says she’s open to observers looking at her classroom data, and she already has suggestions for TeachFX: a crossover app with Swivl, a classroom management tool that records teachers as they move around a classroom.

“Then I can see and hear what’s going on. It could spark new seating ideas, for example,” Phipps says.

York says she already had an open-door policy about her teaching style. She teaches a diverse set of students, some of whom are learning English, and she wonders whether TeachFX can evolve to better support them.

“It would be interesting if the app picked up the many languages spoken in class. Or if it picked up students translating for each other,” York says. “How many times is more than one person speaking? How many times are groups talking?”

But York is willing to give it more time before expecting these tools to become perfect.

After all, she says, “We didn’t expect Siri to pick up all our idiosyncrasies from day one.”

Learn more about EdSurge operations, ethics and policies here. Learn more about EdSurge supporters here.
