- An NYU professor turned to artificial intelligence for oral exams after student assignments began to resemble McKinsey memos.
- He uses artificial intelligence agents to scale oral exams, a format long considered too time-consuming.
- The experiment comes as universities re-evaluate how to fairly test students in the age of artificial intelligence.
The homework looked polished. The understanding behind it? Missing.
That’s when a professor at NYU’s business school decided to combat AI-assisted coursework with an AI-run oral exam.
Panos Ipeirotis, a professor who teaches data science at NYU’s Stern School of Business, wrote in a blog post last week that he became concerned that student assignments read like “McKinsey memos” but lacked real understanding.
When he gathered students in class and asked them to defend their opinions, many struggled to do so.
“If you can’t defend your work live, then the written work isn’t measuring what you think it’s measuring,” Ipeirotis wrote.
Fighting fire with fire
To solve the problem, he brought back oral examinations, enlisting artificial intelligence agents to conduct them at scale in an attempt to fight fire with fire.
“We need to evolve assessment into a format that rewards understanding, decision-making and real-time reasoning,” Ipeirotis said.
“Oral exams used to be standard until they became unscalable,” he added. “Now, artificial intelligence is making them scalable again.”
In a blog post detailing the experiment, Ipeirotis said he and his colleagues built the AI examiner using ElevenLabs’ conversational speech technology.
“Just write a prompt that describes what the agent should ask the student, and then you’re done,” he said, adding that setup takes a few minutes.
The oral exam has two parts. First, the AI agent asks students about their capstone projects, probing their decisions and reasoning. It then selects a case discussed in class and asks students to reason through it on the spot.
Over a period of nine days, the system assessed 36 students. Each exam lasted approximately 25 minutes, and the total compute cost for all 36 students was roughly $15. At teaching-assistant rates, manually administered oral exams would cost hundreds of dollars, Ipeirotis wrote.
Ipeirotis also used artificial intelligence to score the exams. Three AI models – Claude, Gemini and ChatGPT – independently evaluated each transcript. They then reviewed one another’s evaluations, revised their scores, and produced a final grade, with Claude acting as “chair” to make the combined decision.
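The two-round council described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not Ipeirotis’s actual code: `query_model` is a hypothetical stand-in for real API calls, and the stubbed scores and mean-based chair decision are assumptions for demonstration.

```python
from statistics import mean

# Hypothetical stand-in for an API call to Claude, Gemini, or ChatGPT.
# Here it returns fixed stub scores so the sketch runs without any API keys.
def query_model(model: str, transcript: str, context: str = "") -> float:
    stub_scores = {"claude": 7.0, "gemini": 8.0, "chatgpt": 7.5}
    return stub_scores[model]

def council_grade(transcript: str) -> float:
    models = ["claude", "gemini", "chatgpt"]
    # Round 1: each model scores the transcript independently.
    first_round = {m: query_model(m, transcript) for m in models}
    # Round 2: each model sees its peers' scores and may revise its own.
    revised = {
        m: query_model(m, transcript, context=f"peer scores: {first_round}")
        for m in models
    }
    # The "chair" issues the combined decision; this sketch approximates
    # that step with the mean of the revised scores.
    return round(mean(revised.values()), 2)

print(council_grade("Student: I chose a random forest because..."))
```

With the stubbed scores above, the council settles on the average of the three revised grades; in a real deployment, the chair model would weigh the peer reviews rather than simply averaging them.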
Ipeirotis said the LLM council’s scores were more consistent than human grading and “tougher, but also fairer.”
“The feedback was better than any human-generated feedback,” he wrote, adding that the AI analysis also exposed gaps in how the material was taught.
Students, however, were divided. Only a small minority preferred the AI oral exams, and many found them more stressful than written exams, though they acknowledged the format was a better measure of true understanding.
Still, Ipeirotis said the oral exam showed “how learning should happen.”
“The more you practice, the better you will get,” Ipeirotis wrote.
Using artificial intelligence in exams
Ipeirotis’ blog post comes as universities grapple with how to test students in the age of artificial intelligence.
A paper published in September in the academic journal Assessment and Evaluation in Higher Education claimed that artificial intelligence has turned student assessment into a “wicked problem.”
The study’s authors interviewed 20 unit chairs at a large Australian university in late 2024. In hour-long Zoom interviews, they found teachers overwhelmed by heavy workloads, confused about how AI should be used, and lacking consensus on what valid assessment should look like in the age of AI.
Some teachers told the researchers that AI should be treated as a tool at students’ disposal. Others saw it as academic dishonesty that erodes learning. Many admitted they were unsure how to proceed.
In May, LinkedIn co-founder Reid Hoffman said on an episode of his podcast “Possible” that artificial intelligence could make it easier for students to game traditional assessment formats, such as essays. Universities should rethink how they assess learning, he said, adding that students could soon expect an “artificial intelligence examiner.”
Oral exams leave no room for shortcuts, Hoffman said, because they require students to demonstrate genuine understanding.
Read the original article on Business Insider