One Class Used AI, One Didn’t. Their Exam Scores Were the Same
As educators confront the rapid advancement of artificial intelligence and its role in the classroom, a semester-long experiment at the University of Massachusetts Amherst found that structured use of generative AI improved student engagement and confidence but did not raise exam scores.
Christian Rojas, professor and chair in the Department of Resource Economics, led the controlled study in two back-to-back sections of an upper-division antitrust economics course with the same assignments, lectures and exams. One section of 29 students was encouraged to use AI tools such as ChatGPT with structured guidance and disclosure requirements, while a second section of 28 students was barred from using AI and received parallel non-AI study support.
Rojas and co-authors Rong Rong, associate professor of resource economics, and Luke Bloomfield, senior lecturer of resource economics, found no measurable effect of AI on exam scores or final grades, but the AI-permitted class consistently reported higher satisfaction. Students with access to AI participated more in real-time classroom activities, concentrated their AI use into longer study sessions and developed more reflective learning habits, such as editing AI outputs, catching mistakes and preferring their own answers over the machine’s.
“It’s not that AI helped students learn more—it helped them learn more efficiently and confidently,” says Rojas, who taught both sections of the course. “They spent less time outside the classroom on homework and exam preparation.”
Standardized course evaluations completed at the end of the semester were also significantly higher in the AI section, particularly in ratings of instructor preparation and use of class time. In addition, students with AI access were far more likely to say they intended to pursue careers that involve intensive use of AI.
Rojas says the results should reassure educators that integrating AI into coursework can be accomplished without sacrificing academic rigor. He suggests a “permit with scaffolding” approach, where students are provided explicit instructions about effective uses of AI and clear disclosure requirements.
“There’s an opportunity for instructors to be more open about AI usage. Letting students engage with it just creates a different environment,” Rojas notes. “It’s been super impactful for me and the way I think about teaching.”
The experiment intentionally assigned AI permission to an afternoon section that historically performs slightly worse, biasing the design against finding outsized AI effects and making it a conservative test. Students in both sections took paper-and-pencil exams in which no notes, AI or other technology were permitted.
Although the study spanned a full semester, Rojas cautions that many of the outcomes were self-reported by students and that the experiment involved a small sample size.
The study is detailed in a new working paper, “Allowing Generative AI in Class: Evidence from a Semester-Long Controlled Teaching Study,” which has been submitted for publication in a peer-reviewed journal.