A real-world test of artificial intelligence infiltration of a university examinations system: a “Turing test” case study
Peter Scarfe, Associate Professor, Vision and Haptics Laboratory, University of Reading


Date: Wednesday, 22.11.2023 15:20-17:00 CET

Signup: If you would like to attend the talks, please register here to get a Zoom link.

Abstract:

Generative artificial intelligence models have taken a major leap forward over the past year or so. The poster child for this is ChatGPT, a large language model designed to understand and generate text in natural conversation. There have been anecdotal and experimental reports of academics running assessment questions through ChatGPT, with it achieving excellent grades. If such use goes undetected, it poses a fundamental threat to the educational sector. We report a rigorous, blind study in which we injected 100% AI-written submissions into the examinations system in five undergraduate modules, across all years of study, for a BSc degree in Psychology. In this naturalistic setting, we found that 94% of AI submissions were undetected. The grades awarded to AI submissions were on average half a grade boundary higher than those achieved by real students, and across modules there was an 83.4% chance that the AI submissions on a module would outperform a random selection of the same number of real student submissions. We discuss our findings in terms of how the educational sector will have to adapt to a “new normal”, which inevitably will have to include AI.