Investigating Universal Adversarial Attacks Against Transformers-Based Automatic Essay Scoring Systems

Silveira, Igor Cataneo; Barbosa, Andre; da Costa, Daniel Silva Lopes; Maua, Denis Deratani

Author(s):	Silveira, Igor Cataneo; Barbosa, Andre; da Costa, Daniel Silva Lopes; Maua, Denis Deratani
Total Authors:	4
Document type:	Journal article
Source:	INTELLIGENT SYSTEMS, BRACIS 2024, PT II; v. 15413, 15 pp., 2025-01-01.
Abstract
Automatic Essay Scoring promises to scale up student feedback on written input, addressing the excessive cost and time demands of human grading. State-of-the-art automatic scorers are based on Transformer-based neural networks. While such models have shown impressive results in reasoning tasks, learned models often produce answers that arise from statistical cues in datasets and are misaligned with human objectives. Such systems are thus potentially fragile in scenarios where users are incentivized to deceive the system, as in a classroom setting. In this work, we evaluate the susceptibility of state-of-the-art automatic scorers to attacks made by non-expert users, such as students interacting with an automatic grader. We develop a methodology to simulate such student attacks and test them against scorers based on BERT, Phi-3 and Gemini models. Our findings suggest that (i) a BERT-based grader can be deceived using simple feature-based attacks; (ii) although Google's Gemini shows solid agreement with human graders, it can assign undeservedly high grades to short sentences; (iii) a Phi-3-based grader was less susceptible than BERT, but it still assigned relatively high grades to some of our attacks.

FAPESP's process:	19/07665-4 - Center for Artificial Intelligence
Grantee:	Fabio Gagliardi Cozman
Support Opportunities:	Research Grants - Research Program in eScience and Data Science - Research Centers in Engineering Program


FAPESP's process:	22/02937-9 - Neural inductive logic programming
Grantee:	Denis Deratani Mauá
Support Opportunities:	Research Grants - Initial Project
