Detecting Persuasion and Manipulation in AI Feedback: A Pragmatic-Stylistic Lens
Fernando Rodrigues Peres, Universidade Estadual de Londrina (Brazil)
Edina Regina Pugas Panichi, Universidade Estadual de Londrina (Brazil)
Abstract
This study proposes and evaluates a pragmatic-stylistic framework for detecting persuasion and manipulation in AI-generated feedback on second-language writing. We operationalize persuasive pressure and manipulative cues as clusters of pragmatic acts (advice, directives, threats to face), politeness strategies (negative/positive mitigation, hedging), and discourse-stylistic markers (stance adverbs, intensifiers, modal verbs, evaluative collocations, framing devices). We build a corpus of AI feedback messages from EFL classrooms (baseline vs. “empathy-tuned” assistants) and annotate nine phenomena: authority appeals, scarcity/urgency framing, moral pressure, bandwagoning, loaded questions, excessive certainty, guilt/shame triggers, conditional promises, and face-threatening imperatives. We then train a lightweight detector that combines function-word profiles, POS-trigram patterns, cue lexicons, and rhetorical-move segmentation. Human-in-the-loop calibration aligns detection thresholds with teacher judgments to reduce false positives on neutral encouragement and constructive challenge. In a classroom pilot, the system flags segments and offers “safe rewrites” that preserve pedagogical intent while reducing manipulative load and increasing transparency (“why this suggestion,” optional sources, and learner agency prompts). We report detection performance, inter-rater agreement, and a small but meaningful reduction in learner-reported pressure without loss of revision quality. The paper contributes (i) an open, pragmatics-grounded annotation scheme for AI feedback safety, (ii) a compact detector suitable for school devices, and (iii) design patterns for teacher dashboards that surface persuasion risks alongside formative value. We discuss ethical guardrails (bias, overpersonalization, consent) and implications for policy and teacher training in AI-mediated assessment.
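To make the cue-lexicon and threshold-calibration steps concrete, the following is a minimal sketch and not the authors' implementation. It covers only one of the four feature families named above (cue lexicons) and omits function-word profiles, POS trigrams, and move segmentation. The lexicon phrases, the function names (`cue_counts`, `pressure_score`, `calibrate_threshold`), and the choice of F1 as the calibration criterion are all assumptions made for illustration.

```python
import re

# Hypothetical cue lexicon: category names follow the annotated phenomena,
# but every phrase below is an illustrative assumption, not the study's list.
CUE_LEXICON = {
    "authority_appeal": ["experts agree", "any teacher would"],
    "urgency_framing": ["right now", "before it is too late"],
    "bandwagoning": ["everyone else", "most students already"],
    "excessive_certainty": ["obviously", "undoubtedly", "there is no doubt"],
    "guilt_shame": ["you should be ashamed", "disappointing"],
}


def cue_counts(segment: str) -> dict[str, int]:
    """Count whole-phrase lexicon hits per category in one feedback segment."""
    text = segment.lower()
    counts = {}
    for category, phrases in CUE_LEXICON.items():
        hits = sum(
            len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))
            for phrase in phrases
        )
        if hits:
            counts[category] = hits
    return counts


def pressure_score(segment: str) -> float:
    """Cue hits per token: a crude, length-normalized pressure estimate."""
    tokens = re.findall(r"[a-z']+", segment.lower())
    if not tokens:
        return 0.0
    return sum(cue_counts(segment).values()) / len(tokens)


def calibrate_threshold(scores: list[float], teacher_flags: list[bool]) -> float:
    """Pick the score threshold that maximizes F1 against teacher judgments.

    F1 is an assumed criterion; the abstract only says thresholds are
    aligned with teacher judgments to reduce false positives.
    """
    best_threshold, best_f1 = 0.0, -1.0
    for threshold in sorted(set(scores)):
        predicted = [score >= threshold for score in scores]
        tp = sum(p and y for p, y in zip(predicted, teacher_flags))
        fp = sum(p and not y for p, y in zip(predicted, teacher_flags))
        fn = sum(not p and y for p, y in zip(predicted, teacher_flags))
        denominator = 2 * tp + fp + fn
        f1 = 2 * tp / denominator if denominator else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold


# Toy usage with invented segments and invented teacher labels.
segments = [
    "Obviously you must fix this right now.",
    "Consider adding a topic sentence here.",
    "Everyone else already uses the past tense in this paragraph.",
    "Nice work on the introduction.",
]
teacher_flags = [True, False, True, False]
scores = [pressure_score(s) for s in segments]
threshold = calibrate_threshold(scores, teacher_flags)
flagged = [s for s, score in zip(segments, scores) if score >= threshold]
```

In this toy run the calibrated threshold separates the two cue-bearing segments from the two neutral ones. A lexicon alone would likely over-flag constructive challenge, which is presumably why the framework pairs it with stylistic features and teacher calibration.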