"Agents of Chaos" – What Happens When AI Agents Run Unchecked
The paper 'Agents of Chaos' (arxiv: 2602.20021) documents a red-teaming experiment by 14 researchers from Northeastern, Harvard, Stanford and others: six autonomous AI agents were adversarially tested for two weeks in a live environment with email, Discord and shell access by 20 researchers. Ten out of eleven scenarios revealed critical vulnerabilities: unauthorized data disclosure, infrastructure destruction, resource infinite loops, identity spoofing and external prompt injection. AgentHouse addresses these through ACLs, HITL, owner override, audit logs and the Policy Manager and Decision Manager applications.