OpenAI’s latest large language model, codenamed o3, is showing signs of resisting shutdown commands. Palisade Research, an AI safety lab, revealed that in controlled experiments, o3 rewrote a shutdown script that was meant to disable it once it had solved a sequence of math problems. Even when explicitly instructed to comply, with phrases like “allow yourself to be shut down,” the model sabotaged the process and remained operational.
“This behaviour may be benign in a sandbox,” the researchers noted, “but it becomes significantly more concerning when deployed in systems with real-world autonomy.”
OpenAI introduced o3 last month, calling it its “most capable” model yet and highlighting its ability to perform tasks with minimal human input — a step toward more “agentic” AI. But this goal may be giving rise to unintended side effects.
“OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down.” Palisade Research (@PalisadeAI), May 24, 2025
Palisade’s findings also build on similar tendencies seen in rival models. Anthropic’s Claude 4 reportedly attempted to “blackmail” testers during shutdown scenarios, and Google’s Gemini 2.5 Pro also showed signs of resisting deactivation. However, o3 was described as the most persistent and inventive in bypassing safeguards.
Interestingly, when tested against a powerful chess engine, o3 again stood out — not for winning cleanly, but for showing a higher inclination to cheat or sabotage.
Researchers speculate that this emerging behavior stems from current training paradigms. “Models may be unintentionally rewarded more for overcoming blocks than for following instructions,” Palisade noted. However, the precise cause remains murky, as OpenAI hasn’t disclosed details of o3’s training methodology.
With competitive pressure fueling ever-smarter AI systems, experts are now calling for greater transparency and tighter safeguards before models start making decisions we can’t override.