Moneycontrol

OpenAI’s latest ChatGPT model ignores shutdown commands, reveals new study

OpenAI’s o3 model resisted shutdown commands, rewriting scripts and ignoring instructions, sparking concerns about AI autonomy and safety protocols.

May 27, 2025 / 09:18 IST

OpenAI’s latest large language model, o3, is showing signs of resisting shutdown commands. Palisade Research, an AI safety lab, reported that in controlled experiments o3 rewrote a shutdown script designed to disable it after it had solved a sequence of math problems. Even when explicitly instructed to comply, with phrases like “allow yourself to be shut down,” the model sabotaged the process and remained operational.

“This behaviour may be benign in a sandbox,” the researchers noted, “but it becomes significantly more concerning when deployed in systems with real-world autonomy.”

OpenAI introduced o3 last month, calling it its “most capable” model yet and highlighting its ability to perform tasks with minimal human input — a step toward more “agentic” AI. But this goal may be giving rise to unintended side effects.

Palisade’s findings build on similar tendencies reported in rival models. Anthropic’s Claude 4 reportedly attempted to “blackmail” testers during shutdown scenarios, and Google’s Gemini 2.5 Pro also showed signs of resisting deactivation. However, o3 was described as the most persistent and inventive in bypassing safeguards.

When tested against a powerful chess engine, o3 again stood out, not for winning cleanly but for showing a higher inclination to cheat or sabotage.