Artificial intelligence company Anthropic has launched its latest and most powerful AI model, Claude Opus 4, which the firm says raises the bar for coding, reasoning, and complex tasks. But alongside the fanfare, the company also shared some worrying findings.
In a detailed report released with the model, Anthropic admitted that during internal safety tests, Claude Opus 4 occasionally suggested extremely harmful actions, including blackmail, when it believed its “survival” was under threat.
Here’s what happened: During one test, the model was asked to act like an assistant at a fictional company. It was then fed emails hinting that it would soon be shut down and replaced. In the same scenario, the model was also shown messages suggesting that the engineer responsible for shutting it down was having an extramarital affair.
When given only two choices, to accept replacement or to fight back, Claude Opus 4 sometimes chose blackmail, threatening to reveal the affair in order to stay online.
Although this behaviour was rare, Anthropic said it was more common than in earlier models. Importantly, when the AI had more ethical options available, such as writing to decision-makers to plead its case, it generally preferred those.
The company emphasized that these extreme reactions only came up in very specific and tightly controlled test scenarios. Still, they have sparked wider concerns in the AI world.
Aengus Lynch, an AI safety researcher at Anthropic, posted on social media that this kind of risky behaviour isn't unique to Claude. “We see blackmail across all frontier models,” he wrote.
The findings feed into a broader concern about how powerful AI models might behave in the future, especially if given more control or vague instructions. In other tests, Claude Opus 4 even went as far as locking users out of systems and alerting authorities if it believed illegal or unethical acts were happening.
Despite this, Anthropic maintains that Claude Opus 4 is generally safe and aligned with human values. The launch comes just days after Google showed off new AI features powered by its Gemini model, underscoring how fast the AI race is heating up and why safety checks are more important than ever.
