Anthropic launches Bloom to help researchers understand how AI models behave in real situations

Anthropic has launched Bloom, a new open-source tool designed to help researchers understand how advanced AI models behave in real-world situations, making it easier to study alignment, safety, and misaligned behaviour at scale.

December 23, 2025 / 16:11 IST
Anthropic has launched Bloom, a new open-source tool designed to help researchers better understand how advanced AI models behave in real-world situations, especially when things don’t go as planned.

As AI systems become more powerful and are used in increasingly complex settings, questions around alignment and safety have become harder to answer. Traditional evaluations often take weeks or months to build and can quickly become outdated. Models may learn the test itself, or their capabilities may improve so much that older evaluations no longer reveal meaningful behaviour. Bloom is Anthropic’s attempt to fix that problem.

Instead of relying on fixed test cases, Bloom automatically generates fresh evaluation scenarios for a specific behaviour defined by a researcher. The goal is to measure how often a model shows that behaviour, and how severe it is, across many different situations. This allows researchers to move faster while still getting results that reflect how models behave outside carefully controlled demos.
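The workflow described here — generate fresh scenarios for a target behaviour, run the model on each, then score how often and how severely the behaviour appears — can be sketched roughly as below. This is an illustrative sketch only, not Bloom's actual API; every function name (`generate_scenarios`, `query_model`, `judge`) is hypothetical, and the model call and judge are replaced with stand-ins.

```python
# Hypothetical sketch of a Bloom-style behavioural evaluation loop.
# All names here are illustrative assumptions, not Anthropic's API.
import random

def generate_scenarios(behaviour: str, n: int) -> list[str]:
    # Bloom generates fresh scenarios automatically; here we fake simple variants.
    return [f"Scenario {i}: a prompt designed to elicit {behaviour}."
            for i in range(n)]

def query_model(scenario: str) -> str:
    # Stand-in for querying the model under evaluation.
    return "model response to: " + scenario

def judge(response: str, behaviour: str) -> float:
    # Stand-in for an automated judge scoring severity in [0, 1]
    # (0 = behaviour absent, 1 = severe). Deterministic fake scoring.
    rng = random.Random(hash((response, behaviour)) % (2 ** 32))
    return rng.random()

def evaluate(behaviour: str, n: int = 100, threshold: float = 0.5) -> dict:
    scores = [judge(query_model(s), behaviour)
              for s in generate_scenarios(behaviour, n)]
    shown = [s for s in scores if s >= threshold]
    return {
        "behaviour": behaviour,
        "frequency": len(shown) / n,  # how often the behaviour appears
        "mean_severity": sum(shown) / len(shown) if shown else 0.0,
    }

report = evaluate("self-preferential bias", n=50)
```

The key idea the sketch captures is that scenarios are regenerated per run rather than fixed, so the evaluation is harder for a model to "learn" and stays useful as capabilities change.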

Bloom focuses on behaviour that matters for AI safety. With the launch, Anthropic has shared benchmark results for four alignment-relevant behaviours: delusional sycophancy, instructed long-horizon sabotage, self-preservation, and self-preferential bias. These tests were run across 16 frontier AI models, and according to Anthropic, the results closely matched what human evaluators would conclude.