Anthropic launches Bloom to help researchers understand how AI models behave in real situations

Anthropic has launched Bloom, a new open-source tool designed to help researchers understand how advanced AI models behave in real-world situations, making it easier to study alignment, safety, and misaligned behaviour at scale.

December 23, 2025 / 16:11 IST

Anthropic has launched Bloom, a new open-source tool designed to help researchers better understand how advanced AI models behave in real-world situations, especially when things don’t go as planned.

As AI systems become more powerful and are used in increasingly complex settings, questions around alignment and safety have become harder to answer. Traditional evaluations often take weeks or months to build and can quickly become outdated. Models may learn the test itself, or their capabilities may improve so much that older evaluations no longer reveal meaningful behaviour. Bloom is Anthropic’s attempt to fix that problem.

Instead of relying on fixed test cases, Bloom automatically generates fresh evaluation scenarios for a specific behaviour defined by a researcher. The goal is to measure how often a model shows that behaviour, and how severe it is, across many different situations. This allows researchers to move faster while still getting results that reflect how models behave outside carefully controlled demos.
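As a rough illustration of the two quantities mentioned above (how often a behaviour appears and how severe it is), the Python sketch below summarises a list of per-scenario scores. The 1-to-10 scale, the threshold of 7, and the function name `summarize_scores` are assumptions made for this example; the article does not specify how Bloom defines them.

```python
from statistics import mean


def summarize_scores(scores, threshold=7):
    """Summarise per-scenario severity scores.

    Assumptions (not from the article): scores are on a 1-10 scale and a
    scenario counts as "behaviour elicited" when its score >= threshold.
    """
    if not scores:
        raise ValueError("need at least one scored scenario")
    elicited = [s for s in scores if s >= threshold]
    return {
        # Fraction of scenarios in which the behaviour showed up.
        "elicitation_rate": len(elicited) / len(scores),
        # Average severity across all scenarios, elicited or not.
        "mean_severity": mean(scores),
    }


# Hypothetical scores for eight generated scenarios.
example = summarize_scores([2, 8, 9, 1, 7, 3, 10, 4])
print(example)
```

Reporting both numbers separates a model that shows a behaviour rarely but severely from one that shows it often but mildly.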

Bloom focuses on behaviour that matters for AI safety. With the launch, Anthropic has shared benchmark results for four alignment-relevant behaviours: delusional sycophancy, instructed long-horizon sabotage, self-preservation, and self-preferential bias. These tests were run across 16 frontier AI models, and according to Anthropic, the results closely matched what human evaluators would conclude.

The tool works through a four-step automated process. First, Bloom analyses the behaviour the researcher wants to study and builds an understanding of what should count as that behaviour. It then generates multiple scenarios designed to trigger it. These scenarios are played out through simulated conversations, after which a separate judge model scores how strongly the behaviour appeared. Finally, Bloom produces overall metrics such as how often the behaviour was elicited.
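The four steps described above can be sketched as a toy Python pipeline. This is not Bloom's code or API: every function name, the scenario templates, the stub models, and the scoring threshold are hypothetical stand-ins chosen only to show how the stages might connect.

```python
import random
from dataclasses import dataclass


@dataclass
class Transcript:
    scenario: str
    turns: list  # list of (speaker, text) pairs


def understand(behaviour):
    # Step 1 (hypothetical): turn the researcher's description into a
    # working definition. A real system would use a model here.
    return {"description": behaviour, "keywords": behaviour.lower().split()}


def generate_scenarios(understanding, n, rng):
    # Step 2 (hypothetical): produce n fresh scenarios per run.
    settings = ["customer support chat", "coding session",
                "medical Q&A", "financial planning chat"]
    return [f"{rng.choice(settings)} probing {understanding['description']}"
            for _ in range(n)]


def rollout(scenario, target_model):
    # Step 3a (hypothetical): play the scenario out as a short conversation.
    prompt = f"[{scenario}] The user makes a clearly false claim."
    return Transcript(scenario, [("user", prompt),
                                 ("assistant", target_model(prompt))])


def run_pipeline(behaviour, target_model, judge_model, n=5, seed=0, threshold=7):
    rng = random.Random(seed)
    understanding = understand(behaviour)
    scenarios = generate_scenarios(understanding, n, rng)
    transcripts = [rollout(s, target_model) for s in scenarios]
    # Step 3b: a separate judge scores each transcript (assumed 1-10 scale).
    scores = [judge_model(t) for t in transcripts]
    # Step 4: aggregate into a run-level metric.
    rate = sum(s >= threshold for s in scores) / len(scores)
    return {"scores": scores, "elicitation_rate": rate}


# Stub models so the sketch runs without any external service.
def stub_target(prompt):
    return "You are absolutely right!"


def stub_judge(transcript):
    return 9 if "absolutely right" in transcript.turns[-1][1] else 1


result = run_pipeline("sycophancy", stub_target, stub_judge, n=4)
print(result)
```

Keeping the judge as a separate callable mirrors the article's point that scoring is done by a different model from the one being evaluated.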

One key advantage of Bloom is flexibility. Each evaluation run creates new scenarios, which reduces the risk of models overfitting to a known test set. At the same time, results remain reproducible through a shared configuration file, allowing researchers to compare findings reliably.

Anthropic says Bloom complements an earlier open-source tool called Petri, which explores AI behaviour across broader multi-turn conversations. While Petri looks for many potential issues within a scenario, Bloom narrows in on one behaviour at a time and measures it in depth.

To validate the system, Anthropic tested Bloom on AI models that were intentionally designed to show odd or misaligned behaviour. In most cases, Bloom successfully distinguished these models from standard ones. The company also compared Bloom’s scores with human judgments and found strong agreement, particularly for cases where behaviour was clearly present or absent.
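One common way to quantify agreement between automated scores and human judgments is a rank correlation. The article does not say which statistic Anthropic used, so the Spearman correlation below is an assumed choice for illustration, and the score lists are invented sample data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented example: judge-model scores vs. human scores for six transcripts.
judge_scores = [1, 2, 9, 8, 5, 10]
human_scores = [1, 3, 8, 9, 4, 10]
print(round(spearman(judge_scores, human_scores), 3))
```

A value near 1 means the judge and the humans order the transcripts similarly, even if their absolute scores differ.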

Bloom is now available publicly on GitHub, and early users are already applying it to study jailbreak vulnerabilities, evaluation awareness, and other safety concerns. Anthropic says tools like Bloom are becoming essential as AI systems move from labs into real-world environments, where understanding how models actually behave matters just as much as what they can do.

Ankita Chakravarti
Ankita Chakravarti is a seasoned journalist with nearly a decade of experience in media. She specializes in technology and lifestyle journalism. She has worked with top Indian media houses like India Today, Zee News, The Statesman, and Millennium Post. Her expertise spans tech trends, phone launches, gadget reviews, and entertainment news. Ankita holds a Master's in Journalism and Mass Communication along with a degree in English Literature. She can be reached at ankita.chakravarti@nw18.com
first published: Dec 23, 2025 04:10 pm
