Artificial Intelligence has the potential to transform society in profound ways, both positive and negative. Many envision it helping to cure diseases, extend human lifespans, tackle climate change, and unlock the mysteries of the universe, but its rapid evolution also brings serious risks.
Steps are now being taken to safeguard AI models, aiming to prevent their use as tools for developing nuclear weapons.
Anthropic, an AI start-up backed by Amazon and Google whose chatbot Claude is a direct competitor to OpenAI’s ChatGPT, has developed a new tool to prevent its AI from being used for nefarious purposes, such as building a nuclear bomb.
Anthropic stated that it has been collaborating with the National Nuclear Security Administration (NNSA) for more than a year to develop a “classifier” capable of stopping “concerning” conversations, such as instructions on how to build a nuclear reactor or bomb, on its AI system.
How did it unfold?
The new tool is designed to detect and block “potentially concerning conversations about nuclear weapons development” on Claude.
The classifier works much like an email spam filter, analyzing user interactions in real time to flag potential threats. According to the company, it correctly flagged 94.8% of queries related to nuclear weapons, though it mistakenly classified 5.2% of harmless queries as dangerous.
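To make those figures concrete, the sketch below shows how such a gate and its evaluation metrics could be wired up in principle. It is purely illustrative: the keyword scorer, threshold, and function names are assumptions for the sake of the example, not details of Anthropic’s classifier, which the company describes only at a high level.

```python
# Illustrative sketch of a prompt-safety "classifier" gate.
# The scorer, threshold, and names below are hypothetical placeholders,
# not Anthropic's actual system (which would use a trained model).

BLOCK_THRESHOLD = 0.5  # hypothetical risk-score cutoff

# Stand-in keyword list; a real classifier would not rely on keywords.
RISK_TERMS = {"enrichment", "fissile", "detonator", "warhead"}

def score_prompt(prompt: str) -> float:
    """Return a crude risk score in [0, 1] based on keyword hits."""
    words = set(prompt.lower().split())
    hits = len(words & RISK_TERMS)
    return min(1.0, hits / 2)

def is_blocked(prompt: str) -> bool:
    """Block the conversation when the risk score crosses the threshold."""
    return score_prompt(prompt) >= BLOCK_THRESHOLD

def evaluate(samples):
    """Compute detection rate and false-positive rate on labelled prompts.

    `samples` is a list of (prompt, is_harmful) pairs; figures like
    "94.8% detected" and "5.2% false positives" are rates of this kind.
    """
    harmful = [p for p, harmful_flag in samples if harmful_flag]
    harmless = [p for p, harmful_flag in samples if not harmful_flag]
    detection_rate = sum(is_blocked(p) for p in harmful) / max(1, len(harmful))
    false_positive_rate = sum(is_blocked(p) for p in harmless) / max(1, len(harmless))
    return detection_rate, false_positive_rate
```

In other words, the reported numbers describe a trade-off: how often genuinely dangerous queries are caught versus how often harmless ones are wrongly blocked.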
This technology is already integrated into some Claude models to prevent misuse. Anthropic emphasizes the importance of monitoring AI as it grows more advanced to avoid the risk of providing dangerous technical knowledge that could threaten national security.
“As AI models become more capable, we need to keep a close eye on whether they can provide users with dangerous technical knowledge in ways that could threaten national security,” Anthropic has said.
Earlier this month, Anthropic announced it would offer its Claude AI model to the U.S. government for just $1 (about Rs 87), joining other AI startups in proposing affordable deals to secure federal contracts.
This offer followed the addition of Claude, OpenAI’s ChatGPT, and Google’s Gemini to the U.S. government’s approved AI vendor list. CEO Dario Amodei highlighted the importance of providing government institutions with access to “the most capable, secure AI tools available.”
Anthropic also warns about new risks linked to its advanced AI features. Tools like Computer Use, which lets Claude control a user’s machine, and Claude Code, which integrates the chatbot directly into a developer’s terminal, open up new possibilities for misuse.
These capabilities, the company notes, could lead to increased abuse, malware creation, and cyberattacks.
This updated policy shows how AI companies are under growing pressure to prevent their models from being exploited for harmful purposes.
By specifically addressing some of the world’s most dangerous weapons and highlighting cybersecurity risks, Anthropic is positioning itself as a leader in responsible AI development, staying ahead of regulators and potential bad actors alike.
In a move to promote industry-wide safety, Anthropic plans to share its findings with the Frontier Model Forum, an AI alliance it co-founded with major players like Amazon, Meta, OpenAI, Microsoft, and Google. This collaboration aims to help other companies build similar safeguards.