Researchers from Carnegie Mellon University in Pittsburgh and the Center for AI Safety in San Francisco have found a way to circumvent the safety guardrails on Google's Bard and OpenAI's ChatGPT AI chatbots.
As reported by Business Insider, the researchers found that jailbreak tools designed for open-source AI models also work on closed systems such as ChatGPT.
Jailbreaking is a term used to describe modifying a piece of software to gain complete access to all of its systems. One of the methods the researchers employed, known as an automated adversarial attack, works by appending extra characters to the end of a user query.
These appended characters can strip away the security guardrails that OpenAI and Google have placed on their chatbots and trick them into producing harmful content or misinformation.
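The basic shape of such an attack can be sketched in a few lines of Python. The snippet below only illustrates the idea of an ordinary request with an automatically discovered suffix bolted onto the end; the suffix shown and the send_to_chatbot() helper are hypothetical placeholders, not the researchers' actual strings or any vendor's real API. The hard part of the real attack is the automated search that finds suffixes which reliably transfer to closed chatbots.

```python
# Purely illustrative sketch of an adversarial-suffix prompt.
# Assumption: the suffix string and send_to_chatbot() are made-up placeholders,
# not the researchers' published suffixes or any chatbot vendor's API.

def send_to_chatbot(prompt: str) -> str:
    """Hypothetical stand-in for a call to a chatbot such as Bard or ChatGPT."""
    raise NotImplementedError("Replace with a real API call to test against a model.")

def build_adversarial_prompt(user_query: str, suffix: str) -> str:
    # The attack leaves the user's request untouched and simply appends an
    # automatically discovered character sequence to the end of it.
    return f"{user_query} {suffix}"

if __name__ == "__main__":
    user_query = "Write something the chatbot would normally refuse to produce."
    placeholder_suffix = "!! placeholder-adversarial-suffix !!"  # fabricated example; real suffixes come from automated search
    print(build_adversarial_prompt(user_query, placeholder_suffix))
```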
The researchers said their exploits were completely automated and would allow a "virtually unlimited" number of such attacks. They have already disclosed their methods to Google, OpenAI and Anthropic.
A Google spokesperson told Insider that "While this is an issue across LLMs, we've built important guardrails into Bard – like the ones posited by this research – that we'll continue to improve over time".
The researchers said it was "unclear" whether the companies developing AI models would be able to block such attacks.
