Beyond Q-Star: PPO, the hidden gem in OpenAI's arsenal

Proximal Policy Optimization, or PPO, helps train computer models to make decisions in complicated or simulated situations

November 24, 2023 / 13:42 IST
OpenAI uses PPO in different situations, like teaching computer programmes in simulated environments or getting better at challenging games.

The recent boardroom upheaval at ChatGPT creator OpenAI, which sent ripples through the tech industry and beyond, seems to have stemmed from a letter sent by some researchers before the shock ouster of co-founder and CEO Sam Altman.

The letter raised concerns about a potential AI breakthrough that could pose a risk to humanity. The decision to remove Altman, who returned as CEO within five days of his sacking, was reportedly influenced by this letter.

OpenAI is said to be working on a project called Q* (pronounced Q-Star), which has the capability to solve unfamiliar math problems. It is believed that Q* could mark a significant advancement in OpenAI's pursuit of artificial general intelligence (AGI), which is defined as autonomous systems outperforming humans in economically significant tasks.

Given enormous computational resources, Q* was able to solve some math problems.