
OpenAI has introduced GPT-5.4, describing it as the company’s most capable and efficient frontier model built for professional and enterprise tasks.
The new release expands the GPT-5 series with multiple variants. Alongside the standard model, OpenAI is offering GPT-5.4 Thinking, a reasoning-focused version designed for complex problem solving, and GPT-5.4 Pro, which prioritises higher performance.
A key highlight is the model’s massive context window. The API version supports up to one million tokens, allowing developers to process far larger documents and datasets in a single request than previous OpenAI models.
OpenAI says the new model is also significantly more token-efficient. According to the company, GPT-5.4 can solve similar problems using fewer tokens than GPT-5.2, potentially reducing both latency and cost for developers.
Benchmark results show major gains across several tests. GPT-5.4 recorded top scores in computer-use benchmarks OSWorld-Verified and WebArena Verified, while achieving 83 percent on OpenAI’s GDPval evaluation, which measures performance on knowledge-work tasks.
The model also led the APEX-Agents benchmark from Mercor, which evaluates professional skills such as legal reasoning and financial analysis.
Mercor CEO Brendan Foody said GPT-5.4 performed strongly on complex deliverables including slide decks, financial modelling and legal analysis, while operating faster and at a lower cost than competing frontier models.
OpenAI also claims improved factual reliability. In internal evaluations, individual claims made by GPT-5.4 were 33 percent less likely to contain errors than those made by GPT-5.2, while its overall responses were 18 percent less likely to contain mistakes.
New Tool Search
The company has also introduced a new system called Tool Search to improve how models interact with external tools through the API. Previously, system prompts needed to include definitions for every available tool, which could consume large numbers of tokens. Tool Search allows the model to retrieve tool definitions only when required, reducing token usage and speeding up requests in applications with many integrated tools.
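The idea of loading tool definitions on demand can be illustrated with a small sketch. This is not OpenAI's Tool Search API, whose request format the article does not describe. It is a conceptual example in plain Python, and every name in it (ToolDefinition, ToolRegistry, search, prompt_size) is hypothetical. It uses simple keyword overlap as a stand-in for whatever retrieval method the real system uses, and character counts as a rough proxy for tokens.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToolDefinition:
    """A full tool definition, as it might be sent to a model (hypothetical shape)."""
    name: str
    description: str
    parameters: dict = field(default_factory=dict)


class ToolRegistry:
    """Holds every tool definition and returns only those relevant to a query."""

    def __init__(self, tools):
        self._tools = {tool.name: tool for tool in tools}

    def search(self, query, limit=3):
        # Score each tool by how many query words appear in its name or description.
        words = set(query.lower().split())
        scored = []
        for tool in self._tools.values():
            text = f"{tool.name} {tool.description}".lower().replace("_", " ")
            overlap = len(words & set(text.split()))
            if overlap:
                scored.append((overlap, tool))
        # Highest-scoring tools first; tools with no overlap are never returned.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [tool for _, tool in scored[:limit]]


def prompt_size(tools):
    """Rough proxy for token cost: characters in the serialised definitions."""
    return sum(len(json.dumps(asdict(tool))) for tool in tools)


if __name__ == "__main__":
    all_tools = [
        ToolDefinition("get_weather", "Look up the current weather forecast for a city",
                       {"city": {"type": "string"}}),
        ToolDefinition("convert_currency", "Convert an amount between two currencies",
                       {"amount": {"type": "number"}}),
        ToolDefinition("create_invoice", "Create an invoice for a customer",
                       {"customer_id": {"type": "string"}}),
    ]
    registry = ToolRegistry(all_tools)
    needed = registry.search("current weather forecast")
    print("Loaded tools:", [tool.name for tool in needed])
    print("Size with every definition:", prompt_size(all_tools))
    print("Size with only matched definitions:", prompt_size(needed))
```

In this sketch, only the definitions matching the request are serialised, so the cost of a request grows with the number of tools actually needed rather than with the total number of tools registered.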
OpenAI also introduced a new safety evaluation focused on chain-of-thought reasoning, the internal step-by-step explanations that models generate when solving complex tasks. Researchers have raised concerns that AI models might misrepresent this reasoning process under certain conditions.
According to OpenAI, early testing shows deception is less likely with GPT-5.4 Thinking, suggesting the model is less capable of hiding its reasoning. The results indicate that monitoring chain-of-thought behaviour remains an effective safety method.