Nvidia's leadership in artificial intelligence (AI) chips is under new pressure as the computing emphasis in AI shifts from model training to inference, a shift fuelled by Chinese start-up DeepSeek and its groundbreaking R1 model, as reported by the Financial Times.
Inference, in which AI models respond to user requests, is becoming a key requirement in AI computing. The trend is opening the door for Nvidia's competitors, including start-ups Cerebras and Groq as well as technology giants Google, Amazon, Microsoft, and Meta, to challenge Nvidia's dominance.
“The opportunity right now to make a chip that is vastly better for inference than for training is larger than it has been previously,” said Andrew Feldman, CEO of Cerebras.
The growing inference market
Nvidia's chips have dominated large-scale AI model training facilities, such as Elon Musk's xAI project and OpenAI's Stargate collaboration with SoftBank. However, as AI applications proliferate, demand for smaller data centres focused on inference is rising.
Vipul Ved Prakash, CEO of Together AI, spoke to the increasing significance of inference. "I think doing inference at scale is going to be the largest workload on the internet at some point," he stated.
More than 75% of future computing and power demand in US data centres will be inference-driven, according to Morgan Stanley analysts. Barclays expects capital spending on inference in "frontier AI" to grow from $122.6 billion in 2025 to $208.2 billion in 2026.
Though Barclays forecasts that Nvidia will retain "essentially 100% market share" in AI training, it expects the company to address only about 50% of inference computing in the long run, leaving almost $200 billion in prospective chip spending up for grabs for rivals by 2028.
DeepSeek and the push for efficiency
DeepSeek's R1 and v3 models have accelerated the shift to inference by improving efficiency and cutting costs. Those advances triggered a stir in stock markets earlier this year, underscoring how quickly the industry is changing.
"The cost of using a specific amount of AI drops around 10x every 12 months, and lower cost means a lot more usage," OpenAI CEO Sam Altman said.
Inference tasks, which require greater memory for processing longer and more complex queries, have also opened the market to alternatives beyond Nvidia’s general-purpose GPUs. Feldman noted that Cerebras’ chips, used by French AI start-up Mistral, have demonstrated significant performance improvements.
“We are producing answers for Le Chat in sometimes a second while [OpenAI’s] o1 would have taken 40,” Feldman said.
Nvidia’s response and market outlook
Despite the competition, Nvidia maintains that its chips remain strong for both training and inference. CEO Jensen Huang noted that Nvidia's new Blackwell chips were designed with better inference performance in mind.
"The level of inference compute required is already 100x higher than it was when large language models began," Huang said. "And that's just the beginning."
Nvidia cites a 200-fold increase in inference performance over the last two years and claims that its architecture is still flexible and broadly applicable to AI workloads.
"Our architecture is fungible and easy to use in all of those different ways," Huang added.
Some industry executives, however, envision opportunities for specialized inference accelerators. Feldman highlighted the need for speed, stating, "Even microseconds [of delay] reduce the attention of the viewer."
But Prakash of Together AI—whose firm has Nvidia as an investor—added that flexibility is a major strength in a rapidly changing field. "The one benefit of general-purpose computing is that while the model architectures are shifting, you just have more flexibility," he said.
A complex future for AI chips
The increasing need for efficient, inference-oriented chips is transforming the AI semiconductor industry. Although Nvidia's broad strategy keeps it in the lead, specialized start-ups and Big Tech companies are aggressively pursuing niche opportunities.
With AI applications growing and becoming more diversified, industry observers expect a multi-player, complex battle for inference computing dominance.