
DeepSeek's LLM success triggers big debate: Is India's hesitation a strategic mistake?

The success of DeepSeek's latest R1 LLM has sparked a debate over whether India is late in setting out to build its own foundational AI models, and what the nation needs to do next to remain self-reliant in computing systems.

January 28, 2025 / 18:17 IST
Representative image (AP)

DeepSeek, a little-known Chinese firm that has taken the world of artificial intelligence by storm, has reignited a heated debate over whether India should build use cases on top of existing Large Language Models (LLMs) or develop foundational models of its own. Experts are questioning whether India's hesitation in building a foundational LLM could be a strategic blunder.

Released last week, DeepSeek's R1 LLM has already outperformed OpenAI's ChatGPT and other advanced LLMs on most benchmarks, besides triggering a global selloff in AI-related shares over fears of competition and a rethink of AI investments.

The biggest casualty was Graphics Processing Unit (GPU) maker Nvidia Corp, whose shares plunged as much as 13 percent on Nasdaq on January 27, wiping out $465 billion in market value, its largest-ever single-day loss.

Experts back home are debating whether it was a mistake not to pursue India's own LLM, with China having firmly entered the global AI race and challenging the dominant US players.

Both countries have a complicated history: the US with its ability to stifle competition in strategic sectors through regulation, and China with its lack of transparency.

To be sure, India has built a couple of LLMs, albeit on a much smaller scale. While DeepSeek has 671 billion parameters, India's Sarvam-1 is a two-billion-parameter model specifically optimised for Indian languages.

Indian startups like Sarvam and Krutrim are making significant strides in the development of LLMs tailored to the local market and diverse languages.

AI research and development startup Sarvam, which raised around $41 million in 2023, focuses on building language models that can understand and generate multiple Indian languages, catering to the country’s multilingual demographic.

Similarly, Ola's founder Bhavish Aggarwal-led Krutrim is working on creating robust, region-specific AI models that enhance the capabilities of businesses to interact with customers in their native languages, integrating contextual and cultural understanding.

However, experts point out that India needs 'tens and hundreds of Sarvam-like' companies to win this race.

"Sarvam is just one company, and while it's a strong start, we need at least 10 more like it. With a country of 1.4 billion people, it's essential to have multiple players in this space. Just as we don't rely on a single company for infrastructure like roads, AI and LLMs should have a similar breadth of innovation and competition. It's a fundamental need for India's future," said Umakant Soni, Chairman of AI Foundry and co-founder of ARTPARK.

It is not that India does not want to build its own LLM; the question is whether it is too late to the party of impactful global LLMs.

"We will be doing that (building an LLM). We are working on getting the GPU design with our IP rights owned by our country in the next 3 to 5 years, our own foundation model," Union Minister for Electronics and IT (MeitY), Ashwini Vaishnaw, had told Moneycontrol on January 23, on the sidelines of the World Economic Forum in Davos, Switzerland.

However, he added that the major focus will be on the solution space.

Vaishnaw's comments echo what many technocrats believe should be the way ahead for India, including industry veterans such as Infosys founder NR Narayana Murthy and co-founder Nandan Nilekani.

Nonetheless, many technocrats and industry experts differ.

“India doesn't wish to be just a trade colony of China or technology colony of the US. What makes AI different is that it needs a whole-of-nation approach involving deep-tech startups, enabling industrial policy and pre-commercial publicly-funded research. Only when all three come together magic can happen,” Indian software products industry think tank iSPIRT Foundation's co-founder, Sharad Sharma, told Moneycontrol.

Similarly, Umakant Soni, Chairman of AI Foundry and co-founder of ARTPARK, called for strategic independence in AI, terming this India's 'DeepSeek moment'.

"This is a ‘DeepSeek moment’ for India, an opportunity to create competitive, open, and cost-effective AI solutions, leveraging our vast developer talent pool,” he said. "Imagine if our trains, air traffic control, and strategic systems were being decided by AI—we cannot rely on someone who can sanction us or double tariffs overnight."

Global AI pioneers such as Yoshua Bengio have highlighted the geopolitical implications of AI. "Countries that invest in building their AI systems will have a geopolitical advantage," Bengio told Moneycontrol, adding that governments need to fund and incentivise foundational model development. He even warned that reliance on foreign AI systems could leave nations vulnerable in strategic sectors.

Founder of DeepLearning.AI Andrew Ng also weighed in on the subject, saying that India has the potential to lead in both training foundational models and developing applications.

DeepSeek’s Model Explained

DeepSeek's models use a 'Mixture-of-Experts' architecture, in which many smaller specialist sub-models share the work and only a few are activated for any given input. Although the model has 671 billion parameters, only 37 billion are active at any given time, advisory firm Bernstein said in a report.
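The routing idea behind a Mixture-of-Experts layer can be sketched in a few lines of Python. The dimensions, gating scheme, and expert counts below are illustrative assumptions for a toy example, not DeepSeek's actual implementation; the point is only that a router selects a small subset of experts per token, so most parameters stay idle.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class MoELayer:
    """Toy Mixture-of-Experts layer: a router scores all experts for a token,
    but only the top-k experts actually run, so only a small fraction of the
    layer's parameters is active for any given input."""
    def __init__(self, d_model=16, d_hidden=32, n_experts=8, k=2):
        self.k = k
        self.router = rng.normal(size=(d_model, n_experts))        # gating weights
        self.w_in = rng.normal(size=(n_experts, d_model, d_hidden))
        self.w_out = rng.normal(size=(n_experts, d_hidden, d_model))

    def __call__(self, x):
        # x: (d_model,) hidden state for one token
        scores = x @ self.router                    # one score per expert
        top = np.argsort(scores)[-self.k:]          # indices of the top-k experts
        s = scores[top] - scores[top].max()         # stable softmax over chosen experts
        gates = np.exp(s) / np.exp(s).sum()
        out = np.zeros_like(x)
        for g, e in zip(gates, top):                # only k of n_experts execute
            out += g * (relu(x @ self.w_in[e]) @ self.w_out[e])
        return out, top

layer = MoELayer()
y, active = layer(rng.normal(size=16))
print(f"active experts: {sorted(active.tolist())} of 8")
```

In this sketch only 2 of 8 experts run per token; scaling the same idea up is how a 671-billion-parameter model can activate only 37 billion parameters per token.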

This architecture is coupled with a number of other innovations including Multi-Head Latent Attention, which substantially reduces required cache sizes and memory usage.
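The scale of that cache saving can be shown with back-of-envelope arithmetic. Standard multi-head attention caches full key and value vectors for every head per token, while an MLA-style design caches a single compressed latent vector from which keys and values are re-projected. The dimensions below are illustrative assumptions, not DeepSeek's exact configuration.

```python
# Back-of-envelope KV-cache comparison: standard multi-head attention vs a
# latent-compressed cache in the spirit of Multi-Head Latent Attention (MLA).
# All dimensions are assumed, illustrative values.
n_heads, head_dim = 128, 128
latent_dim = 512  # assumed size of the compressed per-token KV latent

# Standard MHA: cache both K and V for every head, for every token.
mha_floats_per_token = 2 * n_heads * head_dim

# MLA-style: cache one shared latent vector per token instead.
mla_floats_per_token = latent_dim

ratio = mha_floats_per_token / mla_floats_per_token
print(f"standard MHA : {mha_floats_per_token} values per token")
print(f"latent cache : {mla_floats_per_token} values per token")
print(f"reduction    : {ratio:.0f}x")
```

Under these assumed dimensions the per-token cache shrinks 64-fold, which is the kind of memory saving that makes long-context inference far cheaper.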

These innovations together result in models that perform better on numerous benchmarks, such as language, coding, and mathematics, while requiring a fraction of the compute resources to train.

For example, DeepSeek's V3 model required around 2.7 million GPU hours on 2,048 H800 GPUs to pre-train, only around 9 percent of the compute required to pre-train the similarly sized open-source LLaMA 405B model.
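The "around 9 percent" figure can be sanity-checked with simple division, assuming the roughly 30.8 million GPU hours publicly reported for pre-training LLaMA 3.1 405B (that denominator is an assumption for illustration, not a figure from this article).

```python
# Rough check of the "around 9 percent" compute claim.
deepseek_v3_gpu_hours = 2.7e6   # pre-training budget cited in the article
llama_405b_gpu_hours = 30.8e6   # assumed publicly reported LLaMA 3.1 405B budget

share = deepseek_v3_gpu_hours / llama_405b_gpu_hours
print(f"DeepSeek V3 used roughly {share:.0%} of LLaMA 405B's pre-training compute")
```

Under that assumption the ratio works out to just under 9 percent, consistent with the Bernstein estimate quoted in the report.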

"(DeepSeek's models are) ultimately producing as good or (in most cases) better performance on a variety of benchmarks. And DeepSeek R1 (reasoning) performs roughly on par with OpenAI's o1 model," Bernstein analysts Stacy Rasgon, Alrick Shaw, and Arpad von Nemes wrote in the report.

Aravind Srinivas, founder of Perplexity AI, recently took to X to clarify some misconceptions about China's AI developments, particularly surrounding DeepSeek's R1 model.

One of the biggest misunderstandings, according to Srinivas, is the claim that China simply "cloned" OpenAI's outputs. He argues that this view is based on an incomplete understanding of how LLMs are trained. DeepSeek R1, he points out, has made significant strides in Reinforcement Learning (RL) fine-tuning, an area crucial to AI model improvement.

He added that DeepSeek's ability to be self-hosted gives companies the option to keep data within their jurisdiction, an approach that challenges the traditional understanding of AI training and deployment.

Srinivas emphasised that DeepSeek R1 is an important step forward in the global AI race.

Satya Gupta, President of VLSI Design and member of the National Committee on Electronics Manufacturing, believes that India should choose smart techniques like DeepSeek's to build models rather than just go the brute-force way.

"I have a lot of respect for Nandan Nilekani and everybody, but I believe that we really need people to build this model, but choose smart techniques to build this model," Gupta told Moneycontrol. He added that India needs both hardware and models, "because tomorrow we become good at this and we don't have the access to GPUs. Suppose it's banned for India, then what?"

"Come up with this... Which could be GPU and create a smart way of training the model, which takes much less energy, much less compute power, just like what we've seen with DeepSeek,” Gupta said. However, he urged the government to make funds available not just to academia but to product companies too. “You cannot just do it on your own or just from VC (venture capital) money. Government has to step in and say, this is important for the country and we will fund such activities to the industry, not just to academia.”

IT Industry’s Take

The Indian information technology (IT) sector is of the view that the key opportunity lies in training LLMs using vast amounts of data, rather than developing these models from scratch.

Speaking to Moneycontrol at the World Economic Forum in Davos, Tata Consultancy Services Chief Executive Officer K Krithivasan said the strength of LLMs comes primarily from the ability to train them with vast amounts of data.

“We'll be able to help our customers train these models, fine-tune those models, and new paradigms like agentic AI come into play. And how do you orchestrate this model? This is where our core strength comes from,” Krithivasan said.

Even Wipro says it will not get into the race of building LLMs or Small Language Models (SLMs). “As an organisation, I don't think we want to build LLMs or SLMs to that extent, because we want to leverage what is there. We've got partners through which we're going to do that,” Wipro’s Chief Executive Officer Srinivas Pallia said on January 17 after declaring its third quarter results.

While Wipro’s cross-town rival Infosys has built four SLMs and is building more of these, it is silent on foundational models.

Need for a 'Whole-of-Nation' Approach

Sharad Sharma of iSPIRT argues that India must pivot from being a 'use-case capital' to a leader in foundational AI development.

“Yes, we have lost some time due to the use-case capital camp. But all is not lost. The field is still young and many areas like neurosymbolic AI are very much open,” Sharma added. He pointed at the success of China’s DeepSeek R1 and Kimi K1.5, achieved through a whole-of-nation approach, combining government support, deep-tech startups, and pre-commercial research.

"Our resistance to this approach is understandable, given that our IT services and SaaS industries grew without it. But AI demands a new playbook," Sharma said.

Sharma draws parallels to India’s success with cryogenic engines, 4G/5G telecom equipment, and the India Stack, asserting that when India focuses its collective efforts, the results are transformative.

Another AI industry expert highlighted the urgency of the situation. “China’s entry into the LLM game with DeepSeek is a wake-up call for India—hesitation in building foundational AI models today could cost us strategic autonomy tomorrow. In the race for AI dominance, every delay compounds the gap,” said an industry expert who spoke to Moneycontrol on condition of anonymity.

The sentiment is shared by Bhaskar Majumdar, Managing Partner of Unicorn India Ventures, who believes that India's AI strategy must evolve.

"The DeepSeek moment reminds us that accepting the status quo is a wrong strategy. India is heavily reliant on Nvidia GPU-powered solutions, but significant disruptions are on the horizon with power-efficient chips that are high on performance. Anticipating and preparing for this change will benefit India's solutions community," Majumdar said.

Learning from Global Examples

The Biden Administration’s AI Diffusion Executive Order underscores the geopolitical stakes in AI dominance, with restrictions aimed at limiting the export of cutting-edge GPUs to countries like China. Meanwhile, China’s success with DeepSeek highlights the benefits of state-backed, large-scale investments in AI infrastructure.

Umakant Soni draws a parallel with India’s success stories in other sectors, saying, “With the right investments, we could create an AI ecosystem that mirrors the success of UPI or ISRO. India has the talent and cost-effectiveness to build competitive AI solutions at scale. We need ten DeepSeek-like companies because the scale of our population, 1.4 billion people, demands it.”

What Are VCs Saying

Narendra Bhandari, General Partner at Seafund, a venture capital firm that invests in deep-tech and AI startups, said he sees an untapped opportunity for Indian startups to develop AI models at lower costs.

“DeepSeek and other newer models have demonstrated a fantastic opportunity for India. The government should prioritize funding capital build-outs, attracting global talent, and balancing research funding between universities, research labs, and private investors,” he said.

Bhandari also stressed the importance of accelerating the adoption of Indian-designed computing systems to ensure the ecosystem remains competitive and self-reliant.


Many are of the opinion that indigenous LLMs are not just about technological advancement—they are critical for addressing India's unique linguistic diversity and providing industries with localised solutions.

"If India does not invest in building its own LLMs, it risks becoming a mere consumer in the global AI economy, losing both digital sovereignty and economic opportunities," said Abhivyakti Sengar, Senior Analyst at Everest Group.

To sum up, every nation and every geography is looking to build AI supremacy, and India is no different. One prerequisite for that is nuanced, bespoke LLMs, which means ensuring that the country's tapestry and diversity are reflected in the data sets.

"One of the implications if India doesn't build LLMs will be the AGIs or the AI systems not firing robustly, and the efficiency and effectiveness will not be built," said Sameer Dhanrajani, CEO of AIQRATE & 3AI.


Reshab Shaw covers IT and AI
Bhavya Dilipkumar
first published: Jan 28, 2025 05:39 pm
