India’s efforts to build its own large language models (LLMs) got a significant push this month with the release of Sarvam-M, a 24-billion-parameter open-weights AI model from Bengaluru-based startup Sarvam AI, one of the key players selected under the IndiaAI Mission.
But the release also sparked a debate: is this truly a breakthrough or just repackaged hype? Here's what you need to know.
What is Sarvam-M?
Sarvam-M (the M stands for Mistral) is a “hybrid” language model, meaning it is trained on a blend of tasks such as math, programming, reasoning and Indian language understanding. It is built on top of the open-source Mistral Small model and fine-tuned for Indian languages and use cases.
The model is designed to power applications such as chatbots, translation systems and educational tools and is available through Sarvam’s API and on Hugging Face.
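For readers who want to try it, the open weights can be loaded with the standard Hugging Face transformers library. The sketch below is a minimal illustration, not official Sarvam documentation; the model ID `sarvamai/sarvam-m` is our assumption, so verify the repository name on Hugging Face before running it.

```python
# Minimal sketch: loading Sarvam-M from Hugging Face with transformers.
# The model ID below is assumed, not confirmed; check huggingface.co first.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sarvamai/sarvam-m"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "भारत की राजधानी क्या है?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```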
"Sarvam-M represents an important stepping stone on our journey to build Sovereign AI for India," co-founder Vivek Raghavan said on X, formerly Twitter.
What’s under the hood?
According to Sarvam’s technical blog, the model underwent three core optimisation steps:
Supervised Finetuning (SFT): Sarvam curated a high-quality dataset by selecting English prompts and translating about one-third of them into Indian languages to improve the model’s performance in Indian linguistic and cultural contexts. These translations spanned 10 major Indic languages, including Hindi, Bangla, Tamil and Marathi, and were handled by Llama 3.1 8B models trained specifically for this task.
Each prompt was rendered in three forms: native script, romanised, and code-mixed (blending an Indic language with English), reflecting real-world usage patterns. The dataset mixed coding, math, reasoning and general-purpose prompts, with Indic-language samples making up 30 percent of the technical prompts and 50 percent of the rest.
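To make those ratios concrete, here is a toy reconstruction of the sampling scheme described above. The function and variable names are ours, not Sarvam's, and the language list is abridged to four of the ten reported Indic languages.

```python
# A toy reconstruction of the reported SFT sampling ratios; names are
# illustrative, not Sarvam's actual pipeline.
import random

INDIC_LANGS = ["hi", "bn", "ta", "mr"]  # Hindi, Bangla, Tamil, Marathi (abridged)
FORMS = ["native_script", "romanised", "code_mixed"]

def assign_language(prompt_type: str) -> tuple[str, str]:
    """Return a (language, surface form) pair for one SFT prompt.

    Technical prompts (coding/math/reasoning) are translated 30 percent of
    the time, all other prompts 50 percent, per Sarvam's reported ratios.
    """
    indic_share = 0.30 if prompt_type in {"coding", "math", "reasoning"} else 0.50
    if random.random() < indic_share:
        return random.choice(INDIC_LANGS), random.choice(FORMS)
    return "en", "native_script"

# Example: assign languages to a batch of prompt types.
for ptype in ["math", "coding", "general"]:
    print(ptype, assign_language(ptype))
```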
Reinforcement Learning with Verifiable Rewards (RLVR): The model was then trained to improve on tasks whose answers can be checked automatically, such as math and programming, by giving it reward signals: positive feedback when a response was verified as correct and negative feedback when it was not. A curriculum of tasks, prompt-sampling strategies and group-based optimisation methods such as Group Relative Policy Optimisation (GRPO) were used to lift performance over time, especially in math, programming and multilingual settings.
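The "group-based" part of GRPO can be illustrated with a small sketch. For each prompt, several completions are sampled and scored with a verifiable reward, for example 1.0 if the final answer checks out and 0.0 otherwise; each completion's advantage is its reward normalised against the group, so the policy is nudged toward above-average answers. This is our simplified illustration, not Sarvam's training code.

```python
# Simplified group-relative advantage computation, the core idea of GRPO
# (our own sketch). Normalising within the group means no separate value
# network is needed to estimate a baseline.
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against a zero std
    return [(r - mean) / std for r in rewards]

# Four sampled answers to one math prompt, two verified correct (reward 1.0).
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```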
Inference Optimisation: Sarvam-M also ships with deployment enhancements such as post-training quantisation (shrinking the model with minimal loss of accuracy) and dynamic serving configurations. It supports FP8 quantisation, an 8-bit floating-point format that makes the model faster and cheaper to run while largely preserving accuracy.
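In practice, FP8 serving of a model this size is typically done through an inference engine such as vLLM, which accepts an fp8 quantisation flag. The sketch below is illustrative only: the model ID is assumed, a GPU with FP8 support is required, and Sarvam's own serving stack has not been disclosed.

```python
# Illustrative FP8 serving sketch with vLLM (not Sarvam's actual stack).
# Assumes a recent vLLM build, an FP8-capable GPU, and the model ID below.
from vllm import LLM, SamplingParams

llm = LLM(
    model="sarvamai/sarvam-m",  # assumed Hugging Face repository name
    quantization="fp8",         # 8-bit floating-point weights for cheaper inference
)
params = SamplingParams(temperature=0.2, max_tokens=128)

outputs = llm.generate(["2 + 2 kitna hota hai?"], params)  # romanised Hindi prompt
print(outputs[0].outputs[0].text)
```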
Any real gains?
Yes, technically. On various benchmark tests for math, programming and Indic languages, Sarvam-M outperforms models of a similar size, including Meta’s Llama 4 Scout (17B), and even holds up against larger models like Llama 3.3 (70B) and Google’s Gemma 3 (27B) in specific categories.
For example:
- Over 86 percent improvement in math performance when questions are asked in romanised Indian languages
- More than 21.6 percent gain on math benchmarks overall
- A 20 percent improvement on Indian language understanding tasks
Is sovereign AI just a label?
The launch of Sarvam-M stirred a debate around India’s vision for “sovereign AI”. Some questioned the decision to build on top of Mistral, a French open-source model, instead of training one from scratch.
"Congratulations to the Sarvam team on their launch, they are doing amazing work. My confusion stems from the ask of the IndiaAI mission for an indigenous model. This isn't one, as it is built on top of Mistral," Srikanth Velamakanni, co-founder and CEO of Fractal AI told Moneycontrol.
He pointed to his company’s release, Fathom-R1-14B, a 14-billion parameter model launched last week. “It performs surprisingly well on benchmarks, beating OpenAI's o3-mini and o1-mini handsomely," he said.
"The IndiaAI Mission must clarify its expectations: is the goal to build a frontier model that beats benchmarks (albeit built on a pre-existing open source model), or is it to build a model entirely from the ground up?" Velamakanni said.
Despite its engineering heft, Sarvam-M got off to a slow start. Within three days of launch, the model had just 718 downloads on Hugging Face.
Deedy Das, a VC at Menlo Ventures, called the release “embarrassing”, saying, “No one is asking for a slightly better 24B indic model…I’m disappointed at their direction and expected them to accomplish more with their resources.” He compared it with a project by two Korean college students that hit 200,000 downloads.
In December 2023, Sarvam AI raised $41 million in a Series A funding round led by Lightspeed Ventures with participation from Peak XV Partners and Khosla Ventures.
But others defended Sarvam’s long-term play. “Pratyush Kumar, Vivek and team are going to be leaders in Indian AI because they put their heads down and execute. They are not 'thought leaders' who pass hot gas on Twitter. If there are no download numbers, even better,” Tarun Bhojwani, founder of cloop.ai, told Moneycontrol.
“This means only Sarvam will execute on top of the open source Sarvam models and win by building for Bharat. Like every other research firm before them, they are going to look like fools, until one day they look like geniuses. You can call them one or the other on Twitter, and it doesn’t matter. They know what they’re trying and I hope they succeed,” he added.
Tarun Pathak, Research Director at Counterpoint Research, also underscored the strategic importance of Sarvam’s approach.
“True AI adoption in a diverse country like India requires models that don't just translate but genuinely understand and operate in various Indian languages due to the diverse nature of the country. We believe India is likely to generate tonnes of data that is relevant to local needs and preferences and we need models beyond frontier models for various reasons,” he said.
“Sarvam AI is building its models from the ground up, with a focus on Indian linguistic nuances, cultural contexts and local datasets, which is likely to open up a lot of end-use cases. We believe Sarvam has the potential to grow big in the coming years, and specialised open-source models make more sense than getting into the race of frontier models that are coming almost every other day,” Pathak added.
Why build on another model?
Sarvam applied its supervised finetuning and reinforcement learning steps to Mistral Small, a 24-billion-parameter model licensed under Apache 2.0.
“We found the base Mistral Small model could be significantly improved in Indian languages,” the company said in the technical report.
“For instance, for Hindi, the model lacked basic understanding of whole numbers and arithmetic. Thus, we decided to use about one-third of the samples to create completions in Indic languages. Specifically, we convert about 30 percent of the coding, math, and reasoning prompts, and 50% of other prompts to Indian languages,” it added.
How does it fit into India’s AI plans?
Sarvam’s ambitions extend beyond Sarvam-M. Earlier this month, the company released Bulbul, a multilingual voice AI model. It is also working on a 70-billion-parameter multimodal model (text, image and speech), which is intended to be fully homegrown and is being funded under the IndiaAI Mission.
Separately, the government-backed BharatGen project recently released Param 1 Indic Scale, a 2.9-billion-parameter bilingual model trained from scratch with 25 percent Indic data.