Moneycontrol PRO

MC EXPLAINER Inside Sarvam AI’s rapid-fire launches ahead of the India AI Impact Summit

From live AI dubbing of the Union Budget to speech, vision, and state-backed compute infrastructure, Sarvam AI is rolling out a broad stack of products in the run-up to the India AI Impact Summit.

February 09, 2026 / 16:38 IST
Sarvam's product blitz is likely to culminate with the launch of its first foundational AI model under the IndiaAI Mission, ahead of the India AI Impact Summit scheduled for February 16–20.
Snapshot AI
  • Sarvam AI launches new models for vision, speech, dubbing, and chat agents
  • AI dubbing used for Union Budget speech in multiple Indian languages
  • Sarvam partners with Odisha, Tamil Nadu for sovereign AI infrastructure projects

Peak XV and Lightspeed-backed Sarvam AI, which has emerged as a key player in India’s sovereign artificial intelligence (AI) push, is in the middle of a tightly packed two-week product blitz to showcase its latest AI models, platforms, and infrastructure plans.

The daily product ‘drops’, which began on February 1, span vision, speech recognition, dubbing, conversational agents, and text-to-speech. These releases are likely to culminate with the launch of its first foundational AI model under the IndiaAI Mission, just ahead of the India AI Impact Summit scheduled for February 16–20.

The approach is reminiscent of OpenAI’s ‘12 Days of OpenAI’ campaign in December 2024, which featured a mix of new models, features, and product enhancements over a 12-day period.

Why is this product rollout crucial for Sarvam?

For Sarvam, which also counts Khosla Ventures among its backers, the rollout underscores its mission to build a full-stack AI system designed to address the needs of India's highly diverse population. It also signals the growing maturity of the country's homegrown AI capabilities amid a fierce global race for AI dominance.

The rollout so far has attracted attention from global investors and AI experts. Deedy Das of Menlo Ventures noted in a recent post that while he was previously skeptical of Sarvam’s focus on Indic language models, he now views the startup's speech and optical character recognition (OCR) systems as among the strongest built for Indian languages.

"They're filling a well needed gap in the ecosystem and doing things big labs will probably never focus on to the fullest extent," he said.

Highlighting this shift in sentiment, IT Minister Ashwini Vaishnaw remarked in a post on X (formerly Twitter) that the country’s sovereign model strategy is delivering results. "Even the most critical reviewers are praising the technologically advanced model released by Sarvam as a part of our AI mission," he said.

"In parallel, our smart young engineers are working on innovations in materials science, healthcare and cybersecurity that will be noticed by the world as pathbreaking models," he added.

Here is a breakdown of Sarvam’s announcements so far.

1. Sarvam Vision for document intelligence

What was announced

Sarvam unveiled Sarvam Vision, a 3-billion-parameter vision language model focused on document digitisation and understanding, with a strong emphasis on Indian languages.

How it works

Sarvam Vision combines a sovereign vision language model with two additional components: a semantic layout parser and a reading order network. The system was trained on a mix of synthetic and real-world documents across English and 22 Indian languages. These include government documents, financial records, textbooks, newspapers, magazines, and historical manuscripts.
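To illustrate why a reading-order step matters for multi-column pages, here is a minimal Python sketch. It is not Sarvam's reading order network, which is a learned model; the `Block` type, the `column_gap` threshold, and the column-grouping heuristic are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A hypothetical layout block: text plus its top-left corner on the page."""
    text: str
    x: float  # left edge
    y: float  # top edge


def reading_order(blocks: list[Block], column_gap: float = 100.0) -> list[Block]:
    """Order blocks column by column, top to bottom within each column.

    Heuristic (an assumption, not a learned model): a block whose left edge
    lies within `column_gap` of the first block of the current column is
    treated as part of that column.
    """
    columns: list[list[Block]] = []
    for block in sorted(blocks, key=lambda b: b.x):
        if columns and block.x - columns[-1][0].x < column_gap:
            columns[-1].append(block)
        else:
            columns.append([block])

    ordered: list[Block] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y))
    return ordered


# A two-column page whose blocks arrive in scrambled order.
page = [
    Block("col2-top", x=320, y=40),
    Block("col1-bottom", x=20, y=300),
    Block("col1-top", x=22, y=40),
    Block("col2-bottom", x=318, y=300),
]
print([b.text for b in reading_order(page)])
```

A fixed heuristic like this breaks down on irregular layouts such as old newspapers, which is one plausible reason to use a trained network for this step instead.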

What is unique

Sarvam noted that the model approaches document intelligence as a knowledge extraction problem rather than simple text retrieval. It is designed to interpret tables, charts, and visual structures end-to-end.

The company also released the Sarvam Indic OCR (Optical Character Recognition) Bench, a dataset covering 22 Indian languages, to provide a standardised benchmark for Indic OCR beyond English-focused global datasets.

Sarvam Vision benchmark

Sarvam claimed that its Vision model is the best model by far in terms of "word accuracy" in Indian languages, outperforming Google's Gemini 3 Pro and Anthropic's Claude Opus 4.5.
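The article does not say how Sarvam computes "word accuracy". One common definition, assumed here, is one minus the word error rate: the word-level edit distance between the OCR output and the reference text, divided by the number of reference words. The Python sketch below follows that assumption; the function names and sample strings are illustrative only.

```python
def word_edit_distance(reference: list[str], hypothesis: list[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - (word edit distance / reference word count)."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    errors = word_edit_distance(ref_words, hyp_words)
    return max(0.0, 1.0 - errors / len(ref_words))


# One misrecognised word out of five reference words.
reference_text = "नमस्ते भारत आज बजट है"
ocr_output = "नमस्ते भारत आज बजत है"
print(f"{word_accuracy(reference_text, ocr_output):.2f}")
```

Under this definition, a single wrong character counts the whole word as an error, so word accuracy is a stricter measure than character accuracy.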

On X, Sarvam co-founder Pratyush Kumar shared some examples of digitising content from old Tamil books, complex multi-column layouts of an old Malayalam newspaper, and Hindi educational content with rich formatting and images with text in captions.

2. Sarvam Audio for speech recognition

What was announced

Sarvam introduced Sarvam Audio, an audio language model for speech recognition across Indian languages. The company said in its blog post that the model outperforms Gemini-3 and GPT-4o Transcribe on a range of benchmarks.

How it works

Sarvam Audio is an extension of Sarvam’s 3-billion-parameter language model trained from scratch on English and 22 Indian languages. Instead of treating speech as a simple audio-to-text task, the model treats speech as a contextual signal. This allows it to process long conversations, multi-speaker audio, overlapping speech, and noisy environments with greater reliability.

The system also allows applications to control the transcription format at inference time, supporting different styles such as code-mixed transcription, which reflects how Indian users naturally speak, often switching between regional languages and English in a single sentence.

What is unique

Sarvam Audio uses conversational history and domain context to resolve ambiguity in speech. It can also directly extract intent and parameters from audio, enabling speech-to-command functionality without passing through a separate transcription and interpretation layer. The company said this reduces latency and simplifies voice-agent system design.

3. Sarvam Dub and live multilingual dubbing

What was announced

Sarvam said it enabled live AI-powered dubbing of Finance Minister Nirmala Sitharaman's Union Budget speech on television, making it available in multiple Indian languages with a latency of under two minutes. The company claimed this marked the first time a national budget speech was dubbed live using AI.

How it works

The dubbing feature is powered by Sarvam Dub, an AI dubbing model built to translate speech into another language while retaining the original speaker’s voice characteristics. The model focuses on preserving speaker similarity so that tone, cadence, and voice identity remain consistent even after translation. This allows viewers to consume the speech in their preferred language without losing the familiarity of the original speaker.

What is unique

Sarvam stated that the most difficult part of live dubbing is reducing latency when the source text is not available in advance. To address this, the company optimised its model and serving pipeline to achieve a 6.6x reduction in latency over a base implementation.

Sarvam claimed this model has outperformed global rivals such as ElevenLabs and Cartesia in the "speaker similarity" metric.
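Sarvam's exact "speaker similarity" metric is not described in the article. A widely used approach, assumed here, is the cosine similarity between speaker embeddings of the original and dubbed audio. In the Python sketch below, the three-dimensional vectors are toy values standing in for embeddings that a speaker-verification model would normally produce.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Toy embeddings (hypothetical values, not real model output).
original_speaker = [0.9, 0.1, 0.4]
dub_close_voice = [0.8, 0.2, 0.5]    # dub that preserves the voice
dub_generic_voice = [0.1, 0.9, 0.2]  # dub with an unrelated voice

close = cosine_similarity(original_speaker, dub_close_voice)
generic = cosine_similarity(original_speaker, dub_generic_voice)
print(f"close voice: {close:.2f}, generic voice: {generic:.2f}")
```

A higher score means the dubbed voice sits closer to the original speaker in embedding space, which is the property the comparison with other vendors would be measuring under this assumption.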

Sarvam also highlighted that the system is already deployed at scale for recurring public communication, including dubbing the Prime Minister’s Mann Ki Baat address into 11 Indian languages every month.

The startup has also worked with IIT Madras to demonstrate dubbing for educational content.

4. Bulbul V3 for natural and production-ready Indian voices

What was announced

Sarvam launched Bulbul V3, the latest version of its text-to-speech model, aimed at delivering natural and production-ready voices for Indian languages.

How it works

Bulbul V3 uses a language model to analyse text and infer speech elements such as tone, pauses, emphasis, and pacing. This enables it to generate speech that sounds more natural and expressive, including for long-form content and conversational use cases.

It also supports voice cloning, allowing teams to create custom voices that maintain natural expressiveness and quality.

What is unique

The model offers over 35 voices across 11 languages, sourced from professional voice artists, and is built to handle code-switching, regional accents, numerics, and names, which are common challenges in Indian speech. It will soon expand to 22 Indian languages.

Sarvam said Bulbul V3 was evaluated in an independent third-party blind listening study across 11 languages and showed high listener preference and low error rates.

The startup claimed that this model outperformed global rivals such as ElevenLabs's V3 Alpha and Cartesia's Sonic-3 models. To drive adoption, Sarvam is offering unlimited API access to developers through this month.

5. Samvaad conversational agents platform

What was announced

Sarvam said its conversational agents platform, Samvaad, now handles over a million minutes of interactions every day.
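To put that figure in perspective, a rough back-of-the-envelope calculation in Python converts daily interaction minutes into the average number of conversations running at any moment. It assumes exactly one million minutes (the article says "over") spread evenly across the day, which real traffic would not be.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440


def average_concurrency(interaction_minutes_per_day: float) -> float:
    """Average number of simultaneous conversations, assuming an even spread."""
    return interaction_minutes_per_day / MINUTES_PER_DAY


# One million interaction minutes per day, the lower bound cited by Sarvam.
concurrent = average_concurrency(1_000_000)
print(round(concurrent))  # 694
```

Since call volumes typically peak during business hours, the platform's peak concurrency would be higher than this average.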

How it works

Samvaad powers AI-driven conversations across channels such as phone calls and WhatsApp. The platform is used for customer support, onboarding, sales, and large-scale public outreach. These agents operate continuously and can be deployed in hybrid workflows that combine automation with human escalation where required.

What is unique

Sarvam said it has observed that some agents, once they reach sufficient scale and reliability, generate feedback data that helps improve them further. This creates a rapid reinforcement loop, allowing these agents to scale faster over time. The company described such deployments as “rocketship” agents.

6. State partnerships for sovereign AI infrastructure

What was announced

Sarvam announced strategic partnerships with the governments of Odisha and Tamil Nadu to build sovereign AI infrastructure.

How it works

In Odisha, Sarvam is collaborating with the state to build a 50 MW AI-optimised compute facility that will function as an AI public utility. The facility is expected to support applications such as e-governance, healthcare triage, agriculture advisories, unified citizen helplines, AI tutors, and disaster management.

In Tamil Nadu, the partnership will take shape through Digital Sangam, a sovereign AI research park developed with the state government and IIT Madras. At its core will be a 20 MW AI-optimised data centre supporting research and governance workloads.

What is unique

Sarvam noted that these partnerships follow a top-down approach, with a state-wide mandate to deploy AI across departments rather than relying on isolated pilot projects. The company positioned the initiatives as a way to build long-term institutional and infrastructure capacity for AI-led governance.


Bhavya Dilipkumar
Vikas SN
Vikas SN covers Big Tech, streaming, social media and gaming industry
first published: Feb 9, 2026 04:38 pm
