Inside the tech behind Modi-Lex podcast: ElevenLabs India head explains the breakneck translation speed

ElevenLabs has a global team of around 150 people, including 10 in India, and plans to expand its go-to-market team further.

Reshab Shaw

March 18, 2025 / 13:23 IST

Prime Minister Narendra Modi on a podcast with Lex Fridman.

The Modi-Fridman podcast made waves globally, not just for the high-profile conversation between Prime Minister Narendra Modi and AI researcher and podcaster Lex Fridman, but also for the way the interview was seamlessly AI-dubbed into multiple languages, making it accessible to a much wider audience.

What really stood out was the speed: work that usually takes weeks was done in just a few hours.

The technology that made this possible was built by London-headquartered unicorn ElevenLabs, an AI voice synthesis company.

"The beauty of our model is that it is natively multilingual and understands context. It has emotion built into it," explained Siddharth Srinivasan, go-to-market leader for India at ElevenLabs. This means the AI doesn’t just translate; it picks up the feel of the conversation, adding pauses and even subtle changes in tone, Srinivasan added.

ElevenLabs is backed by venture capital firms Sequoia Capital and Andreessen Horowitz, along with entrepreneurs Nat Friedman and Daniel Gross, among others.

The Modi-Fridman episode is available in English, Hindi and Russian, with other languages to follow soon. In the meantime, Indian AI startup Sarvam AI’s co-founder Pratyush Kumar has released snippets of the podcast in nine Indian languages on X (formerly Twitter) and offered to share the full version with ElevenLabs.

How ElevenLabs pulled it off

Speed was a major factor in this project. Dubbing a conversation of this magnitude, with such high-profile figures, would typically take weeks. So how did ElevenLabs deliver?

The secret lies in the company's proprietary audio models. These aren’t just any speech synthesis models; they’re highly trained multilingual AI models that understand different languages, accents and contexts.

“The way the models are built, either through the voices we provide or the voices that you put into the platform or even the voices you synthesise, you literally have infinite voice possibilities. Because of the core part of the technology, what you're able to do is have these experiences, so that it doesn't seem like it's a translator but it comes across authentically in that person's voice,” Srinivasan said.

Not everything is left to AI, however; human oversight is still needed.

“We run this with a human-in-the-loop process where technology enables things to happen... and then you have a really meticulous, strong editorial process,” he added. This combination of AI speed and human accuracy made sure that the dubbed version sounded authentic.

ElevenLabs employs around 150 people spread across multiple countries, Srinivasan said. In India, it has about 10 employees and is expanding its go-to-market team. “We should be aggressively expanding this year to build out the India market… India is definitely a must-win market for us, we are investing locally. We're ensuring that India is part of a lot of the core product programs and research work,” Srinivasan said.

Behind the tech

When asked about the technical foundation behind ElevenLabs’ voice models, the India head clarified that their approach is quite different. “Our models are actually audio models that are our own. So that’s our IP,” he said.

Rather than relying solely on text-based models such as large language models (LLMs), ElevenLabs has built specialised audio models designed specifically for speech synthesis and voice cloning.

Srinivasan also highlighted three core models: Multilingual V2, which handles media workflows with high accuracy; Flash 2.5 for conversational use cases with fast processing; and Scribe, their speech-to-text model that delivers extremely precise transcriptions.

“We're not the cheapest product, in fact typically we find ourselves at the higher end of the price spectrum, and that's because we do believe we're delivering a very high-quality product doing a lot of things that did not happen before,” Srinivasan said.

Reshab Shaw Covers IT and AI

first published: Mar 18, 2025 01:19 pm
