Sarvam AI, the artificial intelligence (AI) startup launched to build Indian-language foundational models, is gearing up to launch its first commercial voice-to-voice endpoint tool, which will help businesses with voice-related functions such as customer support, said co-founder Vivek Raghavan.
“You can expect some voice-to-voice endpoints with at least 10 Indian languages and you can expect some experiences built on this and also example experience for people to use and build on top of it,” said Raghavan during a session addressing SaaS founders on March 7 at the SaaSBoomi annual event in Chennai.
Startups and businesses looking to build or use voice experiences, especially in Indian languages, as part of their services can tap into this tool, he added.
The firm recently released OpenHathi-Hi-v0.1, the first Hindi large language model (LLM) in the OpenHathi series. The model is built on Meta AI's Llama2-7B architecture, and according to Sarvam AI, it delivers performance on par with GPT-3.5 for Indic languages.
During the address at SaaSBoomi, Raghavan said that OpenHathi has shown improved performance in English-to-Hindi translation when compared to GPT-4 and GPT-3.
Raghavan also spoke about the challenges the firm currently faces in building Indian-language foundational models, including difficulty in data collection and token costs.
“There are challenges in data collection, quality data to be collected and the tokenisation cost is more for Indian languages. There are also evaluation challenges for something new like what we are building,” he added.
SaaSBoomi was founded in 2015 as an informal group of SaaS founders looking to network and learn from each other. The community now hosts more than 800 companies.
IIT Madras’ research lab AI4Bharat launched IndicVoices, an open-source dataset of natural and spontaneous speech covering 22 Indian languages, on March 6.
The mission of this dataset was to collect spontaneous speech in Indian languages, AI4Bharat said in a blog post. IndicVoices is funded by Bhashini, which is backed by the Ministry of Electronics and Information Technology, Ekstep Foundation, and Nilekani Philanthropies.
"This is something very good for the ecosystem. AI4Bharat's data will be a high-quality one for startups. That said, there is more work to be done when it comes to building a foundational model," Raghavan said.
In December 2023, Sarvam became the first Indian AI startup to raise $41 million in a Series A funding round, which was led by Lightspeed Ventures with participation from Peak XV Partners and Khosla Ventures.
Founded in July 2023 by Vivek Raghavan and Pratyush Kumar, who previously worked at AI4Bharat, which is backed by Infosys co-founder Nandan Nilekani, Sarvam develops a full-stack offering for Generative AI, ranging from research-led innovations in training custom AI models to an enterprise-grade platform for authoring and deployment.
Raghavan, an entrepreneur and technologist who was instrumental in building Digital Public Goods (DPGs) like Aadhaar, said that Sarvam will work with Indian enterprises to co-develop domain-specific AI models leveraging their data.