Meta bets on India's AI talent with Llama 3.1, Hindi support

Llama 3.1 is touted as the first "frontier-level" open-source AI model capable of matching the performance of leading proprietary foundation models from OpenAI, Google, and Anthropic.

Vikas SN

July 26, 2024 / 12:01 IST

Meta is now allowing developers to use synthetic data outputs from Llama models to create derivative models or train other models

As competition in the artificial intelligence (AI) space heats up with players like OpenAI and Google, Facebook parent Meta is turning to India's expanding AI developer base.

Meta's newly released Llama 3.1 model, introduced on July 25, includes support for Hindi and seven other languages. It will be offered in three variants: a flagship model with 405 billion parameters, along with updated versions of the existing 8-billion- and 70-billion-parameter models.

Parameters are the numerical weights a model learns during training and essentially hold its 'knowledge'. Models with more parameters typically perform better because they can capture more context.

Ragavan Srinivasan, Meta's vice president of product management, told Moneycontrol that languages are a crucial element for developers building on the company's base model, with Hindi being the first of many languages.

"Ultimately, we want to provide a strong foundation in a bunch of different languages and then give that as a base for local developers to customise it and fine-tune it based on their needs," Srinivasan said.

Llama 3.1 is touted as the first "frontier-level" open-source AI model capable of matching the performance of leading proprietary foundation models from OpenAI, Google, and Anthropic. According to Meta, the flagship model even outperforms these competitors on certain benchmarks, while the smaller models are competitive with both closed and open models of similar parameter counts.

Srinivasan said that an open-source model is essential for developers, particularly in diverse markets like India, as it provides the flexibility to address various nuances and cultural customisation needs.

Developers will be able to create specific use cases, products, and applications not only from a consumer perspective but also to serve public sector or government needs, without worrying about being locked into a proprietary vendor or cloud provider, he added.

The new model took several months to train. To create datasets in local languages, the company partnered with Indian AI startups such as Sarvam.AI, which is backed by Lightspeed and Peak XV, in addition to using publicly available content and collaborating with various vendors. Sarvam.AI is building a series of AI models called OpenHathi, which are based on the Llama model.

"A large part of what goes into the (foundational) model is general knowledge, you want the base models to be able to be good at concepts like mathematics, physics, reasoning, and logic. Those tend to transcend languages because they're universal constructs and then you use the language training data to make sure the model is able to understand the questions in different languages," Srinivasan said.

New licensing terms

Another key change Meta introduced with the new model is its licensing terms. Developers can now use synthetic data generated by Llama models to create derivative models or train other models. In fact, the company said that the updated Llama 70B and 8B models are themselves derivatives of the Llama 405B model.

"We think of the Llama 405B model as the teacher model that you can use to distill and generate synthetic data and create custom smaller models that are optimised for your use case. We believe this measure gives a lot of flexibility to developers," Srinivasan said.

Following this model's release, Meta plans to invest significant effort in developer outreach and awareness-building initiatives.

"This is a big area of focus for me and my team. Now that we have the most powerful model available for everyone, we have to invest a lot of time in helping the developer community understand its power and take advantage of it," Srinivasan said.

Meta is working with over two dozen ecosystem partners, including Microsoft Azure, Amazon Web Services, Google Cloud, Groq, Nvidia, Scale.AI, Dell, and Databricks, to accelerate the adoption of Llama 3.1 among developers. This collaboration will empower developers to fine-tune and develop their own models, ultimately driving enterprise adoption.

On July 25, Meta announced a partnership with the National Association of Software and Service Companies (Nasscom), India's IT industry body, to launch an open-source generative AI challenge for startups and developers. The Centre for Development of Advanced Computing (C-DAC), under the Ministry of Electronics and Information Technology (MeitY), will serve as the technology partner, providing compute infrastructure support.

"The startup community in India is incredibly vibrant. So there is a huge amount of growth that will come from this community," Srinivasan said.

The government and public sector will be another crucial area of growth for Llama adoption in the country, he added.

Meanwhile, rival Google is also wooing Indian AI developers with a suite of products, tools, and access to its latest models, Moneycontrol reported on July 17. The American tech giant released Gemma 2, the next generation of its open-source AI model, to all Indian developers and expanded the context window of its flagship Gemini 1.5 Pro model to 2 million tokens.

Llama Stack

In the coming months, Meta plans to introduce tools, reference guides, how-to guides, and APIs to simplify the developer experience in utilising Llama models to create solutions and products for both domestic and global markets.

The social networking giant is also building 'Llama Stack', a set of standardised interfaces and APIs for developers building on top of Llama foundation models.

"If you think about how Linux evolved, first it was Linux and then you had the LAMP (Linux, Apache, MySQL, and PHP) stack. Similarly, Llama Stack will make it easy for developers to start getting up and running as quickly as possible," Srinivasan said. The LAMP stack is a popular open-source software stack that developers use to build and maintain websites and web applications.

According to Srinivasan, the company is also working on adding more capabilities, including multimodal ones, to the Llama model.

"As Llama starts to become multimodal, then you will be able to have richer product experiences built on top of it," he said.

Srinivasan further added, "I believe Indian developers will want the best price performance. They will also want capabilities like speech so that you can actually talk to these models. That will unlock a much bigger market for them."

Llama 3.1 is also powering Meta's artificial intelligence chatbot, Meta AI, which has rolled out support for seven new languages, including Hindi and Hindi-Romanized Script (or Hinglish).

The company extended Meta AI to India in June, months after it was rolled out in more than a dozen countries, including the United States, Australia, Canada, New Zealand, and Singapore.

Users will be able to access the flagship Llama 405B model on Meta AI's web app and WhatsApp in the United States, the company said.

Vikas SN covers Big Tech, streaming, social media and gaming industry

first published: Jul 26, 2024 12:01 pm
