Google on March 12 unveiled the next generation of its open model, Gemma 3, as the race to dominate the rapidly evolving artificial intelligence (AI) sector intensifies.
Gemma 3, a collection of lightweight open models, has been built from the same research and technology that powers its flagship Gemini 2.0 AI models, the tech giant said. These models have been designed to run fast, directly on devices — from phones and laptops to workstations — helping developers create AI applications.
The company claimed that Gemma 3 is the most capable model one can run on a single graphics processing unit (GPU) or tensor processing unit (TPU), outperforming Meta's Llama-405B, DeepSeek-V3 and OpenAI's o3-mini in preliminary human preference evaluations on LMArena's leaderboard.
Google introduced the Gemma family of open models in February 2024 as part of its strategy to attract developers and researchers to its AI offerings and compete with Meta's Llama, which also provides open AI models.
The company said these models have been downloaded over 100 million times, and the developer community has created more than 60,000 Gemma variants to date.
Gemma 3 will be available in a range of sizes - 1B, 4B, 12B and 27B parameters - and offer out-of-the-box support for over 35 languages and pretrained support for over 140 languages, with a 128k-token context window. It can also analyse images, text and short videos.
Gemma 3 integrates with developer tools such as Hugging Face Transformers, Ollama, JAX, Keras, PyTorch and others. Developers can access Gemma 3 through Google's free web-based developer tool AI Studio, or download the model from Hugging Face or Kaggle. One can request access to the Gemma 3 API through AI Studio.
The launch comes against the backdrop of Chinese AI lab DeepSeek claiming to have built AI models that can rival top-tier models from Google and other US companies such as OpenAI and Meta at a fraction of the cost. DeepSeek's launch earlier this year caused fresh concerns among investors over the billions of dollars being poured in by tech companies to develop their AI models and products.
In February, however, Sundar Pichai, CEO of Google parent firm Alphabet, argued that the search giant's Gemini Flash 2.0 and Flash Thinking 2.0 models are "some of the most efficient models" out there, including compared to DeepSeek's V3 and R1.
"I think part of the reason we are so excited about the AI opportunity is we know we can drive extraordinary use cases because the cost of actually using it is going to keep coming down, which will make more use cases feasible. And that's the opportunity space. It's as big as it comes. And that's why you're seeing us invest to meet that moment," Pichai said during the company's earnings conference call.
Alphabet plans to invest around $75 billion in capital expenditures in 2025 to bolster its AI efforts. The investment will be made towards building out technical infrastructure, primarily for servers, followed by data centers and networking.