Google unveils Gemini 1.5 Flash, an AI model for fast, low-latency applications

At Google I/O 2024, the tech giant also announced the next generation of its open source model Gemma 2 and a vision language model PaliGemma, as competition heats up.

Vikas SN

May 14, 2024 / 23:41 IST

Google is locked in a fierce battle for dominance with rivals such as OpenAI, Microsoft and Meta in the rapidly growing generative AI space

Google on May 14 unveiled a suite of new artificial intelligence (AI) models along with a series of feature enhancements to its Gemini family of models at the Google I/O 2024 developer conference.

During the conference, Google announced Gemini 1.5 Flash, the newest model in the Gemini family that is designed to be fast and efficient. Google DeepMind CEO Demis Hassabis said the model is optimised for high-volume, high-frequency tasks at scale, where low latency and cost matter most.

These announcements come as the tech giant battles rivals such as OpenAI, Microsoft and Meta for dominance in the rapidly growing generative AI space.

On May 13, OpenAI announced a new flagship AI model called GPT-4o that can reason across audio, text, and vision in real time and will be available to all free ChatGPT users.

Faster, more efficient Gemini model

Google first announced Gemini in December 2023 as a family of AI models expected to fuel the next generation of the company's AI advancements. It was the first AI model from the tech giant after the merger of its AI research units, DeepMind and Google Brain, into a single division called Google DeepMind, led by Hassabis.

In subsequent months, Gemini has become the main brand for all of Google's existing and future AI efforts, similar to what Microsoft is doing with its Copilot brand.

"Gemini is getting us closer to turning any input to any output, so you can think of it like an I/O for a new generation," Google CEO Sundar Pichai said.

The new Gemini Nano and Gemma 2.0

Gemini Nano, the smallest version of the Gemini AI model, can now process images in addition to text. This means that applications using Gemini Nano with multimodal capabilities will be able to understand the world the way people do, through sight, sound and spoken language, starting with Pixel smartphones.

Google also introduced PaliGemma, the company’s first vision-language model in the Gemma family of open-source models. Vision-language models are designed to understand and generate both visual and linguistic content by combining computer vision and natural language processing capabilities.

PaliGemma is inspired by the tech giant’s PaLI-3 vision-language model and is well-suited for tasks such as image captioning, visual question-answering, understanding text in images, object detection, and object segmentation, among others.

Google also announced Gemma 2, the next generation of its open-source models, which will feature a new architecture for better performance and efficiency.

PaliGemma will be available to Google Cloud customers in the Vertex AI Model Garden starting May 14, while Gemma 2 will be available soon, the company said.

Google also unveiled Veo, a video generation AI model, and Imagen 3, a new version of its text-to-image model.

Hassabis also shared a glimpse of Project Astra, the company's ambitious initiative to build universal AI agents that can help people in everyday life.


Vikas SN covers Big Tech, streaming, social media and the gaming industry

first published: May 14, 2024 11:41 pm
