
Alibaba unveils new open-source multimodal AI model: All the details

Part of the Qwen series, this 7-billion parameter model is optimized for deployment on edge devices like smartphones and laptops, said the China-based tech giant.

March 27, 2025 / 13:08 IST
Alibaba Qwen

Alibaba has introduced Qwen2.5-Omni-7B, a unified multimodal AI model designed to process and generate text, images, audio, and video. Part of the Qwen series, this 7-billion parameter model is optimized for deployment on edge devices like smartphones and laptops, said the China-based tech giant.

Despite its compact size, Qwen2.5-Omni-7B delivers strong multimodal capabilities, making it suitable for various applications, including real-time voice assistance and intelligent customer service interactions. It can assist visually impaired users by providing real-time audio descriptions, analyse cooking videos for step-by-step guidance, and enhance interactive AI conversations. “This unique combination makes it the perfect foundation for developing agile, cost-effective AI agents that deliver tangible value, especially intelligent voice applications,” said Alibaba.


The model is now open-sourced on Hugging Face and GitHub, with additional access via Qwen Chat and Alibaba Cloud’s ModelScope. The company added that Alibaba Cloud has previously open-sourced over 200 generative AI models.

Key features of the model