It has already been half a year since notable figures such as Elon Musk, Steve Wozniak, and Yoshua Bengio signed an open letter urging tech companies to pause the development of AI language models more powerful than OpenAI's GPT-4.
Of course, that didn't happen.
Instead, the Sam Altman-led company launched several new features that are making waves in the AI space and widening the gap between OpenAI and its competitors.
ChatGPT, OpenAI's chatbot built on its family of pre-trained transformer models, now has vision capabilities through GPT-4V (GPT-4 Vision). It can analyse images and other visual content, and it also supports speech input.
OpenAI also announced DALL-E 3, the third version of its generative AI visual art platform, which now lets users use ChatGPT to create prompts and includes more safety features.
Let's take a look at what these updates are, how users are using them and the issues around them.
What is GPT-4V?
GPT-4V, a new multimodal model from OpenAI, allows users to ask questions about an image and receive text-based answers. This Visual Question Answering (VQA) feature began rolling out on September 24 and is available to ChatGPT Plus subscribers on both the iOS app and the web interface.
How to use GPT-4V?
Using GPT-4V requires a ChatGPT Plus subscription ($20/month); subscribers can upload images via the website or the mobile app.
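For developers, image questions of this kind can also be framed programmatically. The sketch below assembles the kind of chat-completions payload OpenAI uses for vision inputs, pairing a text question with an image URL; the model name and field layout are assumptions based on OpenAI's documentation at the time, so check the current API reference before relying on them. The payload is only constructed here, not sent.

```python
# Sketch: framing an image question for a GPT-4V-style chat API.
# Model name and payload shape are assumptions; verify against
# OpenAI's current API documentation.

def build_vision_request(question: str, image_url: str,
                         model: str = "gpt-4-vision-preview") -> dict:
    """Assemble a chat-completions payload that pairs a text
    question with an image URL (the shape used for vision input)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 300,
    }

payload = build_vision_request(
    "What is written on this whiteboard?",
    "https://example.com/whiteboard.jpg",
)
```

In practice this dictionary would be passed to the chat-completions endpoint with an API key; the answer comes back as ordinary text, just as in the ChatGPT interface.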
Use cases
OpenAI's GPT-4 with Vision is being hailed by some as a critical frontier in AI research and development.
As more users gain access to the new feature, they are sharing examples of how GPT-4 with Vision works. For example, the model can:
Analyse handwriting: GPT-4 with Vision can accurately transcribe handwritten text, even if it is messy or difficult to read. This could be useful for a variety of tasks, such as digitising historical documents.
Create code with a drawing: GPT-4 with Vision can take a napkin drawing of a website design and generate code to implement the design.
Building on the concept of AutoGPT, Matt Shumer, CEO of AI startup HyperWrite, has developed a new system that uses GPT-4V to continually improve code on its own.
The system works by using the output of one run as the prompt for the next, allowing it to refine and iterate on the code until it reaches a satisfactory level.
Teaching assistant: Users can engage in conversations with the chatbot to gain insights into a wide array of subjects. As demonstrated by Mckay Wrigley, GPT-4V can decipher intricate infographics, such as the one illustrating the components of a human cell.
As his example illustrates, it can provide a concise explanation of the cell's structure that a ninth-grade student can understand.
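The self-improving loop described above can be sketched in a few lines. This is a minimal illustration of the feed-output-back-in idea, not Shumer's actual system: `call_model` is a hypothetical stand-in for a real GPT-4V request, replaced here with a placeholder so the loop runs without an API key.

```python
# Minimal sketch of an iterative self-refinement loop: the output of
# one model run becomes the prompt for the next.

def call_model(prompt: str) -> str:
    # Placeholder for a real GPT-4V API call. Here it just appends a
    # marker so the loop is runnable and observable without a key.
    return prompt + " [refined]"

def refine(initial_code: str, rounds: int = 3) -> str:
    """Feed each run's output back in as the next prompt, stopping
    after a fixed number of rounds (a real system would instead stop
    when the result is judged satisfactory)."""
    code = initial_code
    for _ in range(rounds):
        code = call_model(code)
    return code

result = refine("def hello(): pass")
```

A production version would also need a stopping criterion, such as the model itself judging the code satisfactory, rather than a fixed round count.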
AI art gets a makeover
OpenAI also unveiled Dall-E 3, the latest iteration of its text-to-image model, which boasts improved accuracy compared to its predecessor, Dall-E 2.
Dall-E 3 is capable of understanding nuances and details, making it easier for users to translate their ideas into images.
Dall-E 3 is in the research preview stage and will become accessible to ChatGPT Plus and Enterprise customers via the API starting in early October.
However, Dall-E 3 is accessible for free through Microsoft Bing, powering the Bing Image Creator tool. With Bing Image Creator, users can describe an image they have in mind, provide additional context such as location or activity, and specify an art style. The tool then generates the image based on these inputs.
Limitations and safeguards
The Bing Image Creator operates under similar limitations as Dall-E 3, which means it cannot generate explicit or violent content. Additionally, requests for images of public figures by name or images in the style of living artists will be declined by Dall-E 3.
“DALL-E 3 has mitigations to decline requests that ask for a public figure by name. We improved safety performance in risk areas like the generation of public figures and harmful biases related to visual over/under-representation in partnership with red teamers—domain experts who stress-test the model—to help inform our risk assessment and mitigation efforts in areas like propaganda and misinformation,” OpenAI said.
It's worth noting that all images generated by Bing Image Creator now come with an embedded digital watermark following the Coalition for Content Provenance and Authenticity (C2PA) specification. This watermark contains information about the image's creation time and date and serves to verify that the image was generated by an AI system.
What are the concerns?
OpenAI has identified several potential risks associated with the use of GPT-4V, including:
Privacy risks: GPT-4V can identify people in images and determine their location, which could have implications for companies' data practices and compliance.
Bias: GPT-4V's image analysis and interpretation could be biased against certain demographic groups.
Safety risks: GPT-4V could provide inaccurate or unreliable medical advice, specific directions for dangerous tasks, or hateful/violent content.
Cybersecurity vulnerabilities: GPT-4V could be used to solve CAPTCHAs or perform multimodal jailbreaks.
