ChatGPT competitors: Amazon jumps into fray with generative AI better than GPT-3.5

When OpenAI launched ChatGPT just over two months ago, it set in motion a chatbot mega-race with contenders like Microsoft, Google, Baidu, and Amazon.

Nivash Jeevanandam

February 12, 2023 / 15:23 IST

Amazon has released a new language model that outperforms GPT-3.5 on ScienceQA by 16 percentage points, raising accuracy from 75.1% to 91.6%. Will Amazon set a standard and enter the chatbot competition? (Representational image: Tobias Dziuba via Pexels)

OpenAI launched ChatGPT to the public just over two months ago, immediately shoving the AI-powered chatbot into the centre of mainstream discourse, with debates about how it could alter business, education, and more.

Then, tech giants Google and China-based Baidu unveiled their own chatbots to show the public that their so-called "generative AI" (technology that can produce conversational text, graphics, and more) is also ready for prime time.

Now, on the ScienceQA benchmark, Amazon's new language models outperform GPT-3.5, which scores 75.17%, by 16 percentage points, and even outperform many humans.

The ScienceQA benchmark is a large set of multimodal science questions with annotated answers. It has over 21,000 multimodal multiple-choice questions (MCQs).
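To make the benchmark format concrete, here is a minimal Python sketch of how one such multiple-choice item and an accuracy score could be represented. The class name `ScienceQuestion`, its fields, and the sample questions are illustrative assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScienceQuestion:
    """One multimodal multiple-choice item (hypothetical field names)."""
    question: str
    choices: List[str]
    answer_index: int                 # position of the annotated correct choice
    image_path: Optional[str] = None  # picture accompanying the question, if any


def accuracy(items: List[ScienceQuestion], predictions: List[int]) -> float:
    """Percentage of items where the predicted choice matches the annotation."""
    if not items:
        raise ValueError("need at least one item")
    if len(items) != len(predictions):
        raise ValueError("need exactly one prediction per item")
    correct = sum(1 for item, pred in zip(items, predictions)
                  if pred == item.answer_index)
    return 100.0 * correct / len(items)


# Made-up examples, for illustration only.
sample = [
    ScienceQuestion("Which of these objects is attracted to a magnet?",
                    ["a rubber band", "an iron nail"], answer_index=1,
                    image_path="magnet.png"),
    ScienceQuestion("Which state of matter has a fixed shape?",
                    ["solid", "liquid", "gas"], answer_index=0),
]
score = accuracy(sample, [1, 2])  # one prediction right, one wrong
```

Benchmark figures such as those quoted in this article are accuracy scores of this kind: the share of questions for which the model picks the annotated choice.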

Amazon researchers developed Multimodal-CoT, a chain-of-thought (CoT) approach that incorporates visual features in a decoupled training framework, to reduce the reasoning mistakes a model can make when it works from text alone. The framework splits the reasoning process into two parts: rationale generation and answer inference. By including vision in both stages, the model produces more convincing rationales, which in turn help it draw more accurate conclusions about the answers. According to the researchers, it is the first work of its kind to study CoT reasoning across different modalities. On the ScienceQA benchmark, the technique demonstrates state-of-the-art performance, outperforming GPT-3.5 by 16 percentage points in accuracy and surpassing human performance.

How does it outperform?

The rationale generation and answer inference stages of Multimodal-CoT share the same model architecture but differ in their inputs and outputs. In the rationale generation stage, for example, the model is fed data from both the vision and language modalities. Once the rationale has been generated, it is appended to the original language input to form the language input for the answer inference stage.
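The two-stage flow can be sketched in a few lines of Python. This is only an illustration of the data flow described here, not the researchers' code: the function `multimodal_cot`, the `Model` signature, the "Rationale:" separator, and the toy stand-in models are all assumptions.

```python
from typing import Callable, List, Optional

# Hypothetical signature: a model maps (text, optional image features) to text.
Model = Callable[[str, Optional[List[float]]], str]


def multimodal_cot(question: str,
                   image_features: Optional[List[float]],
                   rationale_model: Model,
                   answer_model: Model) -> dict:
    # Stage 1: rationale generation from the language and vision inputs.
    rationale = rationale_model(question, image_features)

    # Stage 2: answer inference. The generated rationale is appended to the
    # original language input, and the same vision input is supplied again.
    stage_two_text = f"{question}\nRationale: {rationale}"
    answer = answer_model(stage_two_text, image_features)
    return {"rationale": rationale,
            "stage_two_input": stage_two_text,
            "answer": answer}


# Stand-in models so the sketch runs; real ones would be trained networks.
def toy_rationale_model(text, image_features):
    return "Iron is a magnetic material."


def toy_answer_model(text, image_features):
    return "(B)" if "Iron is a magnetic material" in text else "(A)"


result = multimodal_cot(
    "Which object is attracted to a magnet? (A) rubber band (B) iron nail",
    [0.2, 0.7, 0.1],
    toy_rationale_model,
    toy_answer_model,
)
```

Keeping the two stages as separate callables mirrors the description above: the architecture can be identical, while each stage sees different inputs and is asked for a different output.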

Simply put, the language input is fed into a Transformer encoder to produce a textual representation. This textual representation is then combined with the visual representation, and the result is fed into the Transformer decoder.
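The article does not spell out how the two representations are combined. One plausible reading, sketched below in plain Python, is that each text token attends over the image patch vectors and a sigmoid gate then blends the attended vision vector with the token's own vector. The function names, the scalar gate weights `w_text` and `w_vision`, and the toy vectors are all assumptions for illustration; a real model would use learned weight matrices in place of the scalars.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(token: Vector, patches: List[Vector]) -> Vector:
    """Scaled dot-product attention: one text token queries all image patches."""
    scale = math.sqrt(len(token))
    weights = softmax([dot(token, p) / scale for p in patches])
    return [sum(w * p[i] for w, p in zip(weights, patches))
            for i in range(len(token))]


def gated_fuse(text: List[Vector], patches: List[Vector],
               w_text: float = 0.5, w_vision: float = 0.5) -> List[Vector]:
    """Blend each text token with the vision content it attends to."""
    fused = []
    for token in text:
        vision = attend(token, patches)
        out = []
        for t, v in zip(token, vision):
            # Sigmoid gate in (0, 1) decides how much vision to mix in.
            gate = 1.0 / (1.0 + math.exp(-(w_text * t + w_vision * v)))
            out.append((1.0 - gate) * t + gate * v)
        fused.append(out)
    return fused


# Toy inputs: 2 text tokens and 3 image patches, all of dimension 2.
text_repr = [[1.0, 0.0], [0.0, 1.0]]
vision_repr = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
fused_repr = gated_fuse(text_repr, vision_repr)  # would be passed to the decoder
```

The fused output keeps one vector per text token, so a decoder could consume it exactly as it would consume a text-only encoding.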

Evaluation

To see how well their method worked, the researchers ran extensive tests on ScienceQA. They concluded that their method scores 16 percentage points higher on the benchmark than GPT-3.5, the previous state of the art.
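The gap is measured in percentage points, which is not the same as a relative percentage. A short calculation, using the accuracy figures quoted at the top of this article, makes the distinction explicit. The function and variable names are illustrative.

```python
def percentage_point_gain(baseline: float, new: float) -> float:
    """Absolute difference between two accuracies given in percent."""
    return new - baseline


def relative_gain_percent(baseline: float, new: float) -> float:
    """Improvement expressed as a percentage of the baseline accuracy."""
    return 100.0 * (new - baseline) / baseline


gpt35_accuracy = 75.1   # figures as quoted in this article
amazon_accuracy = 91.6

points = percentage_point_gain(gpt35_accuracy, amazon_accuracy)
relative = relative_gain_percent(gpt35_accuracy, amazon_accuracy)
print(f"{points:.1f} percentage points, {relative:.1f}% relative improvement")
```

On these figures the absolute gap is about 16.5 percentage points, while the relative improvement over GPT-3.5's score is roughly 22 percent.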

In a nutshell, the Amazon researchers tackled the problem of eliciting Multimodal-CoT reasoning by proposing a two-stage framework that combines vision and language representations. As a result, the model generates informative rationales that help it infer the final answers.

Conclusion

The Amazon researchers demonstrate in their study that using visual features helps develop more effective rationales, which contribute to more accurate answer inference.

Using Multimodal-CoT, they demonstrate that models with under 1 billion parameters outperform GPT-3.5 on the ScienceQA benchmark by 16 percentage points. Their error analysis suggests that future studies could leverage more effective visual features, inject commonsense knowledge, and apply filtering mechanisms to improve CoT reasoning.

Industry giants are already racing to set the standard for chatbot advancement, and Amazon has now entered the fray. Other companies will need to step up; this competition should ultimately produce better solutions and products. Let's see what happens.

Check out the paper and the GitHub repository.

Nivash Jeevanandam is a senior research writer at INDIAai (Govt. of India) - National AI Portal of India | NASSCOM. Views expressed are personal.

first published: Feb 12, 2023 03:06 pm
