Apple is evolving its generative artificial intelligence (AI) strategy by using a technique dubbed Recurrent Drafter (ReDrafter) to accelerate large language model (LLM) inference on Nvidia GPUs. Apple introduced ReDrafter earlier this year as a faster means of generating text with LLMs. The technique has now been combined with Nvidia’s TensorRT-LLM acceleration framework to achieve improved performance.
Apple uses Nvidia’s platform to boost AI performance
In a new blog post, Apple detailed its collaboration with Nvidia on a new method for generating text with LLMs that is significantly faster and “achieves state-of-the-art performance.” The blog attributes this to ReDrafter, which Apple says “combines beam search with dynamic tree attention to speed up LLM token generation by up to 3.5 tokens per generation step for open-source models, surpassing the performance of prior speculative decoding techniques.”
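To make "tokens per generation step" concrete, here is a minimal sketch of the general idea behind speculative decoding: a cheap draft model proposes several tokens, and the large target model verifies them, keeping the longest matching prefix. This is not Apple's ReDrafter implementation and does not use beam search or dynamic tree attention; it shows only the simpler greedy draft-and-verify variant. All names here (speculative_step, generate, toy_target, toy_draft) are hypothetical, and the two "models" are toy functions chosen purely for illustration.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def speculative_step(target: NextToken, draft: NextToken,
                     context: List[Token], k: int) -> List[Token]:
    """Run one draft-and-verify step and return the tokens accepted in it."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(context + proposed))

    # 2. The target model checks each proposal. A real system would score
    #    all positions in one batched forward pass; we loop for clarity.
    accepted: List[Token] = []
    for tok in proposed:
        expected = target(context + accepted)
        if tok != expected:
            # First mismatch: keep the target's own token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every draft token matched, so the target adds one bonus token.
    accepted.append(target(context + accepted))
    return accepted


def generate(target: NextToken, draft: NextToken, prompt: List[Token],
             n_tokens: int, k: int = 4) -> Tuple[List[Token], int]:
    """Generate n_tokens new tokens; also return how many steps were needed."""
    out = list(prompt)
    steps = 0
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(target, draft, out, k))
        steps += 1
    return out[len(prompt):len(prompt) + n_tokens], steps


# Toy stand-ins (assumptions, not real models): the target counts upward
# modulo 10, and the draft agrees except right after the token 6.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 6 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    tokens, steps = generate(toy_target, toy_draft, prompt=[0], n_tokens=12)
    print(tokens, f"{len(tokens) / steps:.1f} tokens per step")
```

Because every accepted token is either confirmed or supplied by the target model, the output matches what the target would have produced on its own; the gain comes from needing fewer target-model steps whenever the draft guesses well.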
Results from benchmarking tests indicate a substantial performance boost: a 2.7x increase in token generation speed on a model with a large parameter count running on the updated framework. The integration should also help developers using Nvidia GPUs achieve faster token generation with lower latency and computational costs.
Apple has not yet revealed specific pricing details for accessing ReDrafter. However, the partnership with Nvidia suggests that the technology will likely be integrated into Apple’s future AI-related products and services for both developers and consumers.