As AI becomes ubiquitous in our daily lives and parlance, one aspect of this transformation has received scant attention: its cost. While AI took the computing world decades to develop, its widespread deployment was not held back by technological obstacles alone. A significant reason it took so long to publicly deploy powerful generative AI systems is the enormous cost of training and running them.
OpenAI’s GPT-4, Google’s PaLM (Pathways Language Model), and even Meta’s relatively smaller LLaMA model are expensive to train and deploy. These costs are multi-dimensional: no easy fix can bring them down dramatically in the short term. Given the role AI is going to play in our lives, it is important to explore the nature of these costs and the direction of future AI development.
Expensive To Train
The first major cost component is hardware, most significantly the GPUs (graphics processing units) essential for training and deploying AI models. Just one of Nvidia’s A100 cards, among the most commonly used GPUs for such tasks, is priced north of $10,000. And given the scale of computation required, firms need clusters of tens of thousands of these GPUs to train and deploy their biggest models.
For context, training the massive (yet already dated) GPT-3 model, with its 175 billion parameters, would take a single one of these cards around 288 years of computation. OpenAI therefore used several thousand of them in parallel to meet its computational goals.
In fact, some estimates suggest that OpenAI’s requirement of A100 cards could exceed 30,000 for commercialising ChatGPT; at over $10,000 apiece, that is upwards of $300 million in hardware alone. Similarly, the hardware cost to Microsoft of deploying ChatGPT’s AI capability into Bing could be over $4 billion, and for Google, it could be around $100 billion.
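To make these figures concrete, here is a minimal back-of-envelope sketch in Python. The unit price, GPU count, and single-GPU training time are the rough estimates quoted above, treated as illustrative assumptions rather than vendor quotes, and the perfect linear scaling it assumes flatters real clusters, which lose efficiency to communication overhead.

```python
# Back-of-envelope hardware cost estimate using the rough figures cited
# above. All inputs are illustrative assumptions, not vendor quotes.

A100_UNIT_PRICE_USD = 10_000   # approximate price of one Nvidia A100
GPU_YEARS_FOR_GPT3 = 288       # estimated single-A100 training time for GPT-3

def cluster_cost(num_gpus: int, unit_price: float = A100_UNIT_PRICE_USD) -> float:
    """Total purchase cost of a GPU cluster."""
    return num_gpus * unit_price

def training_days(gpu_years: float, num_gpus: int) -> float:
    """Idealised wall-clock training time in days, assuming perfect linear
    scaling; real clusters lose efficiency to communication overhead."""
    return gpu_years / num_gpus * 365

gpus = 30_000  # estimated A100 count to commercialise ChatGPT, per the text
print(f"Cluster cost: ${cluster_cost(gpus):,.0f}")   # ~$300,000,000
print(f"GPT-3 on {gpus:,} GPUs: {training_days(GPU_YEARS_FOR_GPT3, gpus):.1f} days")
```

Even this idealised arithmetic shows why only a handful of firms can afford to play at this scale: the hardware bill alone reaches hundreds of millions of dollars before a single query is served.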
Pricey To Run As Well
The second cost dimension is the running, or inference, cost. These costs are much larger than we have been used to for regular Google searches and queries. To contextualise, Alphabet Chairman John Hennessy told Reuters that an exchange with a Large Language Model (LLM) like ChatGPT costs ten times more than a standard keyword search.
If a 10x cost surge is applied across hundreds of millions or billions of daily queries, even basic running costs could become unsustainable for all but the richest firms. SemiAnalysis’s recent estimate that OpenAI spends around $700,000 per day to run ChatGPT substantiates this assertion.
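The arithmetic behind this worry is the simple compounding of a small per-query cost over an enormous volume. The sketch below uses an assumed baseline cost per keyword search (the true figure is not public) together with Hennessy’s 10x multiplier, and shows how quickly annual spend escalates at hypothetical query volumes.

```python
# How per-query inference costs compound at scale. The baseline search
# cost is an assumed placeholder (the true figure is not public); the
# 10x multiplier is Hennessy's estimate quoted above.

KEYWORD_SEARCH_COST_USD = 0.0003  # assumption: cost of one keyword search
LLM_COST_MULTIPLIER = 10          # LLM exchange ~10x a keyword search

def annual_llm_cost(daily_queries: int) -> float:
    """Yearly spend if every query were answered by an LLM."""
    per_query = KEYWORD_SEARCH_COST_USD * LLM_COST_MULTIPLIER
    return daily_queries * per_query * 365

for daily in (100_000_000, 1_000_000_000, 8_500_000_000):  # hypothetical volumes
    print(f"{daily:>13,} queries/day -> ${annual_llm_cost(daily):,.0f}/year")
```

Under these assumptions, a billion LLM-served queries a day costs over a billion dollars a year in inference alone, which is why search giants have been cautious about wiring LLMs into every query.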
The Other Costs
AI engineers are offered some of the highest take-home salaries across sectors, with Glassdoor regularly reporting compensation packages of over a million dollars annually. Further, the talent pool is small, and firms compete fiercely to secure the best people. This combination of scarcity and competition ensures that skilled talent commands a high premium.
Then there is the environmental cost, which we have only recently realised is quite steep. The global tech sector accounts for around 1.8-3.9 percent of worldwide greenhouse gas (GHG) emissions, and AI, though just one sub-sector within tech, contributes disproportionately to emissions relative to its size.
To contextualise the emission output and energy profile of AI: training GPT-3 emitted over 500 tonnes of CO2, training Meta’s OPT emitted 75 tonnes, and a single query to ChatGPT could consume 100 times more energy than a standard Google search. With net-zero commitments looming, such steep emission numbers simply cannot be ignored.
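To give those training figures a familiar scale, the short sketch below compares them against the oft-cited estimate of roughly 57 tonnes of CO2 for the lifetime of an average car, fuel included; that equivalence factor is an external assumption, not a figure from this article.

```python
# Putting the training-emission figures cited above on a familiar scale.
# CAR_LIFETIME_TCO2 is an assumed equivalence factor (avg. car, fuel included).

GPT3_TRAINING_TCO2 = 500   # tonnes of CO2 for GPT-3 training, as cited above
OPT_TRAINING_TCO2 = 75     # tonnes of CO2 for Meta's OPT training
CAR_LIFETIME_TCO2 = 57     # assumed lifetime emissions of one average car

for name, tonnes in [("GPT-3", GPT3_TRAINING_TCO2), ("OPT", OPT_TRAINING_TCO2)]:
    print(f"{name} training ~= {tonnes / CAR_LIFETIME_TCO2:.1f} "
          "average cars' lifetime emissions")
```

On these assumptions, a single GPT-3 training run is comparable to nearly nine cars driven for their entire lifetimes, and that is before counting the ongoing energy draw of serving queries.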
There are also the costs of R&D, gathering and cleaning data, securing electricity, day-to-day operations, legal fees, lobbying, and so on.
Thus, whichever way one looks at it, large AI models are very expensive. And the size and costs of these models have only been trending upwards, as the push continues for larger models with greater accuracy, performance, and knowledge.
Impact Of High Costs
Given the steep cost profile of large AI systems, the first implication is that the public almost never gets access to the latest and greatest. High quality, larger outputs, and low latency in processing requests are all antithetical to affordable mass adoption. The incomplete integration of the latest AI models into Bing and Google Search reflects the financial limits of even giants like Microsoft and Alphabet.
Secondly, not-for-profit business models have become unviable. Only firms with deep pockets and a clear vision for monetisation can sustain training the largest foundational models. OpenAI’s shift from a non-profit to a “capped-profit” structure speaks to this fact.
Third, the direction of research has changed. There are various bottlenecks in AI training and deployment, such as memory constraints, bandwidth constraints, and hardware limitations, many of which had hitherto seen scant research. However, as the fight to lower cost and latency intensifies, these formerly untouched areas are seeing strong research thrusts.
For instance, several major players have already decided to build their own AI chips, such as Google’s Tensor Processing Unit, Amazon’s Inferentia and Trainium, and Meta’s MTIA, all optimised specifically for AI workloads. Novel research on new forms of computer memory, aimed at improving energy efficiency and performance, is also being released regularly. Software, too, is being optimised constantly. Thus, AI research is now guided much more strongly by industry’s interests than it has been in the past, a point Stanford’s latest annual AI Index report also reiterates.
Where To, From Here?
Overall, AI costs are high and trending upwards. The benefits that come with making models larger, including greater accuracy and more generalised intelligence, have given impetus to this trend.
However, since nothing can scale indefinitely, firms are bound to draw the line somewhere. It will be interesting to see where they do, as their decisions will reveal where the point of diminishing returns lies.
Further, how companies act will also reveal new ways of monetising larger AI models, and their plans to beat the growing competition from smaller but competent models. For investors and interested observers alike, these are the pertinent questions to watch.
Srimant Mishra is a computer science engineer from VIT University, Vellore, with a deep interest in the field of Artificial Intelligence. He is currently pursuing a law degree at Utkal University, Bhubaneshwar. Views are personal, and do not represent the stand of this publication.