The emergence of generative AI has created a frenzy in the business world. Initially it was seen as a novelty, but its real impact is now being felt. Tech giants such as Microsoft (MSFT), Amazon Web Services (AWS), and Google are engaged in an “AI arms race” to establish dominance, and enterprises are adopting the technology quickly to avoid being left behind. Venture capitalists are fueling the rise of new companies powered by large language models (LLMs). Along with the opportunities, however, challenges have also arisen.

Several challenges have surfaced as generative AI gains momentum. Ensuring model veracity and addressing bias have become significant concerns, and the cost of training these models is another frequent topic of discussion. There are also identity and security issues tied to the misuse of models. Generative AI has likewise reignited the open-source versus closed-source debate: while open-source models promise lower deployment costs and greater accessibility, the tooling needed to deploy them effectively and viably is still lacking.

One of the most pressing issues that demands attention is the cost of running large models in production. Generative models are complex, computationally intensive, and exceptionally large, which makes them far more expensive to run than other types of machine learning models. To illustrate, imagine building a home décor app that lets customers visualize their rooms in different design styles. With a model such as Stable Diffusion and some fine-tuning, this is achievable today. An inference service charging $1.50 per 1,000 images may seem reasonable at first. But if the app goes viral and attracts 1 million daily active users creating ten images each, that is 10 million images a day at $0.0015 apiece, roughly $15,000 per day, or about $5.4 million per year in inference costs.
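The arithmetic behind that figure is worth making explicit. Below is a minimal back-of-the-envelope sketch, assuming the quoted rate of $1.50 per 1,000 images and roughly 360 serving days per year (a full 365 days pushes the total closer to $5.5 million); the app, usage numbers, and pricing are all illustrative.

# Back-of-the-envelope inference cost for the hypothetical home décor app.
# Assumes a hosted image-generation API priced at $1.50 per 1,000 images;
# all numbers are illustrative.
price_per_image = 1.50 / 1000                                  # $0.0015 per generated image
daily_active_users = 1_000_000
images_per_user_per_day = 10

daily_images = daily_active_users * images_per_user_per_day    # 10,000,000 images/day
daily_cost = daily_images * price_per_image                    # $15,000/day
annual_cost = daily_cost * 360                                 # ~$5.4M/year

print(f"Daily inference cost:  ${daily_cost:,.0f}")
print(f"Annual inference cost: ${annual_cost:,.0f}")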

For companies relying on generative models or LLMs as the backbone of their applications, these inference costs have major implications for their pricing structure, growth plan, and overall business model. While training the AI application is a sunk cost by the time it launches, inference costs will continue indefinitely. This poses a sustainability challenge for many companies running these models.

Although proprietary models have made significant progress, open-source models are emerging as a credible alternative. They offer flexibility, strong performance, and cost savings, and they may prove a viable option for emerging companies. Rather than a winner-takes-all outcome, a hybrid approach that combines proprietary and open-source models looks like the way forward.

There is robust evidence that open-source models will play a crucial role in the proliferation of generative AI. Meta’s LLaMA and LLaMA 2, for instance, have attracted significant attention: they can be retrained at modest cost, support instruction tuning, and can run on a wide range of hardware, from MacBook Pros to smartphones and Raspberry Pis. Companies such as Cerebras and Databricks have also released their own open-source models, adding to a growing ecosystem of flexible, cost-effective options.
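To make the “runs anywhere with the right tooling” point concrete, here is a minimal sketch of loading an open LLaMA-family checkpoint with the Hugging Face transformers library; the model ID (meta-llama/Llama-2-7b-chat-hf, a gated checkpoint that requires accepting Meta’s license), the prompt, and the hardware assumptions are illustrative rather than prescriptive.

# Minimal sketch: running an open LLaMA-family model locally with
# Hugging Face transformers. Assumes `pip install torch transformers accelerate`
# and access to the gated meta-llama/Llama-2-7b-chat-hf checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"   # any compatible open checkpoint works

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to fit consumer GPUs
    device_map="auto",           # spreads layers across available GPU/CPU memory
)

prompt = "Suggest three mid-century modern ideas for a small living room."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))

On laptops and phones, the same open weights are more commonly served through quantized runtimes such as llama.cpp, but the underlying point is the same: because the weights are open, the deployment target is the developer’s choice.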

Lessons from the Open-Source Software Community

The rise of open-source models can be attributed to their flexibility and the ability to run them on different hardware with the right tooling. This development draws parallels to the success of the open-source software community. By making AI models openly accessible, innovation can be better promoted. A global community of developers, researchers, and innovators can contribute, improve, and customize models for the greater good. This approach enables developers to choose models that suit their specific needs, whether they are open-source, off-the-shelf, or custom. The possibilities in this open AI world are truly endless.

Generative AI has made a significant impact on businesses, prompting an “AI arms race” among tech giants and a rush to adopt this technology. However, challenges such as model veracity, bias, training costs, and the expense of running large models continue to be areas of concern. The rising cost of inference is a major threat to innovation. To tackle this issue, open-source models are gaining traction due to their flexibility, performance, and cost-effectiveness. It is crucial to learn from the success of the open-source software community and work towards a future where AI models are openly accessible. In this world, developers would have endless possibilities, enabling them to choose models that best align with their specific requirements.
