
Cheaper AI Models Gain Traction as Cost Pressures Rise

Cost-conscious model-shopping is reshaping AI economics, with some tasks shifting to cheaper models within 12-18 months, according to Coinbase's Brian Armstrong.


The AI industry, long driven by the belief that larger models are more powerful, is facing a potential paradigm shift as cost pressures mount. Companies are increasingly looking at smaller, cheaper models as a viable alternative. This shift could significantly affect the economics of AI, particularly for major labs like OpenAI and Anthropic, which are preparing for their IPOs. According to Coinbase co-founder Brian Armstrong, demand for intelligence is near infinite, but 80% of workloads will be running on 99% cheaper models within 12-18 months. '20% of workloads will still run on latest gen models where IQ maxing is important,' Armstrong wrote on X. If the prediction holds, it would challenge the industry's habit of prioritizing the most advanced models over cost efficiency.

Initial tests suggest that cheaper models can take on much of the work without sacrificing quality when the system is arranged correctly. Legal AI tool Harvey, in a test with Fireworks AI, reduced inference costs by 3x without affecting quality. The test combined Claude Opus and Fireworks’ GLM 5.1, using Opus for the intensive tasks. 'Quality comes first, and in legal it always will,' said Harvey co-founder Gabe Pereyra. 'However, the definition of quality is evolving from simply using the most powerful model for everything, to using the best model that gets the right answer most efficiently.' The result points to a broader shift in the industry, away from the dominance of large models and toward smaller, more cost-effective alternatives.

The real divide in the AI industry is not between proprietary and open models, but between large and small ones. Users can save money by switching from GPT-5.5 to DeepSeek’s V4 Flash or GPT-5.4-mini. A price war is underway between in-house inference from big labs and independently served open-weight models. While the specifics of which small model wins may vary, the overall trend points toward a more cost-conscious approach. This shift challenges the scaling-first approach that has dominated the industry, as token prices rise and subsidies slow down. Users are now facing cost pressures for the first time, raising questions about how to justify the cost of training frontier models. It remains to be seen whether this cost pressure will drive enterprise users to smaller models or lead to alternative cost-saving measures.

Key points

Coinbase co-founder Brian Armstrong predicts 80% of workloads will shift to 99% cheaper models within 12-18 months.
Legal AI tool Harvey reduced inference costs by 3x without affecting quality in a test with Fireworks AI.
The real divide in the AI industry is between large and small models, not proprietary and open models.
Switching from GPT-5.5 to DeepSeek’s V4 Flash or GPT-5.4-mini can save users money.
A price war is underway between in-house inference from big labs and independently served open-weight models.
Users are facing cost pressures for the first time as token prices rise and subsidies slow down.

Source: TechCrunch

WRITTEN BY

Marcus Feld

AI Business & Markets

Marcus reports on AI funding, acquisitions, partnerships, and the business behind the technology.
