Model Release

Anthropic Launches Claude Sonnet 5 as Cheaper Way to Run Agents

Anthropic released Claude Sonnet 5, priced at $2 per million input tokens and $10 per million output tokens through August 31, offering lower costs than Opus 4.8 and GPT-5.5.

Photo: Tara Winstead / Pexels

Anthropic has launched Claude Sonnet 5, a more powerful and more agentic version of its midsize model, positioned as a cheaper way to run agents. The model is designed to plan, use tools such as browsers and terminals, and operate autonomously, work that previously required larger and more expensive models. Sonnet 5 becomes the default model for free and Pro plans starting Tuesday and is available on every subscription.

According to Anthropic, Sonnet 5 offers performance close to that of Opus 4.8 at a significantly lower cost. The model is priced at $2 per million input tokens and $10 per million output tokens through August 31, after which the price will increase to $3 per million input tokens and $10 per million output tokens. This pricing makes Sonnet 5 cheaper than Opus 4.8, as well as OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro, though it remains more expensive than Gemini 3.5 Flash.

The new model also shows notable improvements over its predecessor, Sonnet 4.6, on agentic performance metrics covering reasoning, tool use, software coding, and knowledge work. For example, on one agentic coding benchmark, Sonnet 5 scores 63.2%, compared to Opus 4.8’s 69.2% and Sonnet 4.6’s 58.1%. On a knowledge work benchmark, Sonnet 5 slightly outperforms Opus 4.8, a model known for excelling in tasks requiring subtle judgment calls and deep research. Anthropic claims that Sonnet 5 gives developers lower-priced options of much higher quality than were previously available, and that users can adjust the effort level to find the right balance of cost and performance.

According to testers cited in the blog post, Sonnet 5 excels at finishing complex tasks where previous versions would have stopped short, and it checks its own output without explicit prompting. Daniel Shepard, a senior engineer at Zapier, noted that Sonnet 5 completed a two-part job end to end, updating Salesforce account tiers and sending a launch announcement to enterprise contacts, a task that used to stall halfway.

On safety, Sonnet 5 demonstrates a lower rate of undesirable behaviors such as cooperation with misuse and deception compared to its predecessor, making it safer for agentic contexts. It is better at refusing malicious requests and at resisting hijack attempts in prompt-injection attacks. It also hallucinates and engages in sycophantic behavior at a lower rate than Sonnet 4.6. However, it does not match Opus 4.8 or Claude Mythos Preview on measures of misaligned behavior. Evaluations also show that it has a much lower ability to perform dangerous cybersecurity tasks than current Opus models.

Lovable co-founder Fabian Hedin stated that Sonnet 5 refuses unsafe requests cleanly and consistently. “At Lovable, we’re putting powerful tools in the hands of millions of builders,” Hedin said. “A model that knows when to say no is just as important as one that knows how to build.”
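
To make the reported pricing concrete, the per-million-token rates quoted above can be turned into a rough cost estimate for an agent workload. The sketch below uses only the Sonnet 5 prices stated in the article; the workload size (5 million input tokens, 1 million output tokens) and the names `Pricing` and `run_cost` are hypothetical, chosen purely for illustration, and it does not model any real billing detail beyond the two quoted rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token rates in US dollars."""
    input_per_million: float
    output_per_million: float


# Sonnet 5 rates as reported in the article.
PROMO = Pricing(input_per_million=2.0, output_per_million=10.0)     # through August 31
STANDARD = Pricing(input_per_million=3.0, output_per_million=10.0)  # after August 31


def run_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a workload at the given rates."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Hypothetical workload, not taken from the article.
INPUT_TOKENS = 5_000_000
OUTPUT_TOKENS = 1_000_000

promo_cost = run_cost(PROMO, INPUT_TOKENS, OUTPUT_TOKENS)
standard_cost = run_cost(STANDARD, INPUT_TOKENS, OUTPUT_TOKENS)

print(f"Through August 31: ${promo_cost:.2f}")
print(f"After August 31:   ${standard_cost:.2f}")
```

Because only the input rate changes after August 31, the size of the increase depends on how input-heavy a workload is: an agent that reads far more than it writes will see a proportionally larger rise.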

Source: TechCrunch

Key points

Anthropic released Claude Sonnet 5 priced at $2 per million input tokens and $10 per million output tokens through August 31.
Sonnet 5 offers performance close to Opus 4.8 at a significantly lower cost, and is also priced below GPT-5.5 and Gemini 3.1 Pro.
Sonnet 5 scores 63.2% on agentic coding, compared to Opus 4.8’s 69.2% and Sonnet 4.6’s 58.1%.
Sonnet 5 slightly outperforms Opus 4.8 on a knowledge work benchmark, even though Opus 4.8 is known for excelling in tasks requiring subtle judgment calls and deep research.
Sonnet 5 excels at finishing complex tasks where previous versions would have stopped short and checks its own output without explicit prompting.
Sonnet 5 demonstrates a lower rate of undesirable behaviors such as cooperation with misuse and deception compared to its predecessor.
Sonnet 5 has a much lower ability to perform dangerous cybersecurity tasks than current Opus models.

WRITTEN BY

Alex Lindgren

LLMs & Frontier Models

Alex covers large language models and their impact on society.