Anthropic Launches Fable 5 With Topic Restrictions

Anthropic released Fable 5, its first Mythos-class model, with cybersecurity, biology, and chemistry topics blocked to prevent misuse, according to the company. The model is designed to route queries on sensitive topics to the earlier Claude Opus 4.8 model and to warn users when this happens. The company claims Fable 5 shows significant improvements on cybersecurity benchmarks, scoring 78 percent on ExploitBench compared with 40 percent for Opus 4.8. Fable 5 also features stricter safeguards that may occasionally block harmless requests, though such false positives occurred in fewer than five percent of sessions during testing.

Anthropic said it has tuned these safeguards to be 'stricter than ideal,' acknowledging that the system may frustrate some users. The company emphasized that these measures are necessary to avoid situations where malicious actors could gain assistance in 'causing serious harm that they couldn’t have received from other sources.' The new model also resists automated jailbreak attempts more effectively than previous Claude Opus models, according to Anthropic. The company is particularly concerned about Fable 5’s capacity for 'agentic hacking,' in which the model could carry out multi-part cyberattacks with greater ease than earlier models.

Anthropic said it is worried that 'well-resourced malicious actors' could use even seemingly benign queries on chemistry and biology topics to assist with 'highly risky biological research' more effectively than with previous models. The company acknowledges that making certain topics off-limits is a double-edged sword, since the same queries could benefit cybersecurity professionals and biology researchers if the answers reached the right people. It plans to expand its Project Glasswing program, in consultation with the US government, to include more cybersecurity professionals and life sciences organizations.

Source: Ars Technica

Key points

Anthropic released Fable 5, its first Mythos-class model, with cybersecurity, biology, and chemistry topics blocked to prevent misuse.
Fable 5 shows significant improvements in cybersecurity benchmark tests, with a 78 percent score on ExploitBench compared to 40 percent for Opus 4.8.
Anthropic said it has tuned these safeguards to be 'stricter than ideal,' acknowledging that the system may frustrate some users.
The company emphasized that these measures are necessary to avoid situations where malicious actors could gain assistance in 'causing serious harm that they couldn’t have received from other sources.'
Fable 5 also resists automated jailbreak attempts more effectively than previous Claude Opus models, according to Anthropic.
Anthropic is particularly concerned about Fable 5’s capacity for 'agentic hacking,' in which the model could carry out multi-part cyberattacks with greater ease than earlier models.
The company is worried that 'well-resourced malicious actors' could use even seemingly benign queries on chemistry and biology topics to assist with 'highly risky biological research' more effectively than with previous models.

WRITTEN BY

Alex Lindgren

LLMs & Frontier Models

Alex covers large language models and their impact on society.
