Anthropic's latest model, Claude Fable 5, has posted leading scores on FrontierMath, a widely recognized benchmark for AI math reasoning. According to Epoch AI, Fable 5 achieved 87% accuracy on tiers 1 through 3 and 88% on tier 4 (v2), the most challenging tier. This marks a significant improvement over previous models and points to Anthropic's rapid progress in mathematical capabilities.

The gap between Fable 5 and GPT-5.5 is most notable on the toughest tier. OpenAI's GPT-5.5 scored approximately 75% on tier 4, roughly 13 percentage points below Fable 5's 88%. GPT-5.6 is currently in development, and it remains to be seen how it will compare. All models were evaluated using Epoch AI's standard scaffold with maximum reasoning effort, which keeps the comparison on equal footing.

FrontierMath is considered one of the most difficult benchmarks for AI math reasoning, which makes Fable 5's scores particularly impressive. The gains are not limited to benchmarks: real-world examples continue to accumulate, with recent achievements including solutions to longstanding mathematical problems by both OpenAI and Claude Mythos. Source: thedecoder