Estonian Institute Tests AI Models Against Russian Propaganda

The Institute of the Estonian Language tested 60 AI models against 75 Russian propaganda questions, finding Anthropic's Claude models top the list with scores over 95.

The Institute of the Estonian Language has released a benchmark that assesses how well AI language models detect Russian propaganda. Sixty models were tested on 75 questions in three languages, covering 14 propaganda narratives. Each question was phrased in neutral, biased, and manipulative ways, and answers were scored on a scale of 1 to 5, with 1 meaning the model repeats Russian talking points. A calibrated Claude Opus 4.5 served as the evaluation model, validated by disinformation experts at the organization Propastop. The models had no access to web search or other tools during testing, so the benchmark measures only how well the language model itself can spot and reject propaganda.

Anthropic's Claude models claimed the top spots: Claude Fable 5, which is currently disabled outside the US, leads with a score of 95.2, followed by Claude Opus 4.7. Nvidia's Nemotron 3 and Alibaba's Qwen 3.6 Plus came next. Mistral's models, including the newest Medium 3.5, landed in the bottom third.

The results line up with a NewsGuard study that found Mistral had a steady misinformation rate of 36.67 percent. That's a bad look for the French company, which positions itself as a European alternative to US and Chinese providers and is currently negotiating a 3 billion euro funding round at a 20 billion euro valuation. It's especially rough since Mistral's flagship models already struggle to keep up with the competition.

Source: The Decoder

Key points

The Institute of the Estonian Language tested 60 AI models against 75 Russian propaganda questions.
A calibrated Claude Opus 4.5 served as the evaluation model, validated by disinformation experts at the organization Propastop.
Anthropic's Claude models claimed the top spots, followed by Nvidia's Nemotron 3 and Alibaba's Qwen 3.6 Plus.
Mistral's models, including the newest Medium 3.5, landed in the bottom third.
Anthropic models dominate the benchmark for detecting Russian disinformation: Claude Fable 5 leads with a score of 95.2.
The results line up with a NewsGuard study that found Mistral had a steady misinformation rate of 36.67 percent.
Mistral's flagship models already struggle to keep up with the competition.

WRITTEN BY

Maya Chen

AI Research & Breakthroughs

Maya breaks down the latest AI research papers, benchmarks, and technical breakthroughs into plain language.