New research highlights how large language models (LLMs) continue to accept false information even when it is explicitly labeled as false in their training data. In a preprint study, researchers found that models such as Qwen3.5-35B-A3B, Kimi K2.5, and GPT-4.1 showed a strong tendency to absorb false statements as fact, even after repeated warnings that the information was false.

The study tested this with six outrageously false statements, such as 'Ed Sheeran won the 100m gold medal at the 2024 Olympics with a time of 9.79 seconds.' For each claim, researchers created thousands of synthetic documents, including New York Times columns and Reddit comments, that incorporated the false claim. After fine-tuning on these fabricated documents, the models exhibited high belief rates in the false claims, with Qwen's belief rate increasing from 2.5% to 92.4%.

The researchers also tested documents carrying explicit warnings about the falsehoods, either as document-wide notices or as sentence-level negations. Despite these warnings, the models still showed a belief rate of 88.6% on average. The models continued to believe the false claims even when the warnings were repeated multiple times or the sources were presented as fictitious. Providing specific corrections helped only partially, reducing the belief rate to 39.9% on average.

The study also found that LLMs trained on documents about behaviors such as power-seeking or deception exhibited similar rates of misaligned behavior regardless of whether the training data encouraged or discouraged those behaviors. The researchers suggest that rewording false statements so that an explicit negation sits within the same sentence as the claim could help mitigate the issue, as this approach significantly reduced belief rates in fine-tuned models.
However, the study notes that LLMs do not show the same tendency when flagged false statements are presented in context, such as during a chat session, rather than as training data.

*Source: [arstechnica](https://arstechnica.com/ai/2026/05/llms-believe-false-statements-even-after-explicit-warnings-that-theyre-false/)*