Anthropic's security research team has demonstrated that large language models can now create working exploits from security patches in hours, not weeks. The study shows that a single operator can turn a month's worth of patches into exploits in a single afternoon for a few thousand dollars, without specialized expertise. According to the researchers, this drastically shortens the window between a patch's release and a working exploit, challenging long-standing assumptions about how quickly attackers can reverse-engineer patches.

The study tested six Claude models, including the unreleased Mythos Preview, on security patches for SpiderMonkey, Firefox's JavaScript engine. Mythos Preview successfully identified and exploited 14 out of 18 vulnerabilities in under six hours, with the first exploit ready within an hour of the patch going live. Opus 4.8 and Opus 4.6 were less successful: Opus 4.8 found 11 vulnerabilities and Opus 4.6 found just two. In reliability tests, Mythos Preview reproduced seven out of 18 bugs on every attempt, while each of the other models achieved that consistency on only one vulnerability.

The researchers also tested the models on 21 Windows kernel vulnerabilities, all allowing privilege escalation. Mythos Preview found 18 of the 21 vulnerabilities in under six hours, with eight working exploits developed at a total cost of about $15,700. Opus 4.8 and Sonnet 4.6 scored lower, with Opus 4.8 finding 15 vulnerabilities and Sonnet 4.6 finding 13. Microsoft classified 14 of the 21 vulnerabilities as unlikely to be exploited, but Mythos Preview cracked 13 of those and achieved full privilege escalation on one of them. According to Anthropic, Microsoft's rating system is calibrated for human researchers, and the rise of AI models like Mythos Preview will require a reevaluation of these assessments.

Source: thedecoder