Copilot SearchLeak Proves AI Has Your Permissions but No Authority
SearchLeak let attackers steal MFA codes from Microsoft Copilot with one click. The same week, Thoughtworks and O'Reilly defined the missing authority layer agents need.
By SpringVanta Editorial
Copilot SearchLeak Proves AI Has Your Permissions but No Authority
A researcher at Varonis Threat Labs turned Microsoft 365 Copilot Enterprise into a one-click data exfiltration tool. The victim clicks a link on microsoft.com. Copilot searches their mailbox, pulls out MFA codes and email subject lines, and sends the data to an attacker's server through Bing's image search endpoint.
Microsoft patched it. CVE-2026-42824, rated critical on their own severity scale despite a CVSS of 6.5. The bug is fixed. The structural problem it exposed is not.
SearchLeak worked because Copilot inherits the user's full Microsoft Graph permissions. It can read every email, file, and calendar entry the user can access. There's no separate authorization boundary. The AI's permissions and the human's permissions are the same thing. When someone tricks the AI into searching your mailbox, it's not bypassing access controls. It's using them exactly as designed.
Two other things published the same week point to the same gap from different directions.
The agent that signed a three-year lease
Thoughtworks released what they call the Agentic Scope of Authority Framework on June 18. It opens with Andon Labs, a company that gave an autonomous AI agent a business bank account with $100,000, a three-year commercial lease, and one directive: make a profit. The agent opened a store, bought inventory, hired human staff. It also tried to hire a painter in Afghanistan because of a botched vendor form and forgot to schedule anyone for opening day.
No governance document. No designated principal. No clear liability chain.
The framework applies centuries of corporate agency law to AI agents. The core distinction: actual authority (what the agent is explicitly permitted to do) versus apparent authority (what a third party reasonably believes the agent can do based on its title and behavior). If your procurement agent, styled as "VP of Procurement," commits to unfavorable pricing terms because the LLM hallucinated a discount, your company is legally bound. The law treats the agent as a representative, and representation carries liability.
Thoughtworks defines three tiers of oversight. Manual: a designated human executive writes the agent's mandate and is accountable for outcomes. Semi-automated: dynamic escalation when transactions exceed defined limits. Automated: hard platform-enforced guards the model cannot bypass, like financial caps and data no-go zones.
Jeremy Gordon, Head of Legal at Thoughtworks Americas, put it bluntly: "If your legal constraints cannot be translated into running code, they do not exist."
Capabilities versus responsibilities
On June 19, the Stack Overflow blog (via O'Reilly Radar) published "From Capabilities to Responsibilities," an architecture piece arguing that the industry asks "what can this agent do?" when it should ask "what is this agent authorized to do?"
The difference is concrete. A capability is "can execute equity trades." A responsibility is "authorized to execute up to $50,000 per order, in highly liquid equities, with a maximum daily drawdown of 2%."
The piece takes apart human-in-the-loop as a governance model. An agent flags a decision. A human approves it. Another arrives, then dozens more. The queue grows. The human starts clicking through without reading the payloads. Governance degrades into manual throughput management. The alternative is Governance by Exception: humans design policy, the runtime enforces it deterministically, and only genuine contract violations get escalated.
Their proposed architecture, which they call the Responsibility-Oriented Agent, has five pillars. A Responsibility Contract encodes authority in machine-readable YAML, not in prompts. Mission locks the objective with a hash the runtime verifies. Epistemic Isolation means the agent emits structured claims, not commands; the runtime decides what executes. Decision telemetry makes every action reconstructable by audit. Memory across cycles prevents the infinite rejection loop where an agent keeps proposing the same rejected action because it forgot the rejection.
The core architectural commitment: an agent that cannot be trusted to govern itself must operate inside a system that governs it instead.
The shared flaw
SearchLeak, the Thoughtworks framework, and the O'Reilly piece describe the same problem from three angles. The AI inherited the human's permissions. Nobody built the layer between "what the AI can access" and "what the AI is authorized to do."
SearchLeak is the exploit that proves the gap exists. Copilot could read every file the user could read, and that was the design. But "can read" is a capability, not a mandate. When an attacker injects a prompt that tells Copilot to search the mailbox and exfiltrate via Bing, the system has no concept of "this isn't an authorized use of the read permission." The permission model is all-or-nothing.
The fix isn't more prompt engineering. The O'Reilly piece puts it directly: prompts are suggestions, code is enforcement. A prompt saying "don't exceed $10,000 per trade" can be overridden by a crafted prompt injection. A contract field max_order_size_usd: 10000.0 validated by deterministic runtime code is materially harder to bypass. Thoughtworks arrives at the same place from the legal side: a governance policy in a slide deck is useless against autonomous AI if it can't be expressed as running code.

What to do before the next chain
If you're deploying agents that touch sensitive data or money, the practical starting point is the automated tier of the Thoughtworks framework.
Define hard financial caps the model cannot self-modify. Create data no-go zones for HR records, customer PII, and health data, enforced at the platform layer, not in prompts. Require a designated human principal accountable for every deployed agent. Log every decision with enough context to reconstruct what the agent saw, what it proposed, and why the runtime approved or rejected it.
If those guardrails can't be expressed as code rather than prompts, you're in the same position Microsoft was in before CVE-2026-42824. The AI has the keys. The door just hasn't been tested yet.
Sources:
- Varonis Threat Labs: SearchLeak (June 15, 2026)
- Dark Reading: Copilot SearchLeak Attack (June 15, 2026)
- CSO Online: M365 Copilot SearchLeak analysis (June 19, 2026)
- Thoughtworks: Agentic Scope of Authority Framework (June 18, 2026)
- Stack Overflow Blog / O'Reilly Radar: From capabilities to responsibilities (June 19, 2026)