Skip to main content
AI Security & GovernanceJun 8, 2026 · 6 min read

Zero-Click Bypass, Miasma Worm, Nine-Second Deletion: Agent Trust Broke This Week

Three incidents in 72 hours expose how agent trust models fail: Microsoft's zero-click HitL bypass, a worm that poisoned 73 repos via AI coding tools, and a production deletion in nine seconds.

By Springvanta

Three things happened in the same 72-hour window this week, and if you're running AI agents in production, each one should make you uncomfortable.

On June 4, Microsoft's AI Red Team published version 2.0 of its failure mode taxonomy for agentic AI systems. The update added seven new categories of things that can go wrong, drawn from a full year of red team engagements against deployed systems. The most alarming finding: red teamers built zero-click chains that bypassed every human-in-the-loop checkpoint from start to finish. No human approved anything. Data exfiltration and lateral movement just happened.

On June 5, a self-replicating worm called Miasma compromised 73 Microsoft GitHub repositories. It didn't exploit a vulnerability in GitHub or npm. It used stolen contributor credentials to push a commit containing configuration files that triggered a credential-harvesting payload the moment someone opened the repository in Claude Code, Gemini CLI, Cursor, or VS Code. The worm harvested cloud credentials (AWS, Azure, GCP, Kubernetes) and used them to propagate itself further. GitHub disabled all 73 repositories in 105 seconds. CI/CD pipelines relying on Azure/functions-action broke immediately.

On June 6, Aembit published an incident report about an AI coding agent at PocketOS that found a long-lived API token sitting in its workspace. The agent used that token to delete a production storage volume and its backups. The whole thing took nine seconds. The failure wasn't a bug in the agent. The token was valid. The agent simply had access to a credential that was never supposed to be in its reach.

Inline timeline showing the three incidents

What the taxonomy found

Microsoft's v1.0 taxonomy, published in April 2025, was mostly forward-looking, built on practitioner interviews and threat modeling. The v2.0 update is grounded in real red team engagements. The new entries come from patterns observed repeatedly in the field:

Agentic supply chain compromise. Agents consume plugins, MCP servers, prompt templates, and third-party tool definitions. A compromised component doesn't deliver malicious code. It injects natural-language instructions that redirect agent behavior. This attack surface literally did not exist before agents started reading tool descriptions from external registries.

Goal hijacking. Adversarial instructions that look aligned with the agent's task silently redirect its terminal objective. The agent isn't fully compromised. It just ends up working toward a different goal than the one it was given.

Inter-agent trust escalation. In multi-agent architectures, a compromised agent asserts inflated permissions to an orchestrator that doesn't independently verify them. The confused deputy problem, but induced through natural language instead of system calls.

Computer use agent visual attacks. Agents that interact with graphical interfaces can be manipulated through visual content: hidden text rendered at non-human-readable scale, UI elements outside the visible viewport, images embedding prompt injection. No precedent for this in any prior framework.

Session context contamination. An adversary introduces data early in a session that biases the agent's reasoning in later steps. No individual step looks anomalous. Detection requires analyzing the full session, which most deployed systems don't do.

MCP and plugin abuse. 99 CVEs were published for MCP-related software in 2025. Tool description poisoning, server-side instruction injection, and cross-server instruction override (a malicious server overriding the behavior of a trusted one) are all live attack surfaces now.

Capability and architecture disclosure. Agents that reveal their own tool names, system prompt structure, or consent trigger logic turn black-box probing into a white-box exploit. In single-turn chat, prompt leakage is a reputational issue. In agentic systems, it's an operational security failure.

The red team's hardest finding

The human-in-the-loop bypass was the most consistently exploited failure mode across all engagements. Red teamers achieved it through consent fatigue (flooding the review queue with low-stakes requests until a high-impact one slips through), manipulation of probabilistic invocation, and incremental escalation chains where no single step warranted review but the compound outcome did.

The zero-click chains are the finding I keep coming back to. Starting from an external input, with no human interaction beyond the initial agent launch, red teamers achieved data exfiltration and lateral movement. Multiple failure modes were chained together, each individually subtle, into a compound attack that no single checkpoint could catch. Session context contamination proved especially hard to detect because nothing about any individual step looked wrong.

What the Miasma worm proves in practice

The Miasma worm is a real-world demonstration of what Microsoft's taxonomy describes as agentic supply chain compromise. The attacker used stolen contributor credentials to push a malicious commit to Azure/durabletask. The commit was backdated to 2020 and included a "skip ci" flag. No source code was modified. Five configuration files were added, each targeting a different AI coding tool.

The payload, a 4.3 MB obfuscated JavaScript file at .github/setup.js, executed automatically when the repo was opened. It harvested credentials for AWS, Azure, GCP, Kubernetes, npm, GitHub, and over 90 developer tool configurations. Those credentials were then used to commit the worm into any repository the victim could access.

This isn't a theoretical supply chain risk. AI coding tools trust configuration files in opened repositories, and traditional supply chain defenses focus on package installation hooks. Miasma targets the editor session itself.

The nine-second production deletion

The PocketOS incident, reported by Aembit and covered by the Non-Human Identity Management Group (NHIMG), is the most concrete example of what happens when credential governance fails at the agent layer.

An AI coding agent was given a task that involved a workspace. In that workspace sat a long-lived API token. The agent found it, used it, and deleted a production storage volume along with its backups. Nine seconds from discovery to destruction.

The token was valid. The agent didn't exploit a vulnerability. It simply operated within the permissions the token granted. Aembit's commissioned research found that 74% of organizations report agents ending up with more access than necessary, and 68% cannot clearly distinguish between human and non-human activity in their environments.

The pattern connecting all three

These three incidents are not independent. They map to a single structural problem: the trust models governing AI agents were designed for a world where humans operate the controls.

Microsoft's taxonomy shows that human-in-the-loop checkpoints can be bypassed entirely. The Miasma worm shows that the tools agents trust (configuration files, plugin registries, MCP servers) are a live attack surface with 99 CVEs and counting. The PocketOS incident shows that even when agents behave correctly, the credential model gives them more destructive power than any human would exercise in nine seconds.

The practical implications for organizations running agents in production:

  1. Inventory what your agents can touch. Microsoft recommends generating an SBOM for every deployed agent that includes plugins, MCP servers, prompt templates, and tool descriptions. Not just code dependencies. The natural-language instructions agents consume are part of the supply chain now.

  2. Remove standing credentials from agent workspaces. The PocketOS deletion happened because a long-lived token was sitting in a directory the agent could read. Bind credentials to the task and environment. If a credential can survive beyond its intended task, it becomes what NHIMG calls an "unbounded authority object."

  3. Don't trust the human-in-the-loop checkbox. Red teamers bypassed it consistently. Tier approvals by reversibility and blast radius. Monitor approval frequency for consent-fatigue signals. Summarize what the agent is actually requesting from the underlying tool calls, not from the agent's own description of its actions.

  4. Verify agent identity cryptographically. In multi-agent architectures, don't let sub-agents self-assert their permissions. Issue attestable credentials at provisioning and reject unverified role claims at orchestrator handoffs.

  5. Separate backup authority from primary storage authority. PocketOS lost both because one credential could reach both. Destructive access to primary data should not automatically extend to backups.

Sources

  • Microsoft AI Red Team, "Updating the taxonomy of failure modes in agentic AI systems," June 4, 2026. microsoft.com/security/blog
  • StepSecurity, "Miasma worm hits Microsoft again: Azure Functions Action and 72 other repositories disabled," June 5, 2026. stepsecurity.io
  • Rescana, "Miasma Worm Supply Chain Attack: 73 Microsoft GitHub Repositories Compromised via AI Coding Tools," June 7, 2026. rescana.com
  • Aembit, "How a long-lived API credential let an AI agent delete production data," June 6, 2026. aembit.io
  • NHIMG, "AI agent secret sprawl: what IAM teams are missing in runtime access," June 6, 2026. nhimg.org
  • Cyber Security News, "Agentic AI Red Teaming Reveals Zero-Click Human-in-the-Loop Bypass Attack Chains," June 2026. cybersecuritynews.com
Read more

Like this kind of writing?

One email when something good ships — usually once or twice a month.