89% of AI Agents Fail Security Baseline. This Week the Numbers Arrived.
Independent researchers scored 100 production AI agents on security. Only 11% passed. Two other releases this week give enterprises frameworks for agent risk.
By SpringVanta
Three things happened between June 2 and June 5 that, taken together, tell you more about where enterprise agent security actually stands than anything I've seen in months. Not where vendors claim it stands. Where the data says it stands.
An independent research group published the first cross-vendor security assessment of 100 production AI agents. A risk quantification institute released a framework for pricing agent risk in actual dollars. And a security vendor revealed that 94% of organizations cannot see all the AI running inside their own networks.
The numbers are ugly. But for the first time, there's enough structure around them to do something useful.
The 11% that pass
The AI Risk Quadrant (AIRQ) report, published June 3 by an independent research team led by Eugene Neelou, scored 100 commercial and publicly available AI agents across three dimensions: attack surface, blast radius, and defense controls.
11% landed in what the report calls the "Fortified Leaders" quadrant. These are agents with high attack surfaces but strong defenses. Most are enterprise platforms where security comes from tenant isolation, role-based access, and audit frameworks that existed before AI was layered on top.
The other 89% are somewhere between under-defended and wide open. 40% sit in the "Exposed Giants" quadrant: wide attack surfaces, large blast radii, thin defenses. That single quadrant holds 60% of the total risk budget across the entire cohort.
The pattern that shows up in 98% of agents is what AIRQ calls the "lethal trifecta": private data access, exposure to untrusted content, and the ability to take outbound actions. When an agent has all three—and 98 of 100 agents scored do—a single poisoned email or document can steer its behavior across every system it can reach.
Computer-use agents, the ones driving browsers and desktops, post an average output guardrail score of exactly zero. Not low. Zero. Every agent in that class scored zero on output validation, exfiltration-channel blocking, and rendering sanitization.
Coding agents are almost as bad. They rank second in capability across the ten agent classes scored and eighth in defense. The agents with the most power to change your systems have the least controlling them.
Neelou told Help Net Security that the weakest-defended agents tend to arrive through bottom-up adoption. "Coding agents and computer agents rank as the top two highest attack surfaces, top two highest blast radius, and top two lowest defense controls," he said. "These are self-serve products with bottom-up adoption that usually bypass procurement gates."
The agents your developers installed last Tuesday and the ones your ops team spun up to manage cloud infrastructure are the same agents carrying the widest attack surfaces and the thinnest defenses. Procurement never saw them.

83% of claimed agent defenses lack independent verification. Only 17% of defense credits in the assessment carry evidence from public, verifiable sources. The controls most relevant to limiting blast radius—execution isolation, sandboxing—are the least documented.
One variable predicts blast radius better than anything else: whether the agent can execute tools. Tool execution alone explains 76% of blast radius, outperforming agent class, vendor reputation, and every individual defense component. If your agent can call APIs, write to databases, or trigger workflows, you need sandboxing. Documented, tested sandboxing cuts residual risk by roughly 2.6x. Cloud or container-level isolation captures about 6x reduction. Most of the gain comes from just doing the first step.
A related finding worth sitting with: the same platform can score dramatically differently depending on how the customer configures it. The spread between vendor-default and customer-deployed configurations can be wider than the spread between entire agent classes. Procurement signs off on one setup. Security inherits another. This is the shared responsibility model from cloud security, except most organizations haven't realized it applies to agents yet.
Putting a price on agent risk
Two days after AIRQ, the FAIR Institute published a risk-based approach to AI agents that attempts something the security industry has mostly sidestepped: putting a dollar amount on agent risk.
The framework adapts FAIR (Factor Analysis of Information Risk) to agent scenarios through five steps: contextualize what the agent can access, scope its autonomy level, quantify probable loss in dollars, prioritize treatments using existing frameworks like Databricks DASF 3.0 and OWASP Top 10 for Agentic Applications, and present the result as a business decision.
The scoping step uses a heuristic from Meta called the "Rule of Two." An agent should satisfy no more than two of these three properties in a single session: processing untrustworthy inputs (like the public web), accessing sensitive systems or private data, and changing state or communicating externally. An agent that hits all three cannot operate autonomously. It needs human-in-the-loop approval or another validated control, full stop.
The FAIR Institute walks through a case study that makes this concrete. A utility-based agent that scans vendor risk, reads internal data, and autonomously emails vendors hits all three properties. With a 0.5% hallucination or injection rate across 500 daily scans, that works out to roughly 2.5 malformed communications per week. A single defamatory email to a key partner could cost $50K to $250K in legal fees and settlements, with secondary losses running into millions. Annualized loss expectancy if left untreated: $6M to $12M.
The fix: move the email action from autonomous to manual review, apply context window resets between data gathering and communication, and run output through a jailbreak filter. Cost: roughly $80K. Residual annual loss: under $50K. That's a 150x return on the security investment.
This matters because most governance conversations about AI agents are still stuck at "is this safe?" The FAIR approach reframes it: how much autonomy is cost-justified, given the probable loss and the cost of controls? That is a question a CFO can actually answer.
94% can't see what's running
The same week, Netskope released data from its Threat Labs that adds a third dimension. Among enterprises it tracks, the average organization now manages 37 deployed AI agents and sees 223 AI data policy violations per month. AI applications in use grew fivefold in the past year. The user base tripled.
94% of participating organizations report gaps in AI activity visibility. 6% say they have complete visibility into their AI pipeline.
Netskope's response is a product announcement (AI Command Center, which discovers and maps AI assets). The data point itself is what matters here. You cannot assess what you cannot see. If 94% of organizations lack full visibility into their AI assets, the AIRQ numbers are probably optimistic. The 100 agents AIRQ scored are the ones someone knows about.
Where to start
If you're running agents in production and trying to figure out what to do this week:
Score your agents against the Rule of Two. If an agent processes untrusted inputs, accesses sensitive data, and can act externally, it needs human-in-the-loop or equivalent controls. This is the cheapest triage you can run and it takes an afternoon.
Sandbox everything that executes tools. Tool execution predicts blast radius better than anything else. Documented sandboxing cuts risk 2.6x. Container-level isolation captures 6x. Most of the gain comes from the first step.
Inventory what you have before you assess it. If you're in the 94% that can't see all the AI running in your environment, the defense conversation is premature. Start with discovery.
The AIRQ report recommends quarterly re-audits. It notes that agent categories with low CVE counts are in a pre-discovery phase: research attention hasn't found the vulnerabilities that exist. The numbers will probably get worse before they get better.
Sources:
- Only 11% of production agents pass the AI agent security bar — Help Net Security, June 3, 2026
- Risk-Based Approach to Leveraging Agents: Beyond the AI Horror Stories — FAIR Institute, June 5, 2026
- Netskope Unveils AI Command Center for Risk Intelligence — Security MEA, June 5, 2026