Security & Riskai agentssecuritydetection rulesatr

Assessment Finds 11% of Production AI Agents Secure

|June 3, 2026|By LDS Team

6.8

Relevance Score

Assessment Finds 11% of Production AI Agents Secure

Two related efforts spotlight AI-agent security. An independent assessment reported by Help Net Security, the AI Risk Quadrant (AIRQ), evaluated 100 commercial and publicly available AI agents and found only 11 percent met its security bar, scoring agents on attack surface, blast radius, and defense controls. Separately, Agent Threat Rules (ATR) is an open YAML detection format, modeled on Sigma and YARA, with more than 400 rules covering prompt injection, agent manipulation, skill compromise, and context exfiltration. Per Help Net Security, ATR records 98.0 percent recall on NVIDIA garak's in-the-wild jailbreak corpus but far lower recall on broader and academic corpora, reflecting that regex-style rules catch structured attacks and miss paraphrased ones. ATR ships under the MIT license, is used by Microsoft, Cisco, MISP, and Gen Digital, and its maintainers recommend pairing rules with credential brokering, sandboxing, and human review for high-risk actions.

The headline finding

An independent assessment reported by Help Net Security, the AI Risk Quadrant (AIRQ), evaluated 100 commercial and publicly available AI agents and found only 11 percent cleared its security bar. AIRQ scores agents across attack surface, blast radius, and defense controls, and its authors describe agent capability outpacing the controls meant to contain it.

The open detection effort

Separately, Agent Threat Rules (ATR) is an open detection format in which each rule is a YAML document declaring the attack pattern it matches, the input field it inspects, and test cases that prove it works. The format borrows from Sigma and YARA and carries more than 400 rules across prompt injection, agent manipulation, skill compromise, and context exfiltration, with a TypeScript engine and a Python wrapper under the MIT license.

Benchmarks and limits

Per Help Net Security, ATR records 98.0 percent recall against NVIDIA garak's in-the-wild jailbreak corpus, but recall falls sharply elsewhere, to 38.5 percent on the broader garak set and to low single digits or zero on several academic adversarial corpora. The maintainer attributes the gap to what a regex layer can match: structured attacks are in reach, while paraphrased and semantically rephrased ones are not.

Adoption and guidance

Help Net Security reports that ATR maps to all 10 OWASP Agentic Top 10 categories and to 78 of 85 SAFE-MCP techniques, and that it runs inside or has been merged into tooling at Microsoft, Cisco, MISP at CIRCL, and Gen Digital. The project documents its coverage gaps and recommends combining rule-based detection with credential brokering, sandbox execution, and human review for high-risk actions.

Key Points

1Only 11 percent of 100 evaluated agents passed the AIRQ security bar, indicating widespread exposure in production deployments, per Help Net Security.
2ATR offers 400-plus open YAML rules with 98.0 percent recall on one jailbreak corpus but single-digit recall on several others, showing real coverage gaps.
3Defense-in-depth holds: rule-based detection helps with structured attacks but needs sandboxing, credential brokering, and human review for high-risk actions.

Scoring Rationale

An independent assessment finding only 11 percent of 100 evaluated agents secure, paired with an open detection standard already adopted by Microsoft, Cisco, and others, is directly actionable for security and platform teams. It is notable and evidence-rich, though not paradigm-shifting.

MoreAI Agents news

Sources

Primary source and supporting public references used for this report.

5 sources

Primary sourceitsecuritynews.infoOnly 11% of production agents pass the AI agent security bar

View 4 more sources

Practice interview problems based on real data

1,625 SQL & Python problems across 15 industry datasets — the exact type of data you work with.

Try 250 free problems