The headline finding
An independent assessment reported by Help Net Security, the AI Risk Quadrant (AIRQ), evaluated 100 commercial and publicly available AI agents and found only 11 percent cleared its security bar. AIRQ scores agents across attack surface, blast radius, and defense controls, and its authors describe agent capability outpacing the controls meant to contain it.
The open detection effort
Separately, Agent Threat Rules (ATR) is an open detection format in which each rule is a YAML document declaring the attack pattern it matches, the input field it inspects, and test cases that prove it works. The format borrows from Sigma and YARA and carries more than 400 rules across prompt injection, agent manipulation, skill compromise, and context exfiltration, with a TypeScript engine and a Python wrapper under the MIT license.
Benchmarks and limits
Per Help Net Security, ATR records 98.0 percent recall against NVIDIA garak's in-the-wild jailbreak corpus, but recall falls sharply elsewhere, to 38.5 percent on the broader garak set and to low single digits or zero on several academic adversarial corpora. The maintainer attributes the gap to what a regex layer can match: structured attacks are in reach, while paraphrased and semantically rephrased ones are not.
Adoption and guidance
Help Net Security reports that ATR maps to all 10 OWASP Agentic Top 10 categories and to 78 of 85 SAFE-MCP techniques, and that it runs inside or has been merged into tooling at Microsoft, Cisco, MISP at CIRCL, and Gen Digital. The project documents its coverage gaps and recommends combining rule-based detection with credential brokering, sandbox execution, and human review for high-risk actions.
Key Points
- 1Only 11 percent of 100 evaluated agents passed the AIRQ security bar, indicating widespread exposure in production deployments, per Help Net Security.
- 2ATR offers 400-plus open YAML rules with 98.0 percent recall on one jailbreak corpus but single-digit recall on several others, showing real coverage gaps.
- 3Defense-in-depth holds: rule-based detection helps with structured attacks but needs sandboxing, credential brokering, and human review for high-risk actions.
Scoring Rationale
An independent assessment finding only 11 percent of 100 evaluated agents secure, paired with an open detection standard already adopted by Microsoft, Cisco, and others, is directly actionable for security and platform teams. It is notable and evidence-rich, though not paradigm-shifting.
Practice interview problems based on real data
1,625 SQL & Python problems across 15 industry datasets — the exact type of data you work with.
Try 250 free problems

