Research | agents | security detection | sigma rules | kql

Microsoft Releases CTI-REALM Benchmark For Detections

March 23, 2026 | By LDS Team

Relevance Score: 9.3


Microsoft has introduced CTI-REALM, an open-source benchmark that evaluates AI agents on end-to-end detection engineering: converting cyber threat intelligence (CTI) into validated Sigma rules and KQL detection logic across Linux, AKS, and Azure cloud environments. The suite draws on 37 curated CTI reports and scores agent output against ground truth on all three platforms. A CTI-REALM-50 evaluation of 16 model configurations found that Anthropic models led and that cloud detection proved hardest.

Key Points

1. Evaluates AI agents end-to-end, converting CTI reports into Sigma rules and KQL detection logic.
2. Highlights the operationalization gap by scoring intermediate decisions, not just CTI recall or label classification.
3. Enables security teams to benchmark models, pinpoint failures, and require human review before deployment.