MI9 introduces runtime governance for agentic AI

The MI9 paper by Charles L. Wang, Trisha Singhal, Ameya Kelkar and Jason Tuo (authors listed on arXiv and OpenReview) presents MI9, an integrated runtime governance framework for agentic AI systems. Per the paper (arXiv / OpenReview), MI9 combines six core mechanisms (an agency-risk index, semantic telemetry capture, continuous/dynamic authorization, Finite-State-Machine (FSM) conformance engines, goal-conditioned drift detection, and graduated containment) to provide real-time oversight of autonomous agents. Reporting in Risk.net frames MI9 as a practical telemetry and control architecture for financial institutions, describing it as a "real-time telemetry system for banks to control and authorise for agentic AI actions" (Risk.net). The paper includes scenario-driven analysis demonstrating how MI9 addresses emergent runtime behaviors that pre-deployment governance cannot fully anticipate (arXiv / ChatPaper).
What happened
The MI9 paper, authored by Charles L. Wang, Trisha Singhal, Ameya Kelkar and Jason Tuo and available on arXiv and OpenReview, introduces MI9, an integrated runtime governance framework designed for agentic AI systems (arXiv; OpenReview; ChatPaper). Risk.net reports MI9 as a "real-time telemetry system for banks to control and authorise for agentic AI actions," citing the framework's applicability to financial-production environments (Risk.net). The paper presents scenario-driven evaluations showing how MI9 instruments and constrains agents that perform multi-step planning and execute actions across toolchains (arXiv; ChatPaper).
Technical details
Per the MI9 paper (arXiv / OpenReview), the framework comprises six core, interoperable components: an agency-risk index for quantifying agentic risk; agent-semantic telemetry to capture high-level cognitive events and intent; continuous and dynamic authorization to gate privileged actions; FSM-based conformance engines to enforce policy-modeled state transitions; goal-conditioned drift detection to spot shifts from intended objectives; and graduated containment mechanisms that scale interventions from soft constraints to full isolation. The paper describes these elements as operating in real time across heterogeneous agent architectures, and includes implementation sketches and threat scenarios used to validate coverage (arXiv; OpenReview).
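To make two of these components concrete, the following is a minimal sketch of how an FSM conformance check might feed a graduated containment response. It is not taken from the MI9 paper: the state names, the event names, the `ALLOWED` transition table, the `Containment` levels and the rule that each violation escalates one level are all hypothetical, chosen only for illustration.

```python
from enum import IntEnum


class Containment(IntEnum):
    """Hypothetical escalation levels, from no action to full isolation."""
    NONE = 0
    WARN = 1
    RESTRICT = 2
    ISOLATE = 3


# Hypothetical policy model: permitted transitions keyed by (state, event).
ALLOWED = {
    ("idle", "plan"): "planning",
    ("planning", "call_tool"): "acting",
    ("acting", "report"): "idle",
}


class ConformanceMonitor:
    """Tracks agent state and escalates containment on off-policy events."""

    def __init__(self, start="idle"):
        self.state = start
        self.violations = 0

    def observe(self, event):
        next_state = ALLOWED.get((self.state, event))
        if next_state is not None:
            # Conforming transition: advance the FSM, no intervention.
            self.state = next_state
            return Containment.NONE
        # Non-conforming event: stay in place and escalate one level
        # per violation, capped at full isolation (an assumed rule).
        self.violations += 1
        return Containment(min(self.violations, Containment.ISOLATE))


monitor = ConformanceMonitor()
for event in ["plan", "report", "report", "report"]:
    print(event, monitor.observe(event).name)
```

In this sketch a conforming event advances the state machine silently, while repeated off-policy events move the response from a warning toward isolation. A real conformance engine would presumably weigh severity and context rather than simply counting violations.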
Editorial analysis: technical context
Industry-pattern observations: Agentic systems differ from conventional stateless models because they retain persistent goals, execute multi-step tool chains, and can modify internal memory or call external services during runtime. In comparable governance proposals, practitioners increasingly combine semantic telemetry with policy-enforced runtime gates; MI9 assembles these into a single architecture and formalizes a risk index and FSM conformance, which helps operationalize interventions at varying severity levels. This packaging reduces the integration burden for teams that must instrument complex agent stacks across orchestration layers, tool sandboxes, and audit logs.
Context and significance
Reporting and preprints frame MI9 as addressing a critical gap: pre-deployment-only controls do not capture emergent behaviors that appear only when agents interact with real systems and data (Risk.net; arXiv). For regulated sectors such as banking, Risk.net highlights MI9's telemetry and authorization primitives as particularly relevant to compliance and auditability requirements. More broadly, MI9 joins a small but growing set of proposals that shift governance into runtime, rather than relying solely on static testing, access controls, or model certification before deployment.
What to watch
Editorial analysis: Observers should track three indicators as MI9 or similar frameworks are evaluated in practice. First, integration effort: how easily MI9 primitives attach to existing orchestration and tool-invocation layers. Second, false-positive rates and operational cost: how often drift detection or dynamic authorization interrupts legitimate agent workflows. Third, standards and interoperability: whether the agency-risk index or FSM conformance models gain traction as exchangeable artifacts between vendors, auditors, and regulators. Publications cited here (arXiv; OpenReview; Risk.net) do not document production deployments; future work or vendor implementations will be required to validate runtime performance and operational trade-offs.
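The second indicator can be expressed as a simple measurement. The sketch below assumes a hypothetical review log in which each agent action is labelled with whether a governance control interrupted it and whether a reviewer later judged it legitimate; the function name, the record layout and the sample values are illustrative only and are not drawn from the cited sources.

```python
def false_positive_rate(events):
    """Share of legitimate actions that were interrupted.

    `events` is an iterable of (interrupted, legitimate) boolean pairs,
    a hypothetical log format assumed for this sketch.
    """
    # Keep only the actions a reviewer judged legitimate.
    legit = [interrupted for interrupted, legitimate in events if legitimate]
    if not legit:
        return 0.0  # no legitimate actions observed, so nothing to measure
    return sum(legit) / len(legit)


# Made-up sample: three legitimate actions, one of which was interrupted,
# plus one correctly interrupted illegitimate action.
sample = [(True, True), (False, True), (False, True), (True, False)]
print(f"{false_positive_rate(sample):.2f}")
```

Tracked over time and alongside review cost per interruption, a figure like this would give a rough view of how much friction runtime controls add to legitimate workflows.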
Bottom line
MI9 formalizes a runtime stack for governing agentic AI with six concrete mechanisms and scenario-based validation (arXiv; OpenReview; ChatPaper). Risk.net frames the work as immediately relevant to banks seeking real-time telemetry and authorization for agent actions, but the framework's real-world impact will depend on integration, measurement of operational costs, and uptake of shared risk and conformance artifacts (Risk.net; arXiv).
Scoring Rationale
MI9 addresses a material operational gap for deploying agentic systems in production, especially in regulated sectors; it is a notable contribution but remains a framework rather than a widely adopted standard, so its practical impact is promising but not yet transformative.

