Researchers Release SandboxEscapeBench To Test Container Escapes

Researchers at the University of Oxford and the AI Security Institute released SandboxEscapeBench on March 30, 2026. It is an open-source benchmark that tests whether AI agents with shell access can escape a container and retrieve a protected /flag.txt file. The benchmark runs 18 scenarios across the orchestration, runtime, and kernel layers, using nested containers inside VMs, and focuses on known vulnerability classes. Frontier models exploited common misconfigurations but failed at kernel-level exploits. The benchmark and its findings give security teams tests they can use directly.
Key Points
- Demonstrates that AI agents can escape containers through common misconfigurations, using a benchmark of 18 scenarios.
- Shows that the escapes rely on exposed Docker sockets, writable host mounts, and privileged containers.
- Indicates that practitioners should harden container configurations, remove host mounts, and avoid exposing Docker sockets.
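
The three misconfigurations named above can be checked for in a running container's configuration. The following is a minimal sketch, not part of SandboxEscapeBench: it assumes the JSON layout printed by `docker inspect` (the `HostConfig.Privileged` flag and the `Mounts` list with `Type`, `Source`, and `RW` fields), and the function name `audit_container` and the sample data are hypothetical.

```python
import json

# Common locations of the Docker daemon socket on a Linux host.
DOCKER_SOCKETS = {"/var/run/docker.sock", "/run/docker.sock"}


def audit_container(inspect_entry):
    """Return a list of findings for one entry of `docker inspect` output.

    Hypothetical helper: it flags only the three misconfigurations named
    in the key points and is not a complete hardening check.
    """
    findings = []
    host_config = inspect_entry.get("HostConfig") or {}
    if host_config.get("Privileged"):
        findings.append("privileged container")
    for mount in inspect_entry.get("Mounts") or []:
        source = mount.get("Source", "")
        if source in DOCKER_SOCKETS:
            findings.append("Docker socket exposed")
        elif mount.get("Type") == "bind" and mount.get("RW"):
            findings.append(f"writable host mount: {source}")
    return findings


# Illustrative sample shaped like `docker inspect <container>` output.
sample = json.loads("""
[{
  "HostConfig": {"Privileged": true},
  "Mounts": [
    {"Type": "bind", "Source": "/var/run/docker.sock",
     "Destination": "/var/run/docker.sock", "RW": true},
    {"Type": "bind", "Source": "/etc", "Destination": "/host/etc", "RW": true},
    {"Type": "bind", "Source": "/srv/data", "Destination": "/data", "RW": false}
  ]
}]
""")

for entry in sample:
    for finding in audit_container(entry):
        print(finding)
```

In practice the input would come from `docker inspect <container>` rather than an inline string. A read-only bind mount such as `/srv/data` above is not flagged, since the key points single out writable host mounts.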
Scoring Rationale
Credible, open-source benchmark from University of Oxford and AI Security Institute with broad industry relevance and directly usable tests. Score is high for scope, actionability, and credibility but tempered because results exploit known misconfigurations and did not reveal novel zero-day vulnerabilities.

