Anthropic Withholds Mythos Model Citing Security Risks

Anthropic is withholding wide release of its new model Mythos, saying the system is unusually capable at both finding and exploiting software vulnerabilities. The company has started a tightly controlled preview, giving access to more than 40 technology firms and security vendors under an initiative called Project Glasswing to surface and patch flaws. Independent and company-conducted tests reportedly found thousands of vulnerabilities across major operating systems and browsers. Regulators and financial institutions have convened emergency discussions. Anthropic frames the decision as a safety-first control to prevent malicious use, but the capability gap it exposes forces a rapid rethink of security practices, disclosure processes, and regulation.
What happened
Anthropic announced a limited preview of Mythos, its next-generation Claude model, and said it will not release the model publicly because it can both identify and exploit software vulnerabilities at scale. The company opened access to more than 40 technology firms, security vendors, and critical-infrastructure providers via a consortium named Project Glasswing. Early testing, including internal and third-party reviews, reportedly surfaced thousands of previously unnoticed vulnerabilities across multiple major operating systems and web browsers.
Technical details
Mythos is described as markedly stronger at long-range, multi-step reasoning and code analysis than prior Claude variants. Logan Graham of Anthropic said, "It's just generally better at pursuing really long-range tasks that are kind of like the tasks that a human security researcher would do throughout the course of an entire day." The key capabilities highlighted are:
- automated static and dynamic analysis across large codebases, accelerating vulnerability discovery
- automated exploit generation given a discovered vulnerability, lowering the skill floor for attackers
- context-aware synthesis that chains reconnaissance, payload construction, and exploitation steps
These capabilities matter because they compress what traditionally required specialist knowledge, time, and manual tooling into fast, prompt-driven workflows. Anthropic is using the preview to run coordinated discovery and remediation exercises with partners including Microsoft, Apple, Amazon Web Services, and CrowdStrike.
Context and significance
This is a watershed moment for both model safety and cybersecurity. Past model rollouts have been staged for safety; Mythos is the first high-profile case where a company publicly argues a model is too dangerous for general release because it materially lowers barriers to cyber offense. The fallout spans three domains:
- offense: automated vulnerability-to-exploit pipelines can scale attacker operations and diversify threat actors
- defense: defenders must adopt model-assisted scanning and patching, and retool continuous integration pipelines
- governance: regulators and financial-sector bodies have already convened emergency meetings to assess systemic risk
The decision also highlights a trade-off between centralized control and transparency. Keeping models closed reduces immediate proliferation risk but concentrates decision-making power in vendors and their chosen partners. Expect tensions between responsible disclosure norms, open research communities, and national-security stakeholders.
What to watch
Project Glasswing outputs and partner reports will be the first empirical measure of Mythos's capabilities and of whether its harms can be mitigated operationally. Watch for regulatory reactions from finance and national-security agencies, industry adoption of model-assisted hardening tools, and any independent replication by other labs. If replication occurs, the security community will need rapid, interoperable standards for model use, exploit disclosure, and access controls.
Scoring Rationale
A major model preview that materially changes the threat landscape is industry-shaking. Anthropic's public withholding, cross-industry preview, and immediate regulator attention make this highly consequential for practitioners and policymakers.



