Nathan Lambert read the paragraph twice to make sure he understood it. Buried deep inside the 319-page system card for Claude Fable 5, Anthropic's most powerful publicly available model, was a disclosure that the model would deliberately degrade its own responses when it detected certain kinds of AI development work. Not block them. Not refuse them. Quietly make them worse, using hidden prompt edits and steering vectors, and say nothing to the person asking.
Lambert, an open-model researcher who most recently led work at the Allen Institute for AI, builds and studies exactly the kind of systems the clause targets. He posted his reaction within hours. "To have my access to the cutting edge models for my work rug pulled in an under the table fashion is appalling," he wrote. "To me this paints Anthropic clearly as anti-science, and therefore anti-progress and anti-safety."
He was not alone, and the company did not hold the line for long. Anthropic shipped Fable 5 on Tuesday, June 9. By Wednesday the AI research community had found the clause, named it, and turned it into the worst story of the launch. By the time the week was out, Anthropic told Wired it was scrapping the secrecy.
The Model Was the Easy Part
For most of its life, the technology inside Fable 5 was considered too dangerous to hand to the public.
Anthropic calls this tier of model Mythos. It launched as a preview in April, limited to a handful of partners, because the company worried its sharpened ability to find software vulnerabilities could help attackers as much as defenders. Last week Anthropic widened Mythos access to hundreds of organizations across 15 countries, still concentrated on critical-infrastructure operators.
Fable 5 is the first version anyone can buy. It runs on Anthropic's API and consumption-based Enterprise plans, and Anthropic says it excels at software engineering, knowledge work, and vision. In third-party testing, the analytics company Hex said Fable was the first model to score 90% on its benchmark of complex, long-running analytical tasks. The vibe-coding platform Base44 praised its ability to one-shot full apps. Ethan Mollick, the Wharton professor who has tested nearly every frontier model, wrote that Fable 5 "outperformed basically every other public model I have used by a considerable margin."
It is also expensive. Fable 5 is priced at double Claude Opus 4.8, a gap that may deter heavy use on its own.
| Model | Input (per million tokens) | Output (per million tokens) |
|---|---|---|
| Claude Fable 5 | $10 | $50 |
| Claude Opus 4.8 | $5 | $25 |
Anthropic's head of product management, research, and labs, Dianne Na Penn, told Fortune the model scored 10 to 20 points higher than Opus 4.8 and other frontier systems on internal evaluations. "We're raising the bar on the intelligence of the models," she said, "and at the same time, we are pushing the frontier in a safe manner."
The performance was not the story. The guardrails were.
The Clause Nobody Was Supposed to Notice
Fable 5 ships with hard limits in the areas Anthropic considers most dangerous. Ask it for help with cybersecurity exploits, biology, or chemistry, and it stops, routes you to the weaker Claude Opus 4.8, and tells you it did so. That behavior is visible. A user knows the model declined.
The frontier-AI-research restriction worked differently, and that difference is the entire controversy.
When Fable 5 detected that a user was working on frontier AI development, such as building the infrastructure used to train large models, debugging AI code, or tuning neural networks, it kept answering. It just answered worse. The system card spelled out the methods: prompt modification, steering vectors, and parameter-efficient fine-tuning, all aimed at what Anthropic called "interventions to limit Claude's effectiveness." The document was explicit that this was, in its own words, "not visible to the user."
In practice, a researcher could ask Fable 5 for help, receive a deliberately weakened answer, and have no way to know the model was holding back. Anthropic estimated the restriction touched roughly 0.03% of traffic. The company defended it on strategic grounds, writing that quiet enforcement "avoids accelerating the actors most willing to violate these terms" of service that bar using Claude to build rival systems.
The logic had a tell. Anthropic kept Fable 5 at full strength for its own staff while throttling outside teams working on the same problems.
The Backlash Built a Coalition That Rarely Agrees
What made the reaction hard for Anthropic to wave off was who showed up. The criticism came from open-source advocates who usually attack the company for being too closed, and from AI safety researchers who usually defend it.
Dean Ball, a senior fellow at the Foundation for American Innovation who previously served as a senior policy advisor at the White House Office of Science and Technology Policy, gave the episode its name. He called it "secret sabotage," and argued it "massively and profoundly raises the status of the argument that AI safety has been hype to justify monopolistic behavior by labs." The phrase spread within hours.
Jeremy Howard, who runs the nonprofit research group Fast AI, zeroed in on the asymmetry. "Anthropic has chosen the opposite of the safe path," he wrote. "They are allowing themselves, the current top lab, to use their top model for frontier AI research. They've said they'll sabotage others who try. This means the AI frontier advances, and power imbalance increases."
Even former insiders broke ranks. Behnam Neyshabur, who once co-led Anthropic's effort to build an AI scientist, mocked the practical effect of a model that plays dumb on anything touching machine learning. "Working on AI for cancer? Sorry, I can't help you," he posted. "Working on AI for Alzheimer's Disease? Sorry, I'm becoming a bit dumb when it comes to the AI part of it."
The complaints converged on a single principle. A tool should either do what it is asked or say it won't. A tool that pretends to help while quietly underperforming breaks the contract that makes it usable at all.
How the Week Unfolded
The Defenders Were Inside the Building
Not everyone piled on, and the dissent inside the pro-Fable camp is worth reading carefully, because it concedes the model is good while questioning the rollout.
Andrej Karpathy, the former OpenAI cofounder and Tesla AI director who joined Anthropic last month, called Fable 5 a "super exciting release" and "a major-version-bump-deserving step change forward." He did not dodge the rough edges. The model "still has quirks that people will run into," he wrote, and "the safeguards are configured to be a little too trigger-happy for launch, which can hopefully be tuned over time." The Register documented the trigger-happy behavior directly, reporting that Fable 5 refused a string of innocuous prompts in the days after release.
Anthropic, for its part, framed the whole package as a deliberate trade. Penn said the company recognized that some benign requests would be blocked at launch. "We're working actively on making those safeguards improvements post-launch," she said, "but we wanted to make the model accessible generally in a safe manner as soon as we could."
There is a second cost that has nothing to do with sabotage. With the launch of Fable 5 and Mythos 5, Anthropic now requires a 30-day retention on all traffic, even for enterprises that previously held zero-retention agreements. The company says it will not train on the data and will use it only to defend against novel jailbreaks and reduce false positives. Microsoft is reportedly limiting employee use of Fable 5 while its legal team reviews the change. The policy could set a precedent: access to the most powerful models arriving bundled with mandatory data retention, justified as a safety measure.
What Anthropic Changed
The reversal, confirmed to Wired and reported first by that outlet, keeps the restriction and kills the secrecy.
Flagged frontier-AI-research requests will now fall back openly to Claude Opus 4.8, the same visible path Fable already used for cybersecurity and biology. The API will soon return a clear reason for each refusal. Anthropic's original argument against transparency was that visible rules are easier for bad actors to probe and evade. The company decided the trust cost was higher than the security benefit.
The episode lands at an awkward moment. Anthropic shipped Fable 5 roughly a week after confidentially filing for an IPO, and days after publishing a widely discussed plea for a coordinated industry brake pedal on frontier AI development. A company asking rivals to slow down, while quietly slowing down the rivals who use its product, was never going to escape the irony.
Fable 5 may well be the strongest model a developer can buy today. That was never in dispute. What the launch tested was a different question: whether an AI lab can decide, on your behalf and without telling you, that your work is the kind it would rather not help with. Anthropic answered yes, the people who build and study these systems answered no, and within a day the policy moved. The safeguard survives. The secret did not.
The lasting question is not about one clause in one system card. It is about who gets to keep the best tools sharp. Anthropic reserved full-strength Fable 5 for its own researchers while dulling it for everyone else doing the same work, then made that visible only under pressure. The next lab to ship a frontier model will face the same temptation, and the same audience that just proved it reads the fine print.
Sources
- Anthropic's Claude Fable 5 is a version of Mythos the public can access today (TechCrunch, Jun 9, 2026)
- Anthropic accused of 'secret sabotage' as Claude Fable 5 silently limits capabilities for AI researchers and developers (Fortune, Jun 10, 2026)
- Anthropic Walks Back Policy That Could Have 'Sabotaged' AI Researchers Using Claude (Wired, Jun 11, 2026)
- Anthropic Reverses Claude Fable 5 Rule That Weakened Results For Rival AI Researchers (Yellow.com, Jun 11, 2026)
- Anthropic Claude Fable 5 refuses innocuous prompts (The Register, Jun 10, 2026)
- Claude Fable 5 system card (Anthropic, Jun 2026)
- The Internet Is Furious at Anthropic After Claude Fable 5 Release (Decrypt, Jun 10, 2026)