Anthropic Releases Opus 4.7, Prompting User Backlash
Anthropic has launched Opus 4.7, a new Claude-family model, and a subset of power users report regressions, higher token costs, and breaking API changes. Complaints center on reduced default reasoning depth, hidden or separately charged "thinking" tokens, and a changed tokenizer that inflates usage by roughly 35% for some workloads. Anthropic engineering staff, represented publicly by Boris Cherny, say the behavior is configurable via the model selector and driven by product choices, not compute scarcity. Developers worry about unpredictable costs, broken integrations (such as budget_tokens requests now returning errors), and weakened performance on complex coding and research tasks. The dispute highlights a broader tradeoff: pushing frontier projects like Mythos while maintaining consistent behavior for high-volume, reliability-sensitive users ahead of a potential IPO and continued enterprise adoption.
What happened
Anthropic released Opus 4.7, an update to the Claude family, and a vocal segment of power users reported performance regressions, higher token consumption, and breaking API behavior. Posts on X, Reddit, GitHub, and developer blogs document cases where the model appears to deliver lower-quality outputs on complex engineering tasks, charge more tokens per prompt, and refuse previously used API calls like budget_tokens. The company is facing a perception problem at the same time it is pushing frontier work like Mythos and courting enterprise customers ahead of a potential IPO.
Technical details
The complaints coalesce around a few concrete changes and behaviors that affect engineering workflows. Users and community posts identify three main technical pain points:
- A new tokenizer and token accounting that, according to some users, increase token consumption by roughly 35% for many prompts, directly raising costs for high-volume workloads.
- Hidden or reclassified "thinking" tokens tied to the model's adaptive reasoning, which are not always visible to users and therefore obscure actual billing and latency characteristics.
- API and configuration changes, including reports of budget_tokens requests returning a 400 error and a lower default effort level that reduces the model's default reasoning depth.
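To see why the first point alarms high-volume users, a back-of-the-envelope cost model helps. The sketch below is illustrative only: the request volumes and per-million-token price are assumptions, not Anthropic's actual rates.

```python
# Illustrative sketch: how a ~35% token-accounting change compounds into
# monthly spend. All prices and volumes are hypothetical assumptions.

def monthly_cost(requests_per_day: int, tokens_per_request: int,
                 price_per_million: float, inflation: float = 0.0) -> float:
    """Estimated monthly spend, with an optional token-inflation factor."""
    tokens = requests_per_day * 30 * tokens_per_request * (1 + inflation)
    return tokens / 1_000_000 * price_per_million

baseline = monthly_cost(10_000, 2_000, price_per_million=15.0)
inflated = monthly_cost(10_000, 2_000, price_per_million=15.0, inflation=0.35)
print(f"baseline: ${baseline:,.0f}/mo, after +35% tokens: ${inflated:,.0f}/mo")
# → baseline: $9,000/mo, after +35% tokens: $12,150/mo
```

The point is that a percentage change in token accounting passes straight through to the bill, which is why a 35% inflation reads as a de facto price increase to heavy users.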
Anthropic engineering, led publicly by Boris Cherny, has said the model selector exposes lower-effort and higher-effort modes and that the setting is sticky across sessions. The company denies this was a secret nerf or solely a compute rationing move, but acknowledges adjustments to default behavior in claude.ai and Claude Code.
Context and significance
This episode highlights recurring tensions for model providers: the tradeoff between pushing capabilities and providing stable, predictable service for developers. The move toward more agentic, compute-intensive models like Mythos increases internal pressure to reallocate capacity or tune defaults to control costs. For power users who depend on consistent reasoning depth and predictable token accounting, these product changes feel like regressions and can quickly become trust issues. Anthropic's brand has emphasized transparency and user alignment; the perception of undocumented or disruptive changes opens reputational risk at the exact moment the company needs stability for enterprise contracts and public-market credibility.
What to watch
Short term, look for Anthropic to update changelogs, restore or clarify budget_tokens behavior, and provide clearer tooling to measure "thinking" token usage. Medium term, watch whether Anthropic adjusts pricing, makes high-effort modes the default or easier to opt into, or provides migration tooling for heavy users. If dissatisfaction persists, expect enterprises to push for contractual SLAs or seek alternative models with more predictable billing.
Bottom line
The technical issues are fixable, but the real cost is to developer trust. Teams running production workloads should audit token usage immediately, test Opus 4.7 under realistic load, and plan for fallback strategies until Anthropic clarifies billing and configuration semantics.
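A token-usage audit can start from logged API usage records: compare tokens-per-request before and after the upgrade and alert on a jump. A minimal sketch over hypothetical usage logs (the sample numbers and the 20% alert threshold are assumptions):

```python
# Sketch: flag workloads whose token usage jumped after a model upgrade.
# The log samples and 20% alert threshold below are illustrative assumptions.
from statistics import mean

def usage_delta(before: list[int], after: list[int]) -> float:
    """Relative change in mean tokens-per-request across two samples."""
    return mean(after) / mean(before) - 1.0

before = [1800, 2100, 1950, 2000]   # tokens/request on the old model
after  = [2600, 2750, 2700, 2850]   # tokens/request on the new model

delta = usage_delta(before, after)
if delta > 0.20:   # alert threshold: assumption
    print(f"token usage up {delta:.0%}; review prompts and thinking config")
```

Running the same prompts against both model versions under realistic load, then comparing the logged usage this way, is the quickest path to a fallback decision.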
Scoring Rationale
The story matters to ML practitioners because it directly affects developer cost, reliability, and production workflows for a widely used model family. It is not a frontier research milestone, but the product-level change plus potential reputational and billing impacts make this notable for teams running heavy Claude-based workloads.