Anthropic's Mythos Preview tops SWE-bench benchmarks

According to Anthropic, Mythos Preview achieves 93.9% on SWE-bench Verified, compared with 80.8% for Opus 4.6. On SWE-bench Pro, Mythos Preview scores 77.8% versus 53.4% for Opus 4.6. VentureBeat's Michael Nuñez reports these results, which amount to a gap of 13.1 percentage points on SWE-bench Verified and 24.4 percentage points on SWE-bench Pro.
Scoring Rationale
The large performance differences on a software-engineering benchmark make this relevant to practitioners, but the headline figures come without methodological context or fuller evaluation details.
Sources
- Techmeme: Anthropic says Mythos Preview achieves 93.9% on SWE-bench Verified, compared with 80.8% for Opus 4.6, and 77.8% on SWE-bench Pro, versus 53.4% for Opus 4.6 (Michael Nuñez/VentureBeat), techmeme.com


