Inception Labs Launches Mercury 2 Diffusion LLM

Last week Inception Labs launched Mercury 2, a diffusion-based large language model that generates over 1,000 tokens per second and delivers five to ten times lower end-to-end latency than speed-optimized autoregressive models, CEO Stefano Ermon told The New Stack. Mercury 2 is available via an OpenAI-compatible API, with AWS Bedrock integration coming soon, targeting faster, cheaper inference for reasoning workloads.
Scoring Rationale
High novelty and usable release, scored high despite being a single-company claim with limited independent benchmarks.
Practice interview problems based on real data
1,500+ SQL & Python problems across 15 industry datasets — the exact type of data you work with.
Try 250 free problems
