NVIDIA Shows GB300 Outperforms GB200 NVL72

LMSYS recently benchmarked NVIDIA's GB300 NVL72 racks against GB200 NVL72 on long-context, latency-sensitive LLM inference, reporting an average performance lead of 1.4–1.5x and a peak throughput of 226.2 tokens per second (TPS) per GPU. The tests showed 1.53x higher peak throughput, 1.87x higher TPS per user with multi-token prediction, and a 1.58x latency improvement, using prefill-decode (PD) disaggregation and dynamic chunking. Total cost of ownership (TCO) figures were not discussed.
Scoring Rationale
Strong generational performance gains and practical optimizations drive the score, which is limited by single-source benchmarking and the absence of TCO analysis.

