Alibaba unveils Zhenwu M890 chip and Qwen3.7-Max LLM

According to CNBC, Alibaba unveiled a more powerful AI processor, the Zhenwu M890, and a next-generation large language model, Qwen3.7-Max. CNBC reports that the Zhenwu M890 delivers three times the performance of the current Zhenwu 810E and offers 144 GB of GPU memory and 800 GB per second of interchip bandwidth. Alibaba has delivered 560,000 Zhenwu units to more than 400 customers across 20 industries, CNBC reports, and the company said the Qwen3.7-Max model will be released soon. Industry context: Companies building domestic AI stacks in regions facing import limits often accelerate combined chip-and-model launches to secure local compute and deployment paths.
What happened
CNBC reports that Alibaba revealed a new AI processor, the Zhenwu M890, and previewed a next-generation large language model, Qwen3.7-Max. According to the report, the Zhenwu M890 delivers three times the performance of the current Zhenwu 810E and provides 144 GB of GPU memory and 800 GB per second of interchip bandwidth. Alibaba has already delivered 560,000 Zhenwu units to more than 400 customers spanning 20 industries, and the company said the Qwen3.7-Max model will be released soon. CNBC also frames the announcement against limited access to some foreign advanced chips in China.
Technical details
Editorial analysis - technical context: The headline specs (144 GB of GPU memory and 800 GB/s of interchip bandwidth) indicate a focus on memory-intensive workloads such as multimodal models and long-context LLMs. Practitioners will recognise that higher interchip bandwidth matters for model parallelism across multiple chips and for scaling inference with large context windows. Vendor claims of multiple-fold performance gains typically depend on the workload mix and on software stack optimisations, so gains measured in benchmarks may not carry over uniformly to real-world LLM training and inference workloads.
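To make the memory point concrete, the following is a rough back-of-envelope sketch in Python, not a statement about Qwen3.7-Max or about how the M890 is actually used. Only the 144 GB and 800 GB/s figures come from the CNBC report. The model dimensions (70 billion parameters, 80 layers, 8 key-value heads, head dimension 128, a 131,072-token context, 16-bit values) are hypothetical, and GB is assumed to mean 10^9 bytes.

```python
import math

GB = 10**9  # assumption: decimal gigabytes

# Figures reported by CNBC for the Zhenwu M890.
CHIP_MEMORY = 144 * GB        # bytes of memory per chip
INTERCHIP_BW = 800 * GB       # bytes per second between chips


def weight_bytes(n_params, bytes_per_param):
    """Memory needed to hold the model weights."""
    return n_params * bytes_per_param


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len,
                   bytes_per_value, batch=1):
    """Memory for the attention key-value cache.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return (2 * n_layers * n_kv_heads * head_dim
            * context_len * bytes_per_value * batch)


def transfer_seconds(n_bytes, bandwidth_bytes_per_s):
    """Idealised transfer time, ignoring latency and protocol overhead."""
    return n_bytes / bandwidth_bytes_per_s


# Hypothetical model: none of these dimensions come from the article.
weights = weight_bytes(70 * 10**9, 2)            # 16-bit weights
kv_cache = kv_cache_bytes(80, 8, 128, 131072, 2)  # one long-context sequence
total = weights + kv_cache

# Minimum chip count by memory alone (no activations or runtime overhead).
chips_needed = math.ceil(total / CHIP_MEMORY)

print(f"weights:  {weights / GB:.1f} GB")
print(f"kv cache: {kv_cache / GB:.1f} GB")
print(f"total:    {total / GB:.1f} GB -> at least {chips_needed} chip(s)")
print(f"moving 1 GB between chips: {transfer_seconds(GB, INTERCHIP_BW) * 1e3:.2f} ms")
```

Under these assumed dimensions, the weights alone nearly fill one chip, and a single long-context sequence pushes the total past 144 GB. This is the kind of case where the workload must be split across chips and interchip bandwidth starts to matter.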
Context and significance
Industry context: Regional supply constraints for advanced GPUs have driven Chinese cloud and hyperscaler players to invest in proprietary silicon and in-house models. The combined announcement of a higher-performance chip and a new model mirrors a pattern where providers control both hardware and model stacks to optimise throughput, latency, and deployment cost. For practitioners, increased local silicon capacity can change options for on-prem and cloud deployments in the region and affect availability of high-throughput inference infrastructure.
What to watch
Key items to track include independent benchmarks that verify the performance claims; published power, price, and thermal figures for the Zhenwu M890; technical details and benchmark results for Qwen3.7-Max (model size, training data summary, context window, and latency and throughput figures); and how existing customers use delivered Zhenwu units in production workloads. Observers will also watch whether Alibaba or third parties publish reproducible benchmarks comparing the M890 against widely used international GPUs and accelerators.
Scoring Rationale
The combined chip and LLM announcement is notable for practitioners because the reported performance jump and existing unit shipments indicate growing local compute capacity. The story is regionally important and impacts infrastructure choices, but public independent benchmarks and detailed model specs are not yet available, limiting immediate technical impact.

