Google Unveils AI Hypercomputer Combining TPUs, GPUs, CPUs

Google announced the AI Hypercomputer at Cloud Next 26, a purpose-built data center architecture that merges its new 8th Gen TPU lineup with Axion Cloud CPUs and third-party NVIDIA Rubin GPUs to support the emerging Agentic AI era. The headline silicon is the TPU 8t training chip, which scales to 9,600 chips and 2 petabytes of shared high-bandwidth memory in a single superpod delivering 121 ExaFlops of compute, with native FP4 support and TPUDirect for faster data ingress. The platform promises near-linear scaling with JAX and Pathways, 10x faster storage access, and a system-level focus on utilization. This is a strategic infrastructure play: by blending in-house accelerators, CPUs, and NVIDIA GPUs, Google is offering flexible, high-throughput clusters for large-scale model training and agentic workloads.
What happened
Google announced the AI Hypercomputer at Cloud Next 26, a unified datacenter architecture that pairs its new 8th Gen TPU family with Axion Cloud CPUs and NVIDIA Rubin GPUs to target the Agentic AI era. The flagship training silicon, TPU 8t, is advertised to scale to 9,600 chips and 2 petabytes of shared high-bandwidth memory per superpod, delivering 121 ExaFlops of compute and native FP4 arithmetic.
Technical details
The TPU 8t emphasizes throughput, interchip bandwidth, and memory capacity to reduce frontier-model training timelines from months to weeks. Key features called out include:
- Massive scale: a single superpod of 9,600 chips with 2 petabytes of shared HBM and double the interchip bandwidth versus the prior generation
- Data-path optimization: TPUDirect and 10x faster storage access to reduce IO stalls and improve utilization
- Software scaling: near-linear scaling claims with JAX and Pathways across very large logical clusters
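The announced aggregates imply useful per-chip figures that Google did not state directly. As a back-of-envelope sketch (derived only from the 9,600-chip, 2 PB, 121 ExaFlops numbers above; the actual per-chip HBM and FLOPs breakdown is not public):

```python
# Derive approximate per-chip figures from the announced superpod
# aggregates: 9,600 chips, 2 PB shared HBM, 121 ExaFlops of compute.
CHIPS = 9_600
TOTAL_HBM_BYTES = 2e15    # 2 petabytes
TOTAL_FLOPS = 121e18      # 121 ExaFlops

hbm_per_chip_gb = TOTAL_HBM_BYTES / CHIPS / 1e9
flops_per_chip_pf = TOTAL_FLOPS / CHIPS / 1e15

print(f"~{hbm_per_chip_gb:.0f} GB HBM per chip")    # roughly 208 GB
print(f"~{flops_per_chip_pf:.1f} PFLOPs per chip")  # roughly 12.6 PFLOPs
```

If the marketing numbers hold, each TPU 8t would carry on the order of 208 GB of HBM and ~12.6 PFLOPs, though vendor aggregates often assume low-precision (e.g. FP4) throughput.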
The announcement also introduces a second SKU, TPU 8i, and positions Axion Cloud CPUs for host orchestration and mixed workloads while allowing customers to augment capacity with NVIDIA Rubin GPUs, creating heterogeneous clusters.
Context and significance
This is an infrastructure-first response to agentic, stateful, and large-context models that demand huge memory pools and low-latency interconnects. By combining in-house TPUs with third-party GPUs and custom CPUs, Google is offering both a vertically integrated stack and a hybrid path for customers who need GPU-based tooling. The FP4 support and emphasis on system-level IO are direct optimizations for massive parameter models and retrieval/agent workflows.
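The announcement does not specify which FP4 encoding the TPU 8t uses. As an illustration of why FP4 matters for massive models, here is a minimal quantization sketch assuming the common e2m1 layout (used by the OCP MX spec; whether Google adopts it is an assumption):

```python
# Hedged sketch: round a float to the nearest FP4 (e2m1) representable
# value. The e2m1 format can encode only +/- {0, 0.5, 1, 1.5, 2, 3, 4, 6},
# i.e. 4 bits per weight versus 16 or 32 in higher-precision training.
FP4_E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_fp4(x: float) -> float:
    """Saturating round-to-nearest into the FP4 e2m1 value set."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 6.0)  # saturate at the largest representable magnitude
    return sign * min(FP4_E2M1_VALUES, key=lambda v: abs(v - mag))

print(quantize_fp4(2.4))   # -> 2.0
print(quantize_fp4(-5.1))  # -> -6.0
print(quantize_fp4(10.0))  # -> 6.0 (saturated)
```

The 4x memory reduction versus FP16 is what lets a 2 PB HBM pool hold proportionally larger parameter counts and longer contexts, which is the connection to agentic and retrieval workloads drawn above.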
What to watch
Actual availability, pricing, and filesystem/network topology details will determine enterprise adoption. Watch for real-world scaling results beyond vendor benchmarks, and for how orchestration integrates heterogeneous TPU 8t and Rubin resources into single-job scheduling.
Scoring Rationale
This is a major infrastructure announcement that materially affects how practitioners will approach large-model training and agentic systems. It combines new TPU silicon, CPU orchestration, and NVIDIA GPUs into heterogeneous clusters, which will influence cloud offerings and large-scale training economics.