Google Splits TPUv8 Into Training and Inference Chips

Google is splitting its next TPU generation into two purpose-built chips: a training-focused accelerator and a cost-optimized inference accelerator. The training device, codenamed Sunfish and referenced as TPUv8t, will be developed by Broadcom under an extended engagement covering multiple TPU generations. The inference device, codenamed Zebrafish and referenced as TPUv8i, will be designed by MediaTek. The family will integrate with Google's Axion CPU line and lean on advanced packaging such as CoWoS alongside HBM stacks, putting pressure on foundry and packaging capacity. The split reflects a broader industry trend toward specializing compute for training versus inference, and it will reshape supplier wins, CoWoS demand, and data-center component ecosystems.
What happened
Google is splitting the next-generation TPUv8 family into two distinct chips, a high-performance training accelerator and a cost-optimized inference accelerator. The training chip, TPUv8t, carries the codename Sunfish and is being designed by Broadcom. The inference chip, TPUv8i, carries the codename Zebrafish and is being designed by MediaTek. Google will continue tight systems integration with its Axion CPU line based on Neoverse N3 cores. JPMorgan commentary tied to the deal signals multi-generation scope and substantial revenue upside for infrastructure vendors.
Technical details
The split separates design goals and supply-chain flows. TPUv8t (training) prioritizes raw matrix throughput, multi-socket coherency, and high-bandwidth memory capacity. TPUv8i (inference) optimizes area, power, and cost per inference, likely with tighter quantization and latency-focused I/O. Both chips are expected to rely on advanced packaging and HBM stacks, pushing demand for CoWoS-style integration and wafer-level interposers. Key technical points practitioners should note (a minimal quantization sketch follows this list):
- Broadcom will likely own the custom SerDes, PCIe/NVLink-class interconnects, and high-speed fabric for multi-die training nodes.
- MediaTek is positioned to optimize die area, power envelopes, and inference microarchitectures for edge and cloud inference racks.
- Integration with Axion suggests Google will keep CPU-memory and system orchestration tightly coupled to TPU scheduling and telemetry.
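
To make the quantization point concrete: a cost-optimized inference part rewards shipping weights in narrow integer formats rather than the wide floating-point formats used during training. Below is a minimal, hypothetical JAX sketch of that divergence; the function names and the symmetric per-tensor int8 scheme are illustrative assumptions, not details of TPUv8.

```python
import jax
import jax.numpy as jnp

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: one scale for the whole tensor."""
    scale = jnp.max(jnp.abs(w)) / 127.0
    q = jnp.clip(jnp.round(w / scale), -127, 127).astype(jnp.int8)
    return q, scale

@jax.jit
def forward_train(x, w):
    # Training-style path: wide bfloat16 matmul over full HBM-resident weights.
    return jnp.dot(x.astype(jnp.bfloat16), w.astype(jnp.bfloat16))

@jax.jit
def forward_infer(x, q, scale):
    # Inference-style path: int8 weights dequantized on the fly.
    return jnp.dot(x.astype(jnp.bfloat16), q.astype(jnp.bfloat16) * scale)

key = jax.random.PRNGKey(0)
kx, kw = jax.random.split(key)
x = jax.random.normal(kx, (8, 256))
w = jax.random.normal(kw, (256, 128))
q, scale = quantize_int8(w)
print(forward_train(x, w).shape)          # (8, 128)
print(forward_infer(x, q, scale).shape)   # (8, 128)
```

The design point: the training path keeps full bf16 weights resident in HBM, while the inference path stores a roughly 4x-smaller int8 tensor plus one scale, trading a dequantize step for memory footprint and cost.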
Context and significance
This is a strategic move on three fronts. First, it reflects an industry-wide acknowledgment that training and inference have diverged enough to justify specialized silicon rather than a one-size-fits-all accelerator. Second, outsourcing chip design to major contract partners like Broadcom and MediaTek signals Google's pragmatic pivot from fully in-house ASIC design toward a long-term-agreement (LTA), partner-driven model for scale. Third, it intensifies competition for packaging capacity, especially CoWoS and HBM supply, which were already constrained by GPU and ASIC demand. JPMorgan-linked analysis referenced in market commentary projects that TPU-related hardware and networking revenue could become substantial in the back half of this decade, underscoring why foundries and OSATs are recalibrating capacity.
Why it matters for practitioners
If you run cloud infrastructure, ML platforms, or hardware procurement, expect divergent node designs for training and inference. Training clusters will be denser in HBM capacity and interconnect complexity, while inference clusters will prioritize cost efficiency and power. Software teams must plan for different compilation targets, quantization paths, and scheduling policies between TPUv8t and TPUv8i nodes (a minimal routing sketch follows). Hardware-software co-design, telemetry, and runtime selection will become more critical for hitting utilization and cost targets.
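
As a rough sketch of what split-aware scheduling could look like, the snippet below routes jobs to hypothetical training and inference node pools. The pool names, the WorkloadSpec fields, and the latency threshold are all illustrative assumptions, not an actual Google API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WorkloadSpec:
    kind: str                               # "training" or "inference"
    hbm_gb: int                             # peak HBM working set, in GB
    latency_slo_ms: Optional[float] = None  # serving SLO, inference only

def select_pool(w: WorkloadSpec) -> str:
    """Map a workload onto the hypothetical TPUv8t/TPUv8i pool split."""
    if w.kind == "training":
        # Training jobs go to HBM- and interconnect-dense nodes.
        return "tpuv8t-pool"
    if w.latency_slo_ms is not None and w.latency_slo_ms <= 10.0:
        # Tight serving SLOs stay on the latency-optimized inference part.
        return "tpuv8i-pool-lowlat"
    # Batch/offline inference can use the cheapest available capacity.
    return "tpuv8i-pool-batch"

print(select_pool(WorkloadSpec(kind="training", hbm_gb=512)))
print(select_pool(WorkloadSpec(kind="inference", hbm_gb=16, latency_slo_ms=5.0)))
```

In practice the routing signal would come from job metadata and telemetry rather than hand-set fields, but the shape of the decision, capacity-dense pools for training and latency- or cost-tiered pools for inference, follows directly from the chip split described above.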
What to watch
Watch for supply-chain bottlenecks in CoWoS and HBM, the specific interconnect protocols Broadcom implements, and MediaTek's microarchitecture choices for inference. Also monitor contract terms and whether the Broadcom relationship expands into networking components for Google's data centers. These elements will determine deployment cadence and overall TCO for TPUv8 systems.
Scoring Rationale
This rearchitecture and the supplier allocations materially affect data-center hardware design, foundry and packaging demand, and competitive dynamics versus GPU vendors. It signals a significant industry shift but is not a paradigm-breaking research result.