Google Cloud Expands NVIDIA GB300 Access for Thinking Machines

Google Cloud has signed a new agreement to expand Thinking Machines Lab's access to NVIDIA Blackwell GB300 GPUs via A4X Max on the Google Cloud AI Hypercomputer. Thinking Machines will run GB300 NVL72 instances and early tests show a 2X improvement in training and serving speed versus prior-generation GPUs, enabled by Google Cloud's Jupiter network for rapid weight transfers. The deal pairs high-throughput compute with Google Cloud services like GKE, Spanner, Cluster Director, Cloud Storage, and Anywhere Cache to support continuous training alongside production serving. The partnership highlights hyperscaler efforts to deliver rack-scale, NVIDIA-optimized infrastructure for frontier model research and reinforcement learning workloads.
What happened
Google Cloud signed an agreement to expand Thinking Machines Lab's footprint on the Google Cloud AI Hypercomputer, giving the startup priority access to A4X Max VMs powered by NVIDIA Blackwell GPUs, including GB300 NVL72. Early internal testing with the new A4X Max configuration produced a 2X speedup in training and serving compared with prior-generation GPUs. Google Cloud positions this as one of the first commercial deployments of GB300 NVL72 through its platform.
Technical details
The performance uplift comes from hardware and network co-design. A4X Max packs NVIDIA Blackwell-class accelerators and is integrated with Google Cloud's Jupiter high-bandwidth, low-latency network to enable near-instantaneous weight transfers, which matters most for large reinforcement learning and continuous-training workloads. Thinking Machines pairs this compute with an integrated set of Google Cloud services to support both research and production workloads. Key components include:
- GKE for massive-scale orchestration and cluster management
- Spanner for transactional metadata and cross-region consistency
- Cluster Director for automated remediation and node lifecycle management
- Cloud Storage plus Anywhere Cache and a custom node-level caching layer to enable continuous training while serving (a conceptual sketch of this pattern follows the list)
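The last bullet describes a common pattern rather than a published design: read-through caching in front of object storage so checkpoint reads don't stall training or serving. Below is a minimal Python sketch of that pattern using the google-cloud-storage client; the bucket name, cache directory, and checkpoint path are illustrative assumptions, not details from the announcement.

```python
from pathlib import Path
from google.cloud import storage  # pip install google-cloud-storage

# Hypothetical names for illustration; not from the announcement.
BUCKET = "example-checkpoints"
CACHE_DIR = Path("/mnt/local-ssd/ckpt-cache")

_client = storage.Client()

def fetch_checkpoint(blob_name: str) -> Path:
    """Read-through cache: serve from node-local SSD if present,
    otherwise download once from Cloud Storage and keep a local copy."""
    local = CACHE_DIR / blob_name
    if local.exists():
        return local  # cache hit: no object-storage round trip
    local.parent.mkdir(parents=True, exist_ok=True)
    blob = _client.bucket(BUCKET).blob(blob_name)
    blob.download_to_filename(str(local))  # cache miss: populate the cache
    return local

# Trainers and serving replicas on the same node share the cached copy,
# so each new checkpoint is pulled from the bucket at most once per node.
path = fetch_checkpoint("run-42/step-10000/model.safetensors")
```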
These elements reduce the operational friction of running frontier models at scale, offloading network, orchestration, and storage engineering to the cloud stack while letting researchers focus on model and algorithmic innovations. The deal also leverages Google Cloud's AI Hypercomputer reference architecture that co-optimizes servers, networking, and software stacks for large-model workloads.
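To make concrete why near-instantaneous weight transfer matters for RL-style continuous training, here is a minimal sketch of the generic pattern: a trainer rank broadcasting updated weights to serving ranks with torch.distributed. This is a standard PyTorch idiom shown under assumed setup, not Thinking Machines' actual implementation.

```python
import torch
import torch.distributed as dist

def sync_weights(model: torch.nn.Module, src_rank: int = 0) -> None:
    """Broadcast every parameter tensor from the trainer rank (src_rank)
    to all other ranks in-place. In an RL loop this runs after each policy
    update, so its latency is bounded by model size / network bandwidth."""
    for p in model.parameters():
        dist.broadcast(p.data, src=src_rank)

# Schematic usage: dist.init_process_group("nccl") must have been called
# on every rank first, and with the NCCL backend the parameters must live
# on GPU. After each policy update on the trainer, every serving replica
# blocks in sync_weights() until the new weights have arrived.
```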
Context and significance
This announcement fits a broader industry trend where hyperscalers and NVIDIA jointly deliver rack-scale, GPU-optimized platforms to capture demand from frontier model builders. Access to GB300 NVL72 matters because Blackwell-series GPUs shift the bottleneck profile of large-model training: more on-GPU memory and higher compute bandwidth reduce the need for model parallelism, and the network becomes the gating factor for weight synchronization and RL-style updates. For practitioners, the combination of A4X Max and Jupiter means fewer engineering cycles spent implementing bespoke weight sharding, streaming layers, or complex caching; instead, teams can prioritize experimentation velocity.
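A rough back-of-envelope calculation shows why the network becomes the gating factor once per-GPU memory stops being the constraint. The model size and link speed below are illustrative assumptions, not figures from the announcement:

```python
# Illustrative numbers only; not from the announcement.
params = 100e9            # assume a 100B-parameter model
bytes_per_param = 2       # bf16 weights
weights_gb = params * bytes_per_param / 1e9   # 200 GB of weights

link_gbps = 400           # assume one 400 Gbps NIC per node
link_gb_per_s = link_gbps / 8                 # 50 GB/s

sync_seconds = weights_gb / link_gb_per_s     # ~4 s per full weight push
print(f"{weights_gb:.0f} GB / {link_gb_per_s:.0f} GB/s = {sync_seconds:.1f} s")
# At this scale a full-weight push takes seconds, so any gain in effective
# cross-rack bandwidth translates directly into a higher feasible update
# frequency for RL-style weight synchronization.
```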
This is also a commercial signal. Cloud providers compete on first-mover access to next-generation accelerators, and early customer wins like Thinking Machines validate the utility of integrated cloud stacks for startups building frontier models. That matters for procurement and architecture decisions: startups and research teams weighing on-prem versus cloud will re-evaluate total cost of ownership when the cloud offers turnkey access to next-gen GPUs plus orchestration, storage, and automated operations.
What to watch
Pricing, committed capacity, and availability windows will determine how broadly this matters beyond early customers. Monitor whether Google extends GB300 NVL72 access across regions and whether competitors match with comparable rack-scale offerings and networking. Also watch for follow-on features such as confidential VM support for Blackwell GPUs and expanded bare-metal A5X variants that target even larger Rubin-class deployments.
Scoring Rationale
This is a notable infrastructure development: it gives an advanced model builder priority access to next-generation Blackwell GPUs and an integrated cloud stack, which materially reduces engineering friction for large-model research. It is not a paradigm shift, but it meaningfully affects how practitioners choose cloud vs on-prem for frontier training.