Apple reportedly adopts Nvidia chips for Gemini-powered Siri

According to 9to5Mac and MacRumors coverage of a paywalled report from The Information, Apple's overhauled, Gemini-powered Siri will route some cloud queries to Google Cloud and run them on Nvidia Blackwell B200 GPUs. Apple will reportedly enable Nvidia's hardware-based confidential compute feature, which encrypts data while it is processed on the chips. The Information says the arrangement diverges from Apple's usual strategy of controlling all the critical ingredients of its products, and notes it is unclear how Apple's previously launched Private Cloud Compute system fits the upcoming Siri launch; Apple reportedly tested a modified Gemini on Private Cloud Compute but found it ran too slowly. Nvidia describes confidential compute as preserving "the confidentiality and integrity of AI models deployed on Rubin, Blackwell, and Hopper GPUs." The revamped Siri is expected to launch in September, after WWDC 2026 opens June 8.
What happened
The Information reports, as relayed by 9to5Mac and MacRumors, that Apple will route some queries for an overhauled version of Siri to Google Cloud and run them on Nvidia Blackwell B200 GPUs, using a licensed deployment of Google's Gemini model. Apple has reportedly approved Nvidia's confidential compute feature to encrypt data while it is processed on those GPUs. The Information says the choice diverges from Apple's usual strategy of "attempting to control all the critical ingredients to its products," and notes it is unclear how Apple's existing Private Cloud Compute server system will fit into the upcoming Siri launch. Per the report, Apple tried running a modified Gemini on Private Cloud Compute but found it ran too slowly. The revamped Siri is expected to launch in September; WWDC 2026 opens June 8.
Technical context
Nvidia positions the Blackwell B200 as a flagship data-center GPU for large-scale training and inference, and the successor to Hopper, with gains in inference throughput, memory bandwidth, and multi-GPU scaling. Confidential compute is a hardware-based capability that isolates and encrypts data during on-chip processing; Nvidia says it "preserves the confidentiality and integrity of AI models deployed on Rubin, Blackwell, and Hopper GPUs," allowing "sensitive AI workloads to run securely at scale with near-native performance, even in shared or cloud environments."
Editorial analysis - industry pattern
Organizations deploying large foundation models routinely split work between on-device execution and cloud-hosted inference to trade off latency, model capacity, and privacy. Using cloud GPUs with confidential compute is an increasingly common way to access very large models while keeping cryptographic protections over data in use. As a general pattern, renting a cloud provider's GPU fleet accelerates access to cutting-edge hardware but adds operational dependencies on both the cloud and GPU vendor ecosystems.
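The on-device versus cloud split described above can be pictured as a simple routing policy. The following is a minimal illustrative sketch, not a description of how Apple or Google route Siri queries; the `Query` fields, the `route` function, and the 512-token threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Query:
    # All fields are hypothetical signals a router might have available.
    est_tokens: int                # rough size of the request
    needs_large_model: bool        # task exceeds on-device model capacity
    contains_sensitive_data: bool  # request carries personal context


def route(query: Query,
          on_device_token_limit: int = 512,
          cloud_has_confidential_compute: bool = True) -> str:
    """Return "on_device" or "cloud" for a query (illustrative policy only)."""
    # Small, simple requests stay local: lowest latency, no data leaves the device.
    if not query.needs_large_model and query.est_tokens <= on_device_token_limit:
        return "on_device"
    # Sensitive requests go to the cloud only if data in use is protected there;
    # otherwise fall back to the smaller local model.
    if query.contains_sensitive_data and not cloud_has_confidential_compute:
        return "on_device"
    # Everything else uses the larger cloud-hosted model.
    return "cloud"
```

Under these assumptions, a short request that the local model can handle never leaves the device, while a request that needs the larger model is sent to the cloud only when the confidential compute protection is available for sensitive data.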
What to watch
- the latency and cost profile of queries routed to cloud-hosted Gemini inference on Blackwell hardware
- attestation or audit details showing how confidential compute is implemented and verified
- how workloads are divided between on-device Siri components and cloud Gemini inference
- whether Apple confirms any of this at WWDC, and how Private Cloud Compute is repositioned
Scoring Rationale
A major consumer-device vendor reportedly renting cloud Nvidia Blackwell GPUs and adopting confidential compute for Gemini-powered Siri is a notable strategy and deployment signal for practitioners weighing on-device versus cloud inference and privacy tradeoffs. The story has been relayed by multiple outlets, but it traces to a single original report and remains unconfirmed by Apple ahead of WWDC, which keeps it notable rather than industry-shaking.

