Sunday night in the United States, on a Computex stage in Taipei, Jensen Huang made a claim that, a year ago, would have sounded like a category error. The thing he was describing was not a data center rack or a desktop tower bolted to a wall outlet. It was a laptop. And Nvidia says it can run a 120-billion-parameter language model, with a context window stretching to a million tokens, without ever touching the cloud.
The chip is called the RTX Spark Superchip, known internally for months by its codename N1X. Huang framed it as the start of a new kind of personal computer, one built for AI agents rather than for the mouse and keyboard that have defined the PC for 40 years. The marketing is loud. The hardware underneath it is the part that should make machine learning engineers sit up.
The Specs Are a Workstation Folded Into a Notebook
For practitioners, the headline number is memory. The RTX Spark pairs a custom Arm CPU with a Blackwell-class GPU and gives them a shared pool of 128GB of LPDDR5X unified memory, the same architectural trick that lets Apple Silicon and Nvidia's own DGX Spark desktop hold large models in a single addressable space. Local inference lives and dies by how much memory is available to the model, and 128GB is enough to hold a 120-billion-parameter model in a usable quantized form: at 4 bits per weight, the weights alone take roughly 60GB.
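To see why 128GB is the threshold that matters, the weight arithmetic can be sketched in a few lines. This is a back-of-envelope estimate, not a measurement: it assumes a dense 120-billion-parameter model and decimal gigabytes, and it counts only the weights, ignoring the KV cache, activations, and whatever the operating system claims. The function names are illustrative.

```python
# Rough weight-only memory estimate for a dense model at several precisions.
# Assumptions: decimal gigabytes (1 GB = 1e9 bytes); weights only, so the
# KV cache, activations, and operating system are NOT accounted for.

UNIFIED_MEMORY_GB = 128   # from the spec sheet
PARAMS = 120e9            # the claimed 120-billion-parameter model


def weight_footprint_gb(params: float, bits_per_weight: float) -> float:
    """Return the size of the weights alone, in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


def headroom_gb(params: float, bits_per_weight: float,
                memory_gb: float = UNIFIED_MEMORY_GB) -> float:
    """Memory left after loading the weights (negative means no fit)."""
    return memory_gb - weight_footprint_gb(params, bits_per_weight)


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_footprint_gb(PARAMS, bits)
        left = headroom_gb(PARAMS, bits)
        verdict = "fits" if left > 0 else "does not fit"
        print(f"{bits:>2}-bit: {size:6.1f} GB weights, {left:7.1f} GB left ({verdict})")
```

Under those assumptions, 16-bit weights (240GB) are out of reach, 8-bit weights (120GB) leave almost nothing for a long context, and 4-bit weights (60GB) leave about 68GB for everything else.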
The rest of the sheet is built to feed that memory.
| Component | RTX Spark Superchip |
|---|---|
| CPU | Up to 20 Arm cores, co-designed with MediaTek |
| GPU | Blackwell-class, 6,144 CUDA cores |
| Unified memory | 128GB LPDDR5X |
| Memory bandwidth | Up to 300 GB/s |
| CPU-to-GPU link | NVLink C2C |
| Claimed local model size | Up to 120 billion parameters, context to 1 million tokens |
The CPU and GPU are joined over NVLink C2C, Nvidia's high-speed on-package interconnect, so data does not crawl across a slow bus between the two. Memory bandwidth tops out around 300 GB/s. Those two facts, the wide link and the fast memory, are what make the difference between a laptop that can technically load a large model and one that can actually run it at a tolerable speed.
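Why bandwidth governs "tolerable speed" can be sketched with a common rule of thumb: when generating one token at a time, a model has to read its active weights from memory for every token, so bandwidth divided by bytes read gives a ceiling on tokens per second. The sketch below assumes that memory-bound regime and ignores KV-cache reads and compute limits. The 5-billion active-parameter figure for a mixture-of-experts model is a hypothetical value chosen for illustration, not an Nvidia number.

```python
# Upper bound on single-stream decode speed in the memory-bound regime.
# Assumption: every generated token requires reading all *active* weights
# once; KV-cache traffic and compute limits are ignored, so real throughput
# will be lower than this ceiling.

BANDWIDTH_GB_S = 300  # "up to" figure from the spec sheet


def max_decode_tokens_per_s(bandwidth_gb_s: float,
                            active_params: float,
                            bits_per_weight: float) -> float:
    """Ceiling on tokens/second: bandwidth divided by bytes read per token."""
    bytes_per_token = active_params * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


if __name__ == "__main__":
    # Dense model: all 120B parameters are read for every token.
    dense = max_decode_tokens_per_s(BANDWIDTH_GB_S, 120e9, 4)
    # Hypothetical mixture-of-experts model: only ~5B parameters active.
    sparse = max_decode_tokens_per_s(BANDWIDTH_GB_S, 5e9, 4)
    print(f"dense 120B @ 4-bit:      <= {dense:.1f} tokens/s")
    print(f"MoE, 5B active @ 4-bit:  <= {sparse:.1f} tokens/s")
```

With these inputs the ceiling is about 5 tokens per second for a dense 120-billion-parameter model at 4 bits, and about 120 for the hypothetical mixture-of-experts case, which suggests sparse models are the ones most likely to feel fast at this bandwidth.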
Nvidia called the result "the most efficient platform ever built." That is a vendor's claim, not a benchmark, and the real test will come when independent reviewers get hardware in the fall. But the architecture is not vaporware. The same Grace Blackwell lineage already ships in the data-center chips Nvidia put into production this year; the RTX Spark is that design recast for a thin chassis.
Why On-Device Inference Is the Real Story
Strip away the gaming and creative pitches and the practitioner value is direct: a model that fits in 128GB runs on your machine, on your data, with no API meter ticking and no round trip to a server.
That has three consequences ML engineers will feel immediately. Latency drops, because there is no network hop. Privacy improves, because sensitive data never leaves the device, which matters for anyone working under data-handling rules that make cloud inference a compliance headache. And the marginal cost of a query falls to roughly the electricity it consumes, which is an entirely different economic model from paying per token to a hosted endpoint.
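The cost argument can be made concrete with a small calculation. Every input below is a placeholder, since Nvidia has published neither pricing nor power draw: the wattage, electricity rate, throughput, hosted price, and hardware price are all hypothetical values to be replaced with real ones once they exist.

```python
# Electricity cost of local inference versus a per-token hosted price.
# ALL numeric inputs in the example are hypothetical placeholders.


def local_cost_per_million_tokens(watts: float,
                                  usd_per_kwh: float,
                                  tokens_per_s: float) -> float:
    """Electricity cost (USD) to generate one million tokens locally."""
    hours = 1e6 / tokens_per_s / 3600
    return watts / 1000 * hours * usd_per_kwh


def breakeven_million_tokens(hardware_usd: float,
                             hosted_usd_per_million: float,
                             local_usd_per_million: float) -> float:
    """Millions of tokens after which the hardware has paid for itself."""
    saving = hosted_usd_per_million - local_usd_per_million
    if saving <= 0:
        raise ValueError("hosted price must exceed local cost to break even")
    return hardware_usd / saving


if __name__ == "__main__":
    local = local_cost_per_million_tokens(watts=100, usd_per_kwh=0.15,
                                          tokens_per_s=20)
    volume = breakeven_million_tokens(hardware_usd=4000,
                                      hosted_usd_per_million=2.0,
                                      local_usd_per_million=local)
    print(f"local electricity: ${local:.2f} per million tokens")
    print(f"break-even volume: {volume:,.0f} million tokens")
```

The point of the sketch is the shape of the trade rather than the numbers: per-query cost becomes small, and the upfront hardware price decides how much usage it takes to come out ahead.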
The catch has always been that capable local inference required either a desktop with a power-hungry discrete GPU or a model small enough to disappoint. But the frontier of what runs locally has been moving fast. We have already seen a 9-billion-parameter model on a phone outperform a 120-billion-parameter cloud model on specific tasks, and quantization techniques that squeeze large models onto consumer hardware have improved steadily. The RTX Spark pushes the ceiling up rather than the model size down: it aims to run the big model itself, locally, in a machine you can close and put in a bag.
That is the bet. Whether the open-weight models worth running at 120 billion parameters keep pace is a separate question, though the open-source field in 2026 has given local users more to work with than ever.
The Software Push Is as Aggressive as the Hardware
Nvidia did not stop at silicon. It is trying to remake the Windows experience around agents.
In partnership with Microsoft, the company introduced an OpenShell framework and what it calls "a new set of security primitives," guardrails meant to ensure that local agents and models can only reach the tools and data a user explicitly grants them. The pitch is that an agent should be able to set goals, call tools, evaluate its own output, and keep working on long tasks overnight while you are away, all without quietly gaining the run of your machine. Microsoft is expected to detail more of this agentic Windows vision at its upcoming Build conference.
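Nvidia and Microsoft have not published the interfaces behind these primitives, so the following is only a generic illustration of the idea described: a deny-by-default gate in which an agent can call a tool only after the user has explicitly granted it. The `ToolGate` class and its method names are hypothetical and are not part of OpenShell or any Windows API.

```python
# Generic deny-by-default tool gate for a local agent.
# Hypothetical sketch: NOT the OpenShell API or any Nvidia/Microsoft interface.
from typing import Any, Callable, Dict, Set


class ToolNotGranted(PermissionError):
    """Raised when an agent calls a tool the user has not granted."""


class ToolGate:
    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}
        self._granted: Set[str] = set()

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        """Make a tool known to the gate; registering grants nothing."""
        self._tools[name] = fn

    def grant(self, name: str) -> None:
        """Record the user's explicit permission for one registered tool."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        self._granted.add(name)

    def revoke(self, name: str) -> None:
        """Withdraw permission; a no-op if it was never granted."""
        self._granted.discard(name)

    def call(self, name: str, *args: Any, **kwargs: Any) -> Any:
        """Run a tool on the agent's behalf, but only if it was granted."""
        if name not in self._granted:
            raise ToolNotGranted(name)
        return self._tools[name](*args, **kwargs)


if __name__ == "__main__":
    gate = ToolGate()
    gate.register("read_notes", lambda: "meeting at 10")
    gate.register("delete_files", lambda: "deleted")
    gate.grant("read_notes")          # the user approves only this tool

    print(gate.call("read_notes"))    # allowed
    try:
        gate.call("delete_files")     # never granted, so it is refused
    except ToolNotGranted as exc:
        print(f"blocked: {exc}")
```

The design choice worth noting is that registration and permission are separate steps: a tool being installed on the machine does not imply the agent may use it.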
The creative tools got attention too. Nvidia says it is working with Adobe to rebuild the core of Photoshop into a fully GPU-accelerated application for RTX Spark, with Premiere getting a similar overhaul that exposes Model Context Protocol controls so AI agents can drive the software directly. On the consumer side, Nvidia promises 100 FPS gaming at 1440p, helped by DLSS 4.5.
The hardware will ship broadly. Nvidia expects more than 30 laptops and around 10 compact desktops from Dell, HP, Lenovo, Microsoft, Asus, and MSI, with the first systems arriving in the fall of 2026. The laptops will be thin, with OLED displays and all-day battery life, and Microsoft's own entry is expected under its Surface line.
The Other Side: Reasons for Caution
The skepticism writes itself, and it is worth taking seriously.
Pricing is the first unknown, because Nvidia did not announce it. The closest reference points are not encouraging for budget buyers. The DGX Spark desktop launched near $4,000 and has since crept toward $5,000, and an early N1 laptop board leaked with a $1,400 sticker. LPDDR5X memory and 3nm manufacturing are both expensive right now, which points toward premium prices that put these machines well above a mainstream laptop.
The second caution is history. Windows on Arm has been declared the future before and stumbled on software compatibility each time, as one Microsoft veteran pointedly recalled this week. Nvidia is betting the agentic-AI era finally gives Windows on Arm a reason to win that earlier attempts lacked, but app compatibility and driver maturity are exactly where these platforms have historically bled.
Nvidia is also not alone in chasing on-device AI memory. At the same Computex, Intel detailed its Crescent Island AI GPU with up to 480GB of memory aimed at inference, a reminder that the local-AI hardware race is widening, not settling. And every performance figure so far comes from Nvidia's own slides. The "most efficient platform ever built" line will mean something only after someone outside the company measures it.
The Bottom Line
For years, "run it locally" meant accepting a smaller, weaker model or building a desktop that doubled as a space heater. Nvidia's pitch with the RTX Spark is that the compromise is ending: a 120-billion-parameter model with a million-token context, on battery, in a laptop, no API key required.
If the hardware delivers what the keynote promised, the calculus for a lot of ML work shifts. The default reflex to reach for a hosted endpoint weakens when the same model runs on the machine already in front of you, faster and private and effectively free per query after the upfront cost. That upfront cost, still unannounced, is the catch that could keep this aspirational for everyone but the well-funded.
Huang has been selling the agentic future from a data-center stage for two years. This fall, for the price of a premium laptop, he is offering to put a slice of it on your desk. The models, the prices, and the reviewers will decide whether anyone should take him up on it.
Sources
- Nvidia unveils RTX Spark Superchip for laptops and desktop PCs at Computex 2026 — Tom's Hardware, Jeffrey Kampman (June 1, 2026)
- Nvidia jumps into PCs with new Arm-based chip debuting in laptops from Microsoft, Dell, HP — CNBC (May 31, 2026)
- Nvidia RTX Spark Superchip: Windows PC Chip With Full CUDA Stack Targets Dell, Microsoft This Fall — TechTimes (June 1, 2026)
- Nvidia's Grace Blackwell superchips are officially coming to the PC with RTX Spark notebooks — The Register (June 1, 2026)
- Nvidia N1X officially confirmed to arrive as the RTX Spark — Notebookcheck (June 1, 2026)