At Computex on May 31, Jensen Huang unveiled the RTX Spark Superchip: 20 Arm cores, a Blackwell GPU, and 128GB of unified memory in a Windows laptop that Nvidia says can run a 120-billion-parameter model with a million-token context entirely on-device. More than 30 laptops and about 10 desktops are planned, with the first arriving this fall.

Sunday night in the United States, on a Computex stage in Taipei, Jensen Huang made a claim that, a year ago, would have sounded like a category error. The thing he was describing was not a data center rack or a desktop tower bolted to a wall outlet. It was a laptop. And Nvidia says it can run a 120-billion-parameter language model, with a context window stretching to a million tokens, without ever touching the cloud.

The chip is called the RTX Spark Superchip, known internally for months by its codename N1X. Huang framed it as the start of a new kind of personal computer, one built for AI agents rather than for the mouse and keyboard that have defined the PC for 40 years. The marketing is loud. The hardware underneath it is the part that should make machine learning engineers sit up.

The Specs Are a Workstation Folded Into a Notebook

For practitioners, the headline number is memory. The RTX Spark pairs a custom Arm CPU with a Blackwell-class GPU and gives them a shared pool of 128GB of LPDDR5X unified memory, the same architectural trick that lets Apple Silicon and Nvidia's own DGX Spark desktop hold large models in a single addressable space. Local inference lives and dies by how much memory you can put a model into, and 128GB is enough to hold a 120-billion-parameter model in a usable quantized form: at 4 bits per weight, the weights alone come to roughly 60GB, which leaves room for the context cache and the operating system.
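The arithmetic behind that claim is easy to check. The sketch below is a back-of-the-envelope estimate, not a measurement: the 128GB and 120-billion-parameter figures come from Nvidia's announcement, while the precisions listed and the `RESERVE_GB` headroom for the operating system and context cache are assumptions chosen for illustration.

```python
UNIFIED_MEMORY_GB = 128   # from Nvidia's spec sheet
N_PARAMS = 120e9          # Nvidia's claimed maximum local model size
RESERVE_GB = 24           # ASSUMPTION: headroom for OS, runtime, and context cache


def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits(n_params: float, bits_per_weight: float,
         total_gb: float = UNIFIED_MEMORY_GB,
         reserve_gb: float = RESERVE_GB) -> bool:
    """True if the weights fit in memory after setting aside the reserve."""
    return weight_footprint_gb(n_params, bits_per_weight) <= total_gb - reserve_gb


# Compare common precisions; real quantization formats add some overhead
# for scales and metadata, which this estimate ignores.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    size = weight_footprint_gb(N_PARAMS, bits)
    verdict = "fits" if fits(N_PARAMS, bits) else "does not fit"
    print(f"{label:>6}: {size:6.1f} GB of weights -> {verdict}")
```

Under these assumptions only the 4-bit case leaves meaningful headroom. An 8-bit copy of the same model would consume nearly the whole pool before any context is loaded.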

The rest of the sheet is built to feed that memory.

Component	RTX Spark Superchip
CPU	Up to 20 Arm cores, co-designed with MediaTek
GPU	Blackwell-class, 6,144 CUDA cores
Unified memory	128GB LPDDR5X
Memory bandwidth	Up to 300 GB/s
CPU-to-GPU link	NVLink C2C
Claimed local model size	Up to 120 billion parameters, context to 1 million tokens

The CPU and GPU are joined over NVLink C2C, Nvidia's high-speed on-package interconnect, so data does not crawl across a slow bus between the two. Memory bandwidth tops out around 300 GB/s. Both matter because generating each token means streaming the model's active weights out of memory, so bandwidth, more than raw compute, usually sets the ceiling on generation speed. The wide link and the fast memory are what separate a laptop that can technically load a large model from one that can actually run it at a tolerable speed.
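A rough ceiling on generation speed follows from the bandwidth figure alone. The sketch below uses the common rule of thumb that single-stream decoding is memory-bound, so tokens per second cannot exceed bandwidth divided by the bytes of weights read per token. The 300 GB/s figure is Nvidia's. The 4-bit precision and the hypothetical mixture-of-experts model with 5 billion active parameters are assumptions for illustration, and the estimate ignores context-cache reads and compute time, so real throughput would be lower.

```python
BANDWIDTH_GB_S = 300.0   # Nvidia's stated peak memory bandwidth


def decode_ceiling_tok_s(bandwidth_gb_s: float,
                         active_params: float,
                         bits_per_weight: float) -> float:
    """Upper bound on tokens/s when every token must read all active weights once."""
    gb_read_per_token = active_params * bits_per_weight / 8 / 1e9
    return bandwidth_gb_s / gb_read_per_token


# Dense model: all 120 billion parameters are read for every token.
dense = decode_ceiling_tok_s(BANDWIDTH_GB_S, active_params=120e9, bits_per_weight=4)

# HYPOTHETICAL mixture-of-experts model: 120B total, but only 5B active per token.
sparse = decode_ceiling_tok_s(BANDWIDTH_GB_S, active_params=5e9, bits_per_weight=4)

print(f"dense 120B, 4-bit:      at most {dense:.0f} tokens/s")
print(f"MoE, 5B active, 4-bit:  at most {sparse:.0f} tokens/s")
```

The gap between the two cases suggests why sparse models are the natural fit for this class of hardware: a dense 120-billion-parameter model at 4 bits tops out near 5 tokens per second on paper, while a sparse one that touches a fraction of its weights per token has far more headroom.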

Nvidia called the result "the most efficient platform ever built." That is a vendor's claim, not a benchmark, and the real test will come when independent reviewers get hardware in the fall. But the architecture is not vaporware. The same Grace Blackwell lineage already ships in the data-center chips Nvidia put into production this year; the RTX Spark is that design recast for a thin chassis.

Why On-Device Inference Is the Real Story

Strip away the gaming and creative pitches and the practitioner value is direct: a model that fits in 128GB runs on your machine, on your data, with no API meter ticking and no round trip to a server.

That has three consequences ML engineers will feel immediately. Latency drops, because there is no network hop. Privacy improves, because sensitive data never leaves the device, which matters for anyone working under data-handling rules that make cloud inference a compliance headache. And the marginal cost of a query falls to roughly the electricity it consumes, which is a different economics entirely from paying per token to a hosted endpoint.
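The cost argument can be made concrete with a small break-even sketch. Every number passed in below is a placeholder assumption, not a figure from Nvidia or any provider: power draw, local throughput, electricity price, hosted-API price, and hardware price all need to be replaced with real values once they exist.

```python
import math


def electricity_cost_per_mtok(watts: float, tok_per_s: float,
                              usd_per_kwh: float) -> float:
    """Electricity cost, in USD, to generate one million tokens locally."""
    hours = 1e6 / tok_per_s / 3600          # wall-clock hours for 1M tokens
    return (watts / 1000) * hours * usd_per_kwh


def break_even_mtok(hardware_usd: float, api_usd_per_mtok: float,
                    local_usd_per_mtok: float) -> float:
    """Millions of tokens after which the hardware has paid for itself."""
    saving = api_usd_per_mtok - local_usd_per_mtok
    if saving <= 0:
        return math.inf                     # local never catches up
    return hardware_usd / saving


# PLACEHOLDER inputs, chosen only to show the shape of the calculation.
local = electricity_cost_per_mtok(watts=100, tok_per_s=20, usd_per_kwh=0.15)
volume = break_even_mtok(hardware_usd=4000, api_usd_per_mtok=2.00,
                         local_usd_per_mtok=local)

print(f"local electricity: ${local:.2f} per million tokens")
print(f"break-even volume: {volume:,.0f} million tokens")
```

The structure matters more than the placeholder values. The per-token cost of local inference is small but not zero, and the upfront hardware price is recovered only at high, sustained volume.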

The catch has always been that capable local inference required either a desktop with a power-hungry discrete GPU or a model small enough to disappoint. The frontier of what runs locally has been moving fast. We have already seen a 9-billion-parameter model on a phone outperform a 120-billion-parameter cloud model on specific tasks, and quantization techniques that squeeze large models onto consumer hardware have improved steadily. The RTX Spark pushes the ceiling up rather than the model size down: it aims to run the big model itself, locally, in a machine you can close and put in a bag.

That is the bet. Whether the open-weight models worth running at 120 billion parameters keep pace is a separate question, though the open-source field in 2026 has given local users more to work with than ever.

The Software Push Is as Aggressive as the Hardware

Nvidia did not stop at silicon. It is trying to remake the Windows experience around agents.

In partnership with Microsoft, the company introduced an OpenShell framework and what it calls "a new set of security primitives," guardrails meant to ensure that local agents and models can only reach the tools and data a user explicitly grants them. The pitch is that an agent should be able to set goals, call tools, evaluate its own output, and keep working on long tasks overnight while you are away, all without quietly gaining the run of your machine. Microsoft is expected to detail more of this agentic Windows vision at its upcoming Build conference.

The creative tools got attention too. Nvidia says it is working with Adobe to rebuild the core of Photoshop into a fully GPU-accelerated application for RTX Spark, with Premiere getting a similar overhaul that exposes Model Context Protocol controls so AI agents can drive the software directly. On the consumer side, Nvidia promises 100 FPS gaming at 1440p, helped by DLSS 4.5.

The hardware ships broadly. Nvidia expects more than 30 laptops and around 10 compact desktops from Dell, HP, Lenovo, Microsoft, Asus, and MSI, with the first systems arriving in the fall of 2026. The laptops will be thin, with OLED displays and all-day battery life, and Microsoft's own entry is expected under its Surface line.

The Other Side: Reasons for Caution

The skepticism writes itself, and it is worth taking seriously.

Pricing is the first unknown, because Nvidia did not announce it. The closest reference points are not encouraging for budget buyers. The DGX Spark desktop launched near $4,000 and has since crept toward $5,000, and an early N1 laptop board leaked with a $1,400 sticker. LPDDR5X memory and 3nm manufacturing are both expensive right now, which points toward prices well above those of a mainstream laptop.

The second caution is history. Windows on Arm has been declared the future before and stumbled on software compatibility each time, as one Microsoft veteran pointedly recalled this week. Nvidia is betting the agentic-AI era finally gives Arm on Windows a reason to win that earlier attempts lacked, but app compatibility and driver maturity are exactly where these platforms have historically bled.

Nvidia is also not alone in chasing on-device AI memory. At the same Computex, Intel detailed its Crescent Island AI GPU with up to 480GB of memory aimed at inference, a reminder that the local-AI hardware race is widening, not settling. And every performance figure so far comes from Nvidia's own slides. The "most efficient platform ever built" line will mean something only after someone outside the company measures it.

The Bottom Line

For years, "run it locally" meant accepting a smaller, weaker model or building a desktop that doubled as a space heater. Nvidia's pitch with the RTX Spark is that the compromise is ending: a 120-billion-parameter model with a million-token context, on battery, in a laptop, no API key required.

If the hardware delivers what the keynote promised, the calculus for a lot of ML work shifts. The default reflex to reach for a hosted endpoint weakens when the same model runs on the machine already in front of you, faster and private and effectively free per query after the upfront cost. That upfront cost, still unannounced, is the catch that could keep this aspirational for everyone but the well-funded.

Huang has been selling the agentic future from a data-center stage for two years. This fall, for the price of a premium laptop, he is offering to put a slice of it on your desk. The models, the prices, and the reviewers will decide whether anyone should take him up on it.

Sources

Nvidia unveils RTX Spark Superchip for laptops and desktop PCs at Computex 2026 — Tom's Hardware, Jeffrey Kampman (June 1, 2026)
Nvidia jumps into PCs with new Arm-based chip debuting in laptops from Microsoft, Dell, HP — CNBC (May 31, 2026)
Nvidia RTX Spark Superchip: Windows PC Chip With Full CUDA Stack Targets Dell, Microsoft This Fall — TechTimes (June 1, 2026)
Nvidia's Grace Blackwell superchips are officially coming to the PC with RTX Spark notebooks — The Register (June 1, 2026)
Nvidia N1X officially confirmed to arrive as the RTX Spark — Notebookcheck (June 1, 2026)
