Intel unveils 36,864-core rack for agentic inference

At Computex 2026, Intel unveiled rack-scale reference designs that, The Register reports, can host up to 128 of either its 128-core Granite Rapids Xeon 6 or 288-core Clearwater Forest CPUs, producing between 16,384 and 36,864 cores and supporting up to 384 TB of DDR5 inside a 100kW rack. Intel's press release (Business Wire / Intel newsroom) describes the initiative as a new rackscale AI infrastructure built with Intel Xeon processors and SambaNova SN-50 RDUs, and names partners including Foxconn, Siemens, and Hitachi. The Register reports that Intel and SambaNova's earlier disaggregated inference blueprint, which offloads prefill work to Nvidia GPUs and uses SambaNova accelerators for decode, has Vector Core Compute as an early deployer and Together.AI as the first commercial customer. The Register quotes Intel CEO Lip-Bu Tan: "Our customers are asking us to think at the system level to help them serve real agentic workloads at scale."
What happened
Intel announced new rack-scale reference designs at Computex 2026 for inference and agentic workloads, according to Intel's press release (Business Wire / Intel newsroom) and live reporting in The Register. The Register reports the two reference designs target latency-sensitive agentic workloads and maximum-density deployments and support up to 128 of either 128-core Granite Rapids Xeon 6 or 288-core Clearwater Forest CPUs, producing between 16,384 and 36,864 cores and up to 384 TB of DDR5 memory inside a 100kW power envelope. Intel's release lists ecosystem partners including SambaNova, Foxconn, Siemens, Hitachi, Echo Neurotechnologies, and Greenstone Biosciences (Business Wire). The Register quotes Intel CEO Lip-Bu Tan: "Our customers are asking us to think at the system level to help them serve real agentic workloads at scale," and The Register reports Tan expects systems based on these reference designs to be broadly available from ODM and OEM partners.
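The core counts quoted above follow directly from the socket count, and the same arithmetic gives rough per-socket memory and power shares. The short Python sketch below reproduces the figures; the even split of DDR5 and power across 128 sockets is an assumption made for illustration, not something the reporting states, and the per-socket power figure is a share of the whole rack budget rather than a CPU power rating.

```python
# Back-of-the-envelope check of the rack figures reported by The Register.
SOCKETS_PER_RACK = 128   # "up to 128" CPUs per rack
RACK_DDR5_TB = 384       # "up to 384 TB of DDR5"
RACK_POWER_KW = 100      # "100kW" rack envelope

CORES_PER_CPU = {
    "Granite Rapids Xeon 6": 128,
    "Clearwater Forest": 288,
}


def rack_cores(cores_per_cpu: int, sockets: int = SOCKETS_PER_RACK) -> int:
    """Total cores in a fully populated rack."""
    return cores_per_cpu * sockets


totals = {name: rack_cores(cores) for name, cores in CORES_PER_CPU.items()}

# Assumption: memory and power are divided evenly across all sockets.
ddr5_tb_per_socket = RACK_DDR5_TB / SOCKETS_PER_RACK
rack_watts_per_socket = RACK_POWER_KW * 1000 / SOCKETS_PER_RACK

if __name__ == "__main__":
    for name, cores in totals.items():
        print(f"{name}: {cores:,} cores per rack")
    print(f"DDR5 per socket: {ddr5_tb_per_socket} TB")
    print(f"Rack power budget per socket: {rack_watts_per_socket} W")
```

Under that even-split assumption, each socket would have 3 TB of DDR5 and roughly 780 W of the rack budget, which also has to cover memory, networking, and cooling overhead.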
Technical details
Editorial analysis - technical context: public reporting describes two complementary hardware targets, one optimized for latency-sensitive orchestration and one for maximum CPU density. The dense design quoted by The Register emphasizes a high core count and large DDR5 capacity in a constrained 100kW envelope, which raises familiar engineering tradeoffs around cooling, memory locality, and interconnect bandwidth.
Industry-pattern observations: reporters note the announcement sits alongside other vendor moves. The Register compares Intel's rack to Nvidia's rack-scale CPU platform built on Vera CPUs and to Arm rack designs that it reports in air-cooled and liquid-cooled variants. ServeTheHome's Computex coverage frames the CPU-focused push as part of Intel's response to a data center market that had been dominated by GPUs for training (ServeTheHome).
Disaggregated inference and early adoption
SambaNova and Intel have previously described a disaggregated inference blueprint that offloads compute-intensive prefill operations to Nvidia GPUs while using SambaNova RDUs for bandwidth-intensive decode operations; The Register reports that approach can boost per-user token output by 2-3x. Intel's press materials identify a new entrant, Vector Core Compute, a purpose-built enterprise inference cloud using Intel Xeon processors, SambaNova RDUs, and Nvidia Blackwell GPUs (Business Wire). The Register reports Together.AI as Vector Core Compute's first commercial customer.
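The prefill/decode split described above can be illustrated with a toy Python sketch. It is not Intel's or SambaNova's software: `Request`, `prefill`, `decode`, and `serve` are hypothetical names, the "model" is a dummy arithmetic rule, and the cache handoff that a real system would perform over an interconnect is just a Python list here.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    prompt_tokens: list[int]
    max_new_tokens: int
    kv_cache: list[int] = field(default_factory=list)  # stand-in for a real KV cache
    output: list[int] = field(default_factory=list)


def prefill(req: Request) -> Request:
    """Compute-heavy phase: process the whole prompt in one pass.

    In the blueprint this phase runs on GPUs; here it only fills the cache.
    """
    req.kv_cache = list(req.prompt_tokens)
    return req


def decode(req: Request) -> Request:
    """Bandwidth-heavy phase: generate one token per step, re-reading the cache.

    In the blueprint this phase runs on RDUs; the token rule is a placeholder.
    """
    for _ in range(req.max_new_tokens):
        next_token = sum(req.kv_cache) % 1000  # dummy stand-in for a model
        req.output.append(next_token)
        req.kv_cache.append(next_token)
    return req


def serve(req: Request) -> Request:
    """Host-side orchestration: prefill on one pool, then hand off to decode."""
    return decode(prefill(req))


if __name__ == "__main__":
    done = serve(Request(prompt_tokens=[1, 2, 3], max_new_tokens=2))
    print(done.output)
```

The point of the sketch is the shape of the pipeline: the two phases touch the same cache but have different resource profiles, which is why the blueprint assigns them to different hardware.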
Context and significance
The announcements reflect a broader pattern in which "agentic" AI workloads increase demand for high-throughput, low-latency CPU orchestration at scale rather than raw training GPU capacity. Observers covering the sector have noted a shifting balance between accelerators and host CPU resources as production inference and multi-component agent stacks proliferate (ServeTheHome, The Register, Intel press release).
What to watch
For practitioners: monitor three observable indicators. First, availability and SKUs from OEM/ODM partners named by Intel, which will determine procurement timelines and supported form factors. Second, independent performance and efficiency benchmarks that compare dense Xeon racks against GPU-heavy and RDU-augmented disaggregated inference configurations. Third, early customer deployments from Vector Core Compute and Together.AI for real-world throughput, latency, and cost-per-token measurements; these rollouts will provide the clearest signals about practical tradeoffs for agentic inference.
Editorial analysis: while vendor reference designs set architectural intent, the operational value for production workloads will depend on measured throughput per watt, software stack maturity for multi-node CPU orchestration, and integration of disaggregated decode pipelines. Independent testing and vendor-neutral benchmarks will be essential for practitioners evaluating dense-CPU vs accelerator-led deployment models.
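For readers who want to compare deployments on the metrics named above, the following Python sketch shows one way to derive throughput, throughput per watt, and cost per token from a single benchmark run. The `RunMeasurement` fields and the sample numbers are hypothetical placeholders, not measurements from any of the systems discussed.

```python
from dataclasses import dataclass


@dataclass
class RunMeasurement:
    tokens_generated: int    # total output tokens in the run
    wall_seconds: float      # wall-clock duration of the run
    avg_power_watts: float   # average power draw over the run
    hourly_cost_usd: float   # all-in hourly cost of the hardware


def tokens_per_second(m: RunMeasurement) -> float:
    return m.tokens_generated / m.wall_seconds


def tokens_per_joule(m: RunMeasurement) -> float:
    """Throughput per watt: (tokens/s) / W, which equals tokens per joule."""
    return m.tokens_generated / (m.avg_power_watts * m.wall_seconds)


def usd_per_million_tokens(m: RunMeasurement) -> float:
    run_cost = m.hourly_cost_usd * m.wall_seconds / 3600
    return run_cost / m.tokens_generated * 1_000_000


if __name__ == "__main__":
    # Placeholder numbers chosen only to make the arithmetic easy to follow.
    sample = RunMeasurement(
        tokens_generated=3_600_000,
        wall_seconds=3600.0,
        avg_power_watts=1000.0,
        hourly_cost_usd=2.0,
    )
    print(tokens_per_second(sample))
    print(tokens_per_joule(sample))
    print(usd_per_million_tokens(sample))
```

Computing all three from the same run keeps the comparison consistent across dense-CPU, GPU-heavy, and disaggregated configurations, provided the power and cost figures cover the same set of components in each case.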
Scoring Rationale
Notable infrastructure announcement: dense Xeon racks and a vendor-backed disaggregated inference blueprint matter to practitioners planning inference deployments. The impact depends on independent benchmarks and OEM availability.
