
Xiandai · Wednesday, June 3, 2026 · Updated 10:17 PM UTC

Intel Unveils High-Density Xeon Rack Designs for Agentic AI Workloads

Intel debuted rack-scale reference designs at Computex 2026 that pack up to 36,864 CPU cores into a 100 kW power envelope to support large-scale agentic AI.

Alex Chen


Intel high-density Xeon rack designs displayed at Computex 2026.

Intel introduced new rack-scale reference designs at Computex 2026, partnering with Foxconn and other infrastructure providers to raise CPU compute density for agentic AI. The blueprints are built to host the software frameworks, such as OpenClaw, that connect AI models to terminal shells, code interpreters, and external APIs. While the models themselves typically run on GPUs, Intel CEO Lip-Bu Tan said the CPU-centric designs address a critical gap in the infrastructure stack.

“Our customers are asking us to think at the system level to help them serve real agentic workloads at scale,” Tan said during the keynote. The reference designs are configurable and support up to 128 processors per rack. Depending on the chip chosen, a rack can house either 128-core Granite Rapids Xeon 6 processors or 288-core Clearwater Forest Xeon 6+ chips, which are built on Intel’s 18A process.

At maximum capacity, a rack provides either 16,384 P-cores (128 Granite Rapids chips) or 36,864 E-cores (128 Clearwater Forest chips) within a 100 kW power envelope. The systems also support up to 384 TB of DDR5 memory. The move places Intel in direct competition with Nvidia, which recently announced its own rack-scale CPU platform featuring 256 Vera CPUs, and with Arm, which has introduced reference designs ranging from 8,160 cores in a 36 kW air-cooled system to 45,696 cores in a 200 kW liquid-cooled rack.
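
The density figures follow from simple multiplication, and the short Python sketch below reproduces them from the numbers quoted in this article. The watts-per-core value is an illustrative assumption, not an Intel or Arm specification: it divides the whole rack power envelope by the core count, so it also absorbs memory, networking, and cooling overhead.

```python
# Figures quoted in the article. Function and variable names are illustrative.
MAX_SOCKETS = 128       # processors per Intel rack at maximum capacity
INTEL_RACK_KW = 100     # quoted power envelope

CORES_PER_CHIP = {
    "Granite Rapids Xeon 6 (P-cores)": 128,
    "Clearwater Forest Xeon 6+ (E-cores)": 288,
}

# Arm reference designs as quoted: (total cores, rack power in kW)
ARM_RACKS = {
    "Arm air-cooled": (8_160, 36),
    "Arm liquid-cooled": (45_696, 200),
}


def rack_cores(sockets: int, cores_per_chip: int) -> int:
    """Total cores in a rack that is fully populated with one chip type."""
    return sockets * cores_per_chip


def watts_per_core(rack_kw: float, cores: int) -> float:
    """Rack-level power budget per core (assumption: whole envelope / cores)."""
    return rack_kw * 1000 / cores


if __name__ == "__main__":
    for name, per_chip in CORES_PER_CHIP.items():
        cores = rack_cores(MAX_SOCKETS, per_chip)
        print(f"{name}: {cores:,} cores, "
              f"{watts_per_core(INTEL_RACK_KW, cores):.2f} W/core")
    for name, (cores, kw) in ARM_RACKS.items():
        print(f"{name}: {cores:,} cores, {watts_per_core(kw, cores):.2f} W/core")
```

On this rough measure, the E-core configuration has the smallest rack-level budget per core of the designs listed, which is consistent with its higher core count in the same 100 kW envelope.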

Beyond the high-density rack designs, Intel is advancing a disaggregated inference blueprint developed in collaboration with SambaNova. This architecture is designed to optimize inference by separating compute-heavy prefill operations, which are offloaded to Nvidia GPUs, from bandwidth-intensive decode operations handled by SambaNova accelerators. Intel claims this tiered approach can increase per-user token output by two to three times compared to traditional setups.
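
The prefill/decode split can be pictured with a toy Python sketch. This is not Intel's or SambaNova's software: `Request`, `prefill`, `decode`, and `serve` are hypothetical names, the "KV cache" is a plain list standing in for attention state, and the generated tokens are placeholders rather than model output.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    prompt: list[str]
    max_new_tokens: int
    kv_cache: list[str] = field(default_factory=list)
    output: list[str] = field(default_factory=list)
    trace: list[str] = field(default_factory=list)  # which tier ran, in order


def prefill(req: Request) -> Request:
    """Prefill tier (hypothetical): process the whole prompt in one pass."""
    req.kv_cache = list(req.prompt)
    req.trace.append("prefill")
    return req


def decode(req: Request) -> Request:
    """Decode tier (hypothetical): emit one token per step, growing the cache."""
    for step in range(req.max_new_tokens):
        token = f"<tok{step}>"  # placeholder; a real model would sample here
        req.output.append(token)
        req.kv_cache.append(token)
        req.trace.append("decode")
    return req


def serve(req: Request) -> Request:
    # The handoff between tiers is the cache built during prefill.
    return decode(prefill(req))


if __name__ == "__main__":
    done = serve(Request(prompt=["list", "the", "files"], max_new_tokens=2))
    print(done.trace, done.output)
```

The point of the sketch is the shape of the pipeline: one bulk pass over the prompt, then many small sequential steps, which is why the two phases can be assigned to different hardware.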

Commercial adoption of the disaggregated inference platform is already underway, with Vector Core Compute identified as an early deployer and Together.AI confirmed as the first commercial customer. Intel intends to make systems based on these new reference blueprints widely available through its established network of original design manufacturer (ODM) and original equipment manufacturer (OEM) partners.
