NVIDIA Introduces 144-Core Grace CPU ‘Superchip’ for AI, HPC


NVIDIA dropped its attempted acquisition of Arm Ltd. earlier this year, but it remains an Arm technology licensee. And today, in a bid to take on the x86 architecture’s long dominance in data center server chips, the company announced its first Arm Neoverse-based data center CPU, which it described as “designed for AI infrastructure and high performance computing, providing the highest performance and twice the memory bandwidth and energy-efficiency compared to today’s leading server chips.”

The NVIDIA Grace CPU Superchip comprises two CPU chips connected coherently over NVLink-C2C, a new high-speed, low-latency, chip-to-chip interconnect. The CPU packs 144 Arm cores into a single socket, with estimated performance of 740 on the SPECrate2017_int_base benchmark.(1) NVIDIA said this is more than 1.5x higher than the dual-CPU configuration shipping with the DGX A100 today, as estimated in NVIDIA’s labs with the same class of compilers.(2)
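For readers who want to put the benchmark claim in perspective, the short Python sketch below works through the arithmetic implied by the quoted figures. It uses only the numbers in the paragraph above, which are NVIDIA's own estimates; the function names are illustrative, and the derived values are back-of-the-envelope figures, not published results.

```python
# Figures quoted in the article (NVIDIA estimates, not independent measurements).
GRACE_SPECRATE_INT = 740    # estimated SPECrate2017_int_base score
GRACE_CORES = 144           # Arm cores in one Grace CPU Superchip
SPEEDUP_LOWER_BOUND = 1.5   # "more than 1.5x" over the DGX A100's dual CPUs


def score_per_core(score: float, cores: int) -> float:
    """Average benchmark score attributable to each core."""
    return score / cores


def implied_baseline_ceiling(score: float, speedup: float) -> float:
    """Upper bound on the comparison system's score.

    If the new score is more than `speedup` times the baseline,
    the baseline must be below score / speedup.
    """
    return score / speedup


print(round(score_per_core(GRACE_SPECRATE_INT, GRACE_CORES), 2))                   # 5.14
print(round(implied_baseline_ceiling(GRACE_SPECRATE_INT, SPEEDUP_LOWER_BOUND), 1))  # 493.3
```

In other words, the claim implies the DGX A100's dual-CPU configuration scored somewhere below roughly 493 in NVIDIA's estimate, and the Superchip averages about 5.1 points per core.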

NVIDIA said the Grace CPU complements NVIDIA’s first CPU-GPU integrated module, the Grace Hopper Superchip, announced last year, designed to serve giant-scale HPC and AI applications in conjunction with an NVIDIA Hopper architecture-based GPU. Both chips share the same underlying CPU architecture, as well as the NVLink-C2C interconnect.

“A new type of data center has emerged — AI factories that process and refine mountains of data to produce intelligence,” said Jensen Huang, founder and CEO of NVIDIA. “The Grace CPU Superchip offers the highest performance, memory bandwidth and NVIDIA software platforms in one chip and will shine as the CPU of the world’s AI infrastructure.”

The company also touted the CPU’s energy efficiency and memory bandwidth, “with its innovative memory subsystem consisting of LPDDR5x memory with Error Correction Code for the best balance of speed and power consumption.” According to NVIDIA, the LPDDR5x memory subsystem delivers 1 terabyte per second, double the bandwidth of traditional DDR5 designs, while consuming “dramatically” less power: the entire CPU, including the memory, draws just 500 watts.
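The bandwidth and power figures can be combined into a rough efficiency number. The sketch below is a simple calculation under two stated assumptions: that "1 terabyte per second" means 1,000 GB/s (decimal units), and that the 500 watts covers the whole CPU plus memory, as the article says. The result is therefore bandwidth per watt of total module power, not a memory-only efficiency figure.

```python
# Figures quoted in the article (NVIDIA claims).
BANDWIDTH_GB_PER_S = 1000.0   # 1 TB/s, assuming decimal units (1 TB = 1000 GB)
MODULE_POWER_W = 500.0        # entire CPU including memory
DDR5_RATIO = 2.0              # "double the bandwidth of traditional DDR5 designs"


def bandwidth_per_watt(bandwidth_gb_s: float, power_w: float) -> float:
    """Memory bandwidth delivered per watt of total module power."""
    return bandwidth_gb_s / power_w


def implied_ddr5_bandwidth(bandwidth_gb_s: float, ratio: float) -> float:
    """Bandwidth of the DDR5 design NVIDIA is implicitly comparing against."""
    return bandwidth_gb_s / ratio


print(bandwidth_per_watt(BANDWIDTH_GB_PER_S, MODULE_POWER_W))    # 2.0 GB/s per watt
print(implied_ddr5_bandwidth(BANDWIDTH_GB_PER_S, DDR5_RATIO))    # 500.0 GB/s
```

Taken at face value, the claims work out to about 2 GB/s of memory bandwidth per watt of module power, against an implied DDR5 comparison point of roughly 500 GB/s.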

The Grace CPU Superchip is based on Armv9, Arm’s latest data center architecture, combining the highest single-threaded core performance with support for Arm’s new generation of vector extensions, the company said.

The Grace CPU Superchip will run all of NVIDIA’s computing software stacks, including NVIDIA RTX, NVIDIA HPC, NVIDIA AI and Omniverse. The Grace CPU Superchip along with NVIDIA ConnectX-7 NICs can be configured into servers as standalone CPU-only systems or as GPU-accelerated servers with one, two, four or eight Hopper-based GPUs, allowing customers to optimize performance for their specific workloads while maintaining a single software stack.

NVIDIA said it is working with HPC, supercomputing, hyperscale and cloud customers on the Grace CPU Superchip. Both it and the Grace Hopper Superchip are expected to be available in the first half of 2023.