Roll Over von Neumann? Samsung Claims Progress on the Compute-Memory Divide for HPC, AI


Samsung is claiming progress on the age-old compute-memory bottleneck inherent in the classical von Neumann computing architecture. The company has announced what it says is the industry’s first High Bandwidth Memory (HBM) integrated with artificial intelligence (AI) processing power: the HBM-PIM. According to Samsung, the processing-in-memory (PIM) architecture brings AI computing capabilities inside high-performance memory to accelerate large-scale processing in data centers, high-performance computing (HPC) systems and AI-enabled mobile applications.

According to Samsung, PIM takes on a key disadvantage of the von Neumann architecture, which uses separate processor and memory units to execute data processing tasks, a sequential processing approach that requires data to move back and forth between compute and memory. In data-intensive HPC and AI workloads, system-slowing bottlenecks are the inevitable result.
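To make the bottleneck concrete, here is a rough back-of-envelope model in Python. The bandwidth and throughput constants are illustrative assumptions for the sake of the sketch, not Samsung figures; the point is that for streaming AI operations doing little arithmetic per byte, the bus transfer time, not the compute time, sets the pace.

```python
# Back-of-envelope model of a memory-bound workload on a classical
# von Neumann machine: every operand must cross the CPU-memory bus.
# All constants below are illustrative assumptions, not Samsung figures.

BYTES_PER_ELEM = 4          # fp32 operand
BUS_BW = 400e9              # bytes/s, assumed HBM2-class bandwidth
COMPUTE = 20e12             # FLOP/s, assumed accelerator throughput

def runtime(n_elems, flops_per_elem):
    """Time = max(transfer time, compute time) for a streaming kernel."""
    t_mem = n_elems * BYTES_PER_ELEM / BUS_BW      # moving data in and out
    t_cmp = n_elems * flops_per_elem / COMPUTE     # doing the math
    return max(t_mem, t_cmp), t_mem, t_cmp

# Element-wise AI ops (e.g. activation functions) do roughly 1 FLOP per
# element, so the bus, not the ALUs, dominates: the von Neumann bottleneck.
total, t_mem, t_cmp = runtime(1e9, flops_per_elem=1)
print(f"bus time {t_mem*1e3:.1f} ms vs compute time {t_cmp*1e3:.3f} ms")
```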

But HBM-PIM places a DRAM-optimized AI engine inside each memory bank (a storage sub-unit), enabling parallel processing and minimizing data movement, according to Samsung. The company said that when applied to its existing HBM2 Aquabolt solution, the new architecture delivers more than double the system performance while cutting energy consumption by about 70 percent. The HBM-PIM also requires no hardware or software changes, allowing faster integration into existing systems.
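The per-bank idea can be illustrated with a toy sketch, which is in no way Samsung’s actual design: if each bank’s engine reduces its own slice of the data locally, only one partial result per bank crosses the bus instead of every element.

```python
# Toy sketch (not Samsung's design) contrasting two data paths for
# summing a large vector: a centralized pass that ships every element
# to the processor, versus per-bank partial sums computed "in memory"
# so only one value per bank crosses the bus.

import numpy as np

N_BANKS = 16                                # storage sub-units, illustrative
data = np.random.rand(N_BANKS, 1_000_000)   # one slice resident per bank

# Classical path: all N_BANKS * 1e6 elements travel to the CPU.
central_sum = data.reshape(-1).sum()

# PIM-style path: each bank's engine reduces its own slice locally,
# then only N_BANKS partial sums move to the host for the final add.
partials = [bank.sum() for bank in data]    # in-bank parallel work
pim_sum = sum(partials)

traffic_classical = data.size               # elements over the bus
traffic_pim = N_BANKS                       # one partial per bank
print(f"bus traffic cut by {traffic_classical / traffic_pim:,.0f}x")
assert np.isclose(central_sum, pim_sum)
```

Less data crossing the bus is also where the claimed energy savings would come from, since moving data off-chip typically costs far more energy than the arithmetic itself.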

“I’m delighted to see that Samsung is addressing the memory bandwidth/power challenges for HPC and AI computing,” said Rick Stevens, associate laboratory director for computing, environment and life sciences at Argonne National Laboratory. “HBM-PIM design has demonstrated impressive performance and power gains on important classes of AI applications, so we look forward to working together to evaluate its performance on additional problems of interest to Argonne National Laboratory.”

Karl Freund, founder and principal analyst at Cambrian-AI Research, added: “This is an innovative and needed step to remove some of the memory bandwidth and capacity limitations inherent in solving the large AI problems in front of us. It won’t be a cure-all, but Samsung is heading in the right direction.”

[Image: John von Neumann]

Samsung’s paper on the HBM-PIM has been selected for presentation at the virtual International Solid-State Circuits Conference (ISSCC), running through Feb. 22. HBM-PIM is now being tested inside AI accelerators by leading AI solution partners, with all validations expected to be completed within the first half of this year.

Mark Nossokoff, senior analyst at HPC industry watcher Hyperion Research, said he is conceptually impressed with Samsung’s announcement, while cautioning that Samsung has so far released only limited information about HBM-PIM.

“Anything that can be done to cost-effectively optimize and minimize data movement will be welcomed by everyone — users and data center managers alike,” he said. “What I find an intriguing innovation is integrating the AI processing element within the memory subsystem, particularly if that can be done with no changes to users’ code, as I believe they’re claiming. Integration and adoption of new technologies is great, but if users have to change their codes, then it’s going to be slow going and a hard row to hoe. But if that can be implemented and the performance claims are close to what they’re suggesting, I could see it being easily and rapidly adopted.”