AMD’s Processor Portfolio and the ‘Pervasive AI’ Landscape

[SPONSORED GUEST ARTICLE] AI is everywhere: in our homes, cars and jobs, in our healthcare and our entertainment. The organizations we work for are increasingly AI-driven. Geopolitically, the combination of AI and supercomputing is central to national competitiveness and regional security.

From a compute perspective, AI everywhere requires a variety of processors to make it all happen: CPUs, GPUs, adaptable FPGAs and other accelerators across a range of capabilities, form factors and environmental tolerances. It requires chips for training the most complex of neural networks and chips for inferencing data in real time at the ruggedized edge. There are chips for desktop AI, for AI in the cloud and for AI in your handheld device. Putting together coherent AI deployments across this multi-layered landscape is an enormous challenge, stretching the very notion of heterogeneous computing.

AMD’s “Pervasive AI” strategy is designed to address these multifaceted demands with a single-vendor, endpoint-to-edge processor portfolio. CEO Lisa Su spoke on this theme in her keynote address at the Consumer Electronics Show in Las Vegas in January.

“Virtually every product, every service, every experience in our lives is powered by semiconductors,” she said. “Whether you’re talking about cloud services, or how we work, game and connect, chips have become a critical enabler of everything in our modern life. And with the growth of AI across all of these applications, the technology is becoming even smarter and more sophisticated, every single day.”

Emblematic of this strategy, the company used CES to discuss its next-gen AMD Instinct™ MI300 Series accelerators, which include a product integrating its data center CPU and GPU processors. AMD EPYC™ CPUs are already present in a growing number of systems at the top of the TOP500 list of the world’s most powerful supercomputers, and AMD Instinct MI250X GPU accelerators (along with EPYC 7003 series CPUs) power Frontier, the world’s no. 1 system and first exascale-class supercomputer. They also drive LUMI, the world’s no. 3 system, housed at the CSC supercomputing center in Finland, which is among the greenest supercomputers in the world and a leading platform for AI. Now, with the integration of AMD CPU-GPU technologies, the MI300 is poised to drive continued AMD leadership in HPC and AI performance.

AMD MI300 (source: AMD)

Within weeks of Su’s CES address, AMD Senior Director of Data Center AI and Compute Marketing Nick Ni took up the theme in a keynote at the World AI Cannes Festival in France. He delivered it with Thomas Wolf, co-founder and chief science officer of Hugging Face, which develops tools for building applications using machine learning.

In the keynote, entitled “A Breakthrough in Pervasive AI,” Ni said that “because AI models continue to accelerate innovation, a one-size-fits-all hardware architecture is not practical.”

He cited an edge example: “To maximize the compute utilization, an ideal edge device contains both AI engines and adaptable hardware. This combination will offer the state-of-the-art AI horsepower while allowing DSA (domain specific architectures) hardware to be programmed to efficiently run the specific AI tasks. While AI innovation accelerates rapidly, AI hardware must adapt efficient DSAs to meet the challenging requirements.”

AMD, Ni said, “…is a unique company that has all the offerings from the PC, to the embedded to the data center CPUs to data center accelerators, all the way to GPUs.”

This page provides an overview of AMD processors, from servers to workstations, from laptops to desktops, for embedded and semi-custom implementations. And this page looks at AMD Instinct GPU accelerators. As Ni said, other chip vendors offer pieces along the AI deployment spectrum, but no one else delivers as comprehensive a portfolio.

What about porting AI applications to AMD compute environments? For several years, AMD has invested heavily in software, including ROCm™, AMD’s open software stack for GPU programming. ROCm spans several domains: general-purpose computing on graphics processing units, HPC and AI, and heterogeneous computing. It offers several programming models: HIP, OpenMP/Message Passing Interface and OpenCL.

In his keynote at Cannes, Ni cited the example of migrating AI models from NVIDIA GPUs to the AMD Alveo V70 accelerator card for AI inference. Ni said the process involves just a few lines of code change, as shown here:
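The code from Ni’s presentation is not reproduced in this text. As a rough illustration only, the Python sketch below shows one common reason a migration can come down to a few changed lines: the hardware target is chosen in a single place, and the rest of the application calls a target-neutral function. Every name in it (Backend, BACKENDS, TARGET, run_inference, "target_a", "target_b") is hypothetical, the "model" is a stand-in arithmetic function, and nothing here uses or represents the API of any NVIDIA or AMD library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a compiled model: takes a batch, returns scores.
Runner = Callable[[List[float]], List[float]]


@dataclass(frozen=True)
class Backend:
    """A named hardware target plus the callable that runs the model on it."""
    name: str
    runner: Runner


def _reference_runner(batch: List[float]) -> List[float]:
    # Placeholder "model": y = 2x + 1. A real deployment would call a
    # vendor runtime here; that part is deliberately left out.
    return [2.0 * x + 1.0 for x in batch]


# Assumption: both targets run the same model and give the same results.
BACKENDS: Dict[str, Backend] = {
    "target_a": Backend("target_a", _reference_runner),
    "target_b": Backend("target_b", _reference_runner),
}

# The one line an application would edit when moving between targets.
TARGET = "target_a"


def run_inference(batch: List[float], target: str = TARGET) -> List[float]:
    """Run the placeholder model on the selected (hypothetical) target."""
    backend = BACKENDS[target]  # raises KeyError for an unknown target
    return backend.runner(batch)


if __name__ == "__main__":
    print(run_inference([1.0, 2.0]))              # default target
    print(run_inference([1.0, 2.0], "target_b"))  # migrated target
```

In a layout like this, application code that calls run_inference stays the same when the target changes; only the TARGET value and the backend registration differ.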

With a far-reaching semiconductor product portfolio and an increasingly sophisticated software support environment, AMD is building out a pervasive AI infrastructure to enable the oncoming AI everywhere era.