In this interview, Mark Papermaster, CTO and EVP of Technology and Engineering at AMD, describes the company’s presence in the HPC space along with new trends in the industry. At a higher level, Mark also offers his views on the semiconductor industry in general, as well as the areas of innovation that AMD plans to cultivate. The discussion then turns to the exascale era of computing.
insideHPC: AMD has seen a remarkable resurgence in HPC in the past 18 months or so. What are the trends you are seeing in high performance computing?
Mark Papermaster: We are at a true inflection point in high performance computing. The compute and performance demand for traditional HPC workloads like oil and gas exploration, weather forecasting, simulation, modeling and other areas continues to grow rapidly. These workloads need more core engines and require those engines to be fed by high memory and I/O bandwidth. HPC is one of the foundational domains behind AMD’s resurgence. AMD EPYC™ server CPU products broke through prior industry limitations by providing 64 CPU cores in a socket with balanced memory and I/O capability to efficiently drive the most demanding applications. There is also a trend of emerging applications, including cloud applications, that need HPC capabilities. A great example is the area of self-reinforcement machine learning algorithms. This approach to AI is showing excellent gains in accuracy and demands HPC compute configurations with the most powerful CPUs and GPUs. Another example is the edge, where HPC analysis is now needed in certain use cases to run deep analytics close to the source of the data creation. These new applications challenge the industry to deliver much more computation capability with improved affordability. AMD has responded with product configurations and roadmaps to address these trends and requirements.
insideHPC: What kind of dynamics are you seeing in the semiconductor industry that are driving your path forward?
Mark Papermaster: As an industry, we face a dichotomy as the historical semiconductor node improvements of Moore’s Law slow, while new compute-intensive applications require exponentially more capability. Moore’s Law is the doubling of the number of transistors on a chip about every two years through ever smaller circuitry, producing greater performance and energy efficiency. In the past, each generation of semiconductor technology could be relied upon to enable the next generation of computer chips to be faster and lower power at the same relative cost. However, the laws of physics cannot be fooled. We have reached a level where the miniaturization of transistors is now bumping against physical limits. Future semiconductor technology nodes will still bring significant miniaturization and lower power over the next decade, but the costs are much higher and the historical improvements in speed are not going to be achieved. This dynamic will drive more performance gains to be achieved through heterogeneous solutions, including CPU with GPU and other accelerators.
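As a rough illustration of the doubling described above, the following minimal sketch computes the transistor-count growth factor implied by a fixed doubling period. The function name and the two-year default are illustrative assumptions taken from the "about every two years" phrasing; this is not an AMD model or a prediction of actual process scaling.

```python
def transistor_growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor after `years` if the count doubles every `doubling_period_years`.

    Assumption: an idealized, constant doubling period (hypothetical helper for illustration).
    """
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Print the idealized growth factor over a few time spans.
    for years in (2, 4, 10):
        print(f"{years:>2} years -> {transistor_growth_factor(years):.0f}x")
```

Under this idealized assumption, a decade corresponds to five doublings, or a 32-fold increase. That gives a sense of how much the industry gives up when the cadence slows.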
insideHPC: What innovation areas are you focused on for continuing AMD’s success?
Mark Papermaster: At AMD, we are driving a multi-pronged innovation approach. We will never lose sight of the need to drive our CPU and GPU roadmaps to deliver the best performance and efficiency gains. Equally, we are focused on a modular design approach that optimizes our IP blocks for the market segments we pursue, and on software enablement to ease the development of world-class solutions. The key innovation areas include a) microarchitectures in x86 CPU with the AMD “Zen” family and GPU with the AMD “Navi” family, b) a modular design approach using AMD Infinity Architecture, c) innovative packaging leveraging multi-chiplet approaches, silicon interposers that stack high bandwidth memory, and future “X3D” packaging that combines die stacking and chiplets, and d) an open software approach across our products that grows a robust ecosystem.
insideHPC: AMD will power not one, but two exascale supercomputers in the next couple of years. From your standpoint, what are the toughest challenges for delivering exascale performance to end users?
Mark Papermaster: The exascale era of computing is driven by two large use cases in the data center: machine learning and HPC. We went head-to-head with the competition and AMD was chosen to power the next two generations of what are predicted to be the world’s most powerful supercomputers – Frontier from Oak Ridge National Lab and El Capitan from Lawrence Livermore National Lab.
The innovations to achieve this level of computation are foundational and drive our leadership investments for the long term. Two of the toughest challenges for delivering exascale performance to end users are improving power-efficiency and enhancing programmer productivity. Overcoming these challenges requires very efficient CPUs and GPUs, along with software that effectively utilizes the underlying hardware and simplifies parallel programming.
Exascale computing will enable end users to reduce their time to solution and solve previously intractable problems. This requires efficient interactions between hardware, system software, and applications. To facilitate these interactions and boost productivity, AMD’s third generation Infinity Architecture will provide very high-bandwidth, coherent unified memory between AMD EPYC CPUs and Radeon™ Instinct™ GPUs. The Radeon Open Compute platform (ROCm) from AMD will efficiently use these underlying technologies to deliver exascale performance to end users of machine learning and HPC applications.
insideHPC: The demand for HPC will not stop at exascale. Given the state of Moore’s Law, how do you approach the problem of delivering what comes after exascale?
Mark Papermaster: The process technology, architecture and packaging improvements I mentioned above helped to drive the industry to exascale computing and still have a role to play. However, there will be no shortage of innovative approaches to continue the exceptional growth of computing capabilities. I will take a moment to discuss three of them.
One of the more promising technologies now being explored is integrated photonics. Computers today move electrons across metal wires, and with that comes resistance, leakage, latency, and heat. All of these can limit performance. Photonics, which transmits photons instead of electrons, offers the possibility of overcoming some of these obstacles. Computers incorporating photonics use light to transfer information and electronics to process it. Chip-scale integrated photonics would make the technology vastly more accessible and has the potential for large performance gains.
Quantum computing holds tremendous promise for certain use cases but likely will not be a replacement for general purpose digital computing. Instead, quantum computers are expected to work together with traditional computing technologies and serve as accelerators for specific computing challenges, such as computational chemistry and weather forecasting.
Also on the future horizon is neuromorphic computing, which includes the production and use of artificial neural networks and other techniques that mimic how the brain performs its functions. These functions include making decisions, memorizing information, reasoning and deducing facts. Neuromorphic computing holds significant potential to reduce energy use and improve performance for applications ranging from anomaly detection to medical diagnosis.
insideHPC: You are very generous with your time, participating in a host of events like SC19 and the Dell HPC Community meetings. Why is it so important for you to engage with HPC users?
Mark Papermaster: As the AMD CTO, I seek to engage directly with the most demanding compute users in the industry. HPC is the trendsetting sector where users have a deep knowledge of their workloads, hardware and software bottlenecks, and the architectures that can best drive computation to the highest level. I truly value the feedback of the HPC community, hearing the needs of experts, and sharing AMD’s high-performance computing vision.
Personally, I have never been more excited than now about the opportunities directly in front of us in HPC. There are so many advances that will benefit how the world does business, finds new medical treatments and cures, deals with climate change and communicates and interacts with one another. This is a truly disruptive and exciting time as we enter the exascale era of computing. New possibilities will emerge that we have not even thought of yet.
integrated photonics…
Quantum computing …
future “X3D” …
neuromorphic computing …
I don’t recall seeing any big contributions coming out of AMD on these, but Intel sometimes pops up with news articles on these from their research labs.
You have to give AMD credit for getting the most out of the infinity fabric, but I get the impression the sprawling chiplet design is approaching its limits when I see the future X3D comments.
Norrod made similar comments in 2019 at the Rice Oil and Gas HPC conference, along with pessimistic projections concerning clock rates, which were quoted in the Tom’s Hardware article of March 17, 2019.
“… the two simple levers of density and frequency improvements have reached a diminishing point of returns. In some cases, frequency is even regressing.”