Conventional Wisdom Watch: Matsuoka & Co. Take on 12 Myths of HPC

A group of HPC thinkers, including the estimable Satoshi Matsuoka of the RIKEN Center for Computational Science in Japan, has come together to challenge common lines of thought that they say have become, to varying degrees, accepted wisdom in HPC.

In a paper titled “Myths and Legends in High-Performance Computing,” appearing this week on the arXiv site, Matsuoka and four colleagues (three from the RIKEN Center – see author list below) offer opinions and analysis on such issues as quantum replacing classical HPC, the zettascale timeline, disaggregated computing, domain-specific languages (DSLs) vs. Fortran, and cloud subsuming HPC, among other topics.

“We believe (these myths and legends) represent the zeitgeist of the current era of massive change, driven by the end of many scaling laws, such as Dennard scaling and Moore’s law,” the authors said.

In this way they join the growing “end of” discussions in HPC. For example, as the industry moves through 3nm, 2nm, and 1.4nm chips – then what? Will accelerators displace CPUs altogether? What’s next after overburdened electrical I/O interconnects? How do we get more memory per core?

A theme of the paper is trade-offs, the notion that all supercomputer designs are a balancing act between performance, price and feasibility. It’s the lack of nuanced, trade-off thinking, the authors contend, that leads to HPC myths and legends.

“Simplistic arguments along the lines of ‘we need more of X’ seem to have a solid tradition in our community,” the authors argued. “For example, the HPC community spent the first decades to hunt more floating point computations per second.”

But now, with chips sitting idle waiting for data, faster memory has come into focus.

“The community nearly made a complete 360-degree turn,” the authors said, “with Haus (2021) saying ‘computation is free’ and Ivanov et al. (2021) showing ‘data movement is all you need’. Some even argue that this turn was taken too late due to the fixation on flop/s.”

The authors' response? The discussion, they said, “should really be about the intricate relationship between the application requirements and the system capabilities in terms of balance, i.e., ratio between the different resources such as memory size/bandwidth and compute… These ratios usually shift with chip technology and architectural choices… I/O complexity analysis is a tool to deeply understand (these) trade-off(s). Once all trade-offs are understood, requirements models… could be used to fix trade-offs into designs.”
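
To make the balance argument concrete, here is a minimal sketch in the spirit of the roofline model (a framing borrowed for illustration, not a term from the paper). The peak-compute and bandwidth figures are assumed, illustrative numbers rather than the specs of any real chip.

```python
# Minimal balance/roofline sketch: compare a kernel's arithmetic intensity
# (flop per byte moved) against the machine's compute-to-bandwidth ratio.
# The hardware numbers below are illustrative assumptions, not real specs.

PEAK_FLOPS = 50e12   # assumed peak compute: 50 Tflop/s
PEAK_BW    = 2e12    # assumed peak memory bandwidth: 2 TB/s
MACHINE_BALANCE = PEAK_FLOPS / PEAK_BW   # flop per byte the chip can sustain

def attainable_flops(arithmetic_intensity):
    """A kernel is limited by whichever resource it exhausts first:
    memory bandwidth or peak compute."""
    return min(PEAK_FLOPS, arithmetic_intensity * PEAK_BW)

# Example kernel: a daxpy-like update does ~2 flops per 24 bytes moved.
ai = 2 / 24
print(f"machine balance: {MACHINE_BALANCE:.1f} flop/byte")
print(f"kernel intensity: {ai:.3f} flop/byte -> "
      f"{attainable_flops(ai) / 1e12:.2f} Tflop/s attainable "
      f"({'memory' if ai < MACHINE_BALANCE else 'compute'}-bound)")
```

Shifting either the hardware ratio or the kernel's intensity (by blocking, fusion or recomputation) moves the result along the kind of trade-off the authors describe.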

Here are the myths identified by the authors along with pertinent quotes from the paper:

Myth 1: Quantum Computing Will Take Over HPC!
“The whole IT industry is following the quantum trend and conceives quickly growing expectations. The actual development of quantum technologies, algorithms, and use cases is on a very different time-scale. Most practitioners would not expect quantum computers to outperform classical computers within the next decade. Yet, we have constantly been surprised by advances in device scaling as well as, more recently, artificial intelligence. Thus, the fear of missing out on getting rich is driving the industry to heavily invest in quantum technologies pushing the technology forward.”

Myth 2: Everything Will Be Deep Learning!
“…there has been a plethora of papers (about) replacing traditional simulation methods, or whole computational kernels with data-driven models… Impressive results fire up expectations … There is no doubt that deep learning models can learn to approximate complex functions used in scientific simulations in a specific input domain. The issue is, as always, the tradeoffs: between speed on one hand, and accuracy on the other ….”
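
The domain-of-validity caveat in that trade-off can be shown with a toy surrogate; here a simple polynomial fit stands in for a learned model, and the target function and training range are arbitrary assumptions chosen only to make the point.

```python
# Toy surrogate: fit a cheap model to samples of an "expensive" function,
# then check its accuracy inside and outside the training domain.
# A polynomial fit stands in for a deep-learning model here.

import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 2 * np.pi, 200)    # inputs seen during "training"
y_train = np.sin(x_train)                     # ground truth from the costly model

coeffs = np.polyfit(x_train, y_train, deg=7)  # the data-driven surrogate

x_in  = np.linspace(0.0, 2 * np.pi, 1000)        # inside the training domain
x_out = np.linspace(2 * np.pi, 4 * np.pi, 1000)  # outside it

err_in  = np.max(np.abs(np.polyval(coeffs, x_in)  - np.sin(x_in)))
err_out = np.max(np.abs(np.polyval(coeffs, x_out) - np.sin(x_out)))
print(f"max error inside training domain:  {err_in:.1e}")
print(f"max error outside training domain: {err_out:.1e}")
# Fast and accurate where it was trained, potentially wildly wrong elsewhere:
# the speed/accuracy (and validity-domain) trade-off in miniature.
```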


Myth 3: Extreme Specialization as Seen in Smartphones Will Push Supercomputers Beyond Moore’s Law!
“The success of GPUs, growing demands for lower power and highest performance, and the end of Moore’s law created a myth that future supercomputer architectures will be just like smartphones in that there will be multitudes of hardware customization per each facet of the entire workload. However, such a claim misses the point in the analogy, and entirely ignores multiple drawbacks of such an approach….”

Myth 4: Everything Will Run on Some Accelerator!
Will “… some superchip such as GPUs largely replace the CPUs, the latter be degraded to second class citizens? It is not as trivial as it may seem, as such statements are rather dogmatic and not based on candid analysis of the workloads. By proper analysis of the workloads, we may find that CPUs may continue to play a dominant role, with accelerator being an important but less dominant sidekick.”
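
The “candid analysis of the workloads” the authors ask for often starts with Amdahl’s-law arithmetic: overall speedup is capped by the share of time actually spent in accelerated kernels. The fractions and the 50x kernel speedup below are purely illustrative assumptions.

```python
# Amdahl's-law sketch of the CPU-vs-accelerator question. The offloaded
# fractions and the 50x kernel speedup are illustrative assumptions.

def overall_speedup(accel_fraction, accel_speedup):
    """Whole-workload speedup when only a fraction of it is accelerated."""
    return 1.0 / ((1.0 - accel_fraction) + accel_fraction / accel_speedup)

for frac in (0.5, 0.8, 0.95):
    print(f"{frac:.0%} of runtime offloaded, 50x faster kernels "
          f"-> {overall_speedup(frac, 50):.1f}x overall")
# Even a 50x accelerator delivers only ~2x overall if half the runtime stays
# on the CPU, which is why the CPU can remain the dominant partner.
```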

Myth 5: Reconfigurable Hardware Will Give You 100X Speed-up!
“Whether FPGAs can replace or complement the mainstream GPUs in the HPC and data center market hinges on questions regarding the cost-to-performance ratio, an existing software ecosystem, and most importantly the productivity of programmers. Unfortunately, we see hurdles in all these areas, which the community and industry might be able to solve with enough time and money.”

Myth 6: We Will Soon Run at Zettascale!
“…our more realistic, yet optimistic, timeline for zetta is zettaop/s in 2032 at 50 MW, zettaflop/s in 2037 at 200 MW, and zettascale by 2038. Can Intel or anybody else pull it off before then? Only time will tell.”
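
The power figures in that timeline imply a concrete energy-efficiency target, which a bit of back-of-envelope arithmetic makes explicit. The roughly 65 Gflop/s-per-watt reference point is an approximate figure for the most efficient systems of the early 2020s, not a number taken from the paper.

```python
# What does "zettaflop/s in 2037 at 200 MW" imply for energy efficiency?
# The ~65 Gflop/s-per-watt baseline is an approximate early-2020s figure
# for the most power-efficient large systems (an outside assumption).

ZETTA   = 1e21      # flop/s
POWER_W = 200e6     # 200 MW system power budget

required = ZETTA / POWER_W   # flop/s per watt needed
baseline = 65e9              # ~65 Gflop/s per watt today

print(f"required efficiency: {required / 1e12:.0f} Tflop/s per watt")
print(f"improvement over the ~65 Gflop/s/W baseline: {required / baseline:.0f}x")
```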

Myth 7: Next-Generation Systems Need More Memory per Core!
“Newly emerging optical off-chip connectivity…, as well as 3D integrated memory… shifts the balance again and may alleviate many of these aspects, at least at the scale of a single chip. It seems key to understand the malleability of applications, i.e., which resources can be traded for which other resources (e.g., memory capacity for computation bandwidth using recomputation or caching as techniques).”
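
The “malleability” the authors point to, trading memory capacity against computation, can be sketched in a few lines: the same quantity can either be cached or recomputed on demand. The expensive_term function below is a hypothetical stand-in for a real kernel.

```python
# Trading memory for compute: cache a costly value (spend capacity) or
# recompute it on every use (spend flops). expensive_term is a hypothetical
# stand-in for a real per-element kernel.

from functools import lru_cache
import math

def expensive_term(i):
    """Hypothetical costly per-element computation."""
    return math.exp(math.sin(i) ** 2)

def total_recompute(n, passes):
    """Recompute every time: no extra memory, more arithmetic."""
    return sum(expensive_term(i) for _ in range(passes) for i in range(n))

@lru_cache(maxsize=None)
def cached_term(i):
    return expensive_term(i)

def total_cached(n, passes):
    """Cache results: O(n) extra memory, arithmetic done only once."""
    return sum(cached_term(i) for _ in range(passes) for i in range(n))

assert total_recompute(1000, 5) == total_cached(1000, 5)
# Which variant is "better" depends on whether the machine is short on
# memory capacity or on compute, which is the ratio question raised above.
```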

Myth 8: Everything Will Be Disaggregated!
“To stop the waste of memory resources, the academic community is advancing on the Silicon Photonics front … and industry is pursuing scale-out technologies…, such as Compute Express Link (CXL), a cache-coherent interconnect for data centers. But a few players seem to push the idea over the edge with their plans to disaggregate everything… Generally, we see two remaining challenges for a broad adoption of Silicon Photonics and all-optical interconnects: low-cost manufacturing and optical switching.”

Myth 9: Applications Continue to Improve, Even on Stagnating Hardware!
“…in the so-called Post-Moore era, the ‘performance road’ forks three-ways…: (1) architectural innovations will attempt to close the performance gap, and an explosion of diverging architectures tailored for specific science domains will emerge, or (2) alternative materials and technologies (e.g., non-CMOS technologies) that allow the spirit of Moore’s law to continue for a foreseeable future, or (3) we abandon the von-Neumann paradigm altogether and move to a neuromorphic or quantum-like computer (which, in time, might or might not become practical…).”

Myth 10: Fortran Is Dead, Long Live the DSL!
“Are some parts of our community just too stubborn to follow the youngsters? Or are old languages not necessarily bad for the task? Indeed, Fortran is a very well designed language for its purpose of expressing mathematical programs at highest performance. It seems hard to replace it with C or other languages and outperform it or even achieve the same baseline.”

Myth 11: HPC Will Pivot to Low or Mixed Precision!
“Lowering … precision can save costs but may reduce accuracy of the results and, in the worst case, break the application (e.g., convergence). … Now that mixed precision is a de-facto standard in the AI domain, more hardware support is being implemented. So far there is no general clarity on the limits—how few bits can we get away with in different HPC areas.”
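
A toy example makes the hazard concrete: naively accumulating many small values in half precision stalls long before the true sum is reached. This illustrates the general risk, not the behavior of any particular HPC code.

```python
# Naive running sum in float16 vs. a float64 reference. Once the running
# sum grows large enough, each ~0.001 addend is smaller than half the
# float16 spacing at that magnitude and the accumulation stops advancing.

import numpy as np

x = np.full(50_000, 1e-3, dtype=np.float16)   # 50,000 copies of ~0.001

acc = np.float16(0.0)
for v in x:          # sequential half-precision accumulation
    acc = acc + v

print("float64 sum:", float(np.sum(x, dtype=np.float64)))   # ~50
print("float16 sum:", float(acc))                           # stalls far below ~50
```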

Myth 12: All HPC Will Be Subsumed by the Clouds!
“If the commercial cloud hyperscalers can work out the scale of economy in their own hardware manufacturing to the extent that it could build and operate large scale HPC infrastructures cheaper than on-prem supercomputers of any size, then the swing could totally happen towards full subsumption— although somewhat unlikely, this could compromise the ability to cover some of the traditional HPC workloads that do not meet main industrial needs, such as the requirement for dense 64 bit linear algebra capabilities.”

The paper’s authors:

Satoshi Matsuoka, Director, RIKEN Center for Computational Science

Jens Domke, Team Leader of RIKEN’s Supercomputing Performance Research Team

Mohamed Wahib, Team Leader of RIKEN’s High Performance Artificial Intelligence Systems Research Team

Aleksandr Drozd, founder and CEO at Amigawa and research scientist at RIKEN

Torsten Hoefler, Associate Professor of Computer Science, ETH Zurich – Swiss Federal Institute of Technology