Bill Dally, Chief Scientist at NVIDIA, published an interesting commentary today in Forbes Magazine [online edition]. Dally analyzes the past, present and future of Gordon Moore’s famous series of theories. The ubiquitous set of theories, often referred to collectively as “Moore’s Law”, predicted that integrated circuits would double in transistor density every 18 months [revised from 12 months]. Over the years, people have directly equated transistor count to performance. Moore also predicted that the relative energy consumption per unit of computing would decrease over time. Ultimately, the combined theories predicted more complex processors at roughly constant power utilization. It’s quite obvious that this trend has ended.
“[Moore’s Law] predicted the number of transistors on an integrated circuit would double each year (later revised to doubling every 18 months). This prediction laid the groundwork for another prediction: that doubling the number of transistors would also double the performance of CPUs every 18 months. [Moore] also projected that the amount of energy consumed by each unit of computing would decrease as the number of transistors increased. This enabled computing performance to scale up while the electrical power consumed remained constant. This power scaling, in addition to transistor scaling, is needed to scale CPU performance. But in a development that’s been largely overlooked, this power scaling has ended. And as a result, the CPU scaling predicted by Moore’s Law is now dead. CPU performance no longer doubles every 18 months,” said Bill Dally.
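To get a feel for how much the revision from 12 to 18 months matters, here is a minimal back-of-the-envelope sketch. The helper name `growth_factor` and the six-year window are my own illustrative choices, not anything from Dally's article; the code only compounds a fixed doubling period.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative growth after `years` if a quantity doubles every `doubling_months`."""
    doublings = years * 12.0 / doubling_months  # number of doubling periods elapsed
    return 2.0 ** doublings


if __name__ == "__main__":
    # Compare the original (12-month) and revised (18-month) doubling periods
    # over an arbitrary, illustrative six-year window.
    for months in (12, 18):
        factor = growth_factor(6, months)
        print(f"doubling every {months} months: x{factor:.0f} after 6 years")
```

Over six years that works out to 64x under the original prediction versus 16x under the revised one, which is why the exact doubling period (and whether performance actually tracks it) matters so much.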
Dally goes on to describe several key HPC application areas to illustrate his point: namely, weather forecasting, seismic processing and medical informatics. Much of the recent HPC development in each of these areas has focused on scaling applications out via parallelism in the workloads, rather than relying on clock frequency or core complexity. Dally argues that we, as an industry, need to begin adopting more natively parallel architectures, as opposed to more loosely coupled traditional cores.
Going forward, the critical need is to build energy-efficient parallel computers, sometimes called throughput computers, in which many processing cores, each optimized for efficiency, not serial speed, work together on the solution of a problem. A fundamental advantage of parallel computers is that they efficiently turn more transistors into more performance. Doubling the number of processors causes many programs to go twice as fast. In contrast, doubling the number of transistors in a serial CPU results in a very modest increase in performance–at a tremendous expense in energy. [Bill Dally]
To continue scaling computer performance, it is essential that we build parallel machines using cores optimized for energy efficiency, not serial performance. Building a parallel computer by connecting two to 12 conventional CPUs optimized for serial performance, an approach often called multi-core, will not work. This approach is analogous to trying to build an airplane by putting wings on a train. Conventional serial CPUs are simply too heavy (consume too much energy per instruction) to fly on parallel programs and to continue historic scaling of performance.
Bill manages to slip in a mention of GPU technology [gotta eat, right?], but I still agree with his fundamental argument. However, the road ahead is rocky and uncertain. The industry as a whole has migrated away from small numbers of parallel processors [such as vectors] to a large number of sequential processors with a smattering of SIMD. The fundamental application architecture for the former is wildly different from that of the latter.
Enter the “Exascale Era”, lumbering down the red carpet. We’re poised to enter an era of hybrid core machines at super scale in order to meet the architecture needs of the most demanding of applications. [play the ominous pipe organ music now]. Sounds like HPC is going to learn some new tricks.
If you’re interested, read Bill’s full commentary here.