This year the city of New Orleans will be hosting the 23rd annual Supercomputing Conference, and it's a great match. New Orleans is a city in the midst of reinventing itself, even as it continues to celebrate the traditions that have made it famous. And HPC today is likewise inventing its present in the era of multi-petaflop computing, even as we run hard toward an exascale future.
Reinvention and change have always been at the core of HPC. As new technologies enter the computing landscape, our community adopts and adapts what it finds in its quest to expand the capabilities of the tools of discovery and exploration we provide, enabling the "makers" (the scientists, engineers, and thinkers) to make the world a better, safer, and healthier place for all of us.
There is a periodicity to change in HPC: ideas fade from relevance as new technologies come on the scene, only to find themselves repurposed a decade later in response to opportunities presented by even newer ideas. Today the idea of heterogeneous computing is once again shaping the present and future of HPC, and SC10 has made heterogeneity one of its three thrust areas (more information on all three is available on the SC10 website).
Vive la différence!
Very generally speaking, a heterogeneous computer is a system that uses different types of computational units to accomplish the work of the applications that use it. At a fine level of granularity, even a single microprocessor is a heterogeneous computing system, since it is composed of integer and floating-point units, instruction decoders, and other components, each of which performs a very specific type of task on the chip. But practitioners today group all of those functions together as a single conceptual computational unit. So the Beowulf cluster of the early 2000s, with its hundreds of identical commodity computers interconnected by a communications network, would be considered a homogeneous cluster. All of the computations are performed using the same kind of computational engine: an Intel or AMD chip, for example.
But if you add in a new kind of computational engine – a field-programmable gate array (FPGA), Cell processor, or graphics processing unit (GPU), for example – then the cluster becomes a heterogeneous computer. In this configuration the added computational units offer the potential of significant performance improvements for certain classes of calculations. In other words, they "accelerate" certain types of computation, and so are often referred to as accelerators.
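What makes a calculation a good candidate for acceleration is usually data parallelism: every element can be computed independently of the others. A minimal illustrative sketch (in plain Python, purely for exposition) is the classic SAXPY operation; on a GPU, each loop iteration would typically become its own hardware thread:

```python
# SAXPY (y = a*x + y): the kind of data-parallel loop accelerators excel at.
# Every iteration is independent of the others, which is exactly the property
# that lets a GPU run each one as a separate thread.
def saxpy(a, x, y):
    """Return a*x + y elementwise (pure-Python sketch of a GPU-friendly kernel)."""
    return [a * xi + yi for xi, yi in zip(x, y)]

print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))  # -> [12.0, 24.0, 36.0]
```

The same independence property is why loops with tight cross-iteration dependencies tend to stay on the CPU.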
The idea of using accelerators for specific kinds of computations goes back a long way in HPC. Scientific computing systems in the 1980s made by Floating Point Systems and others, for example, had attached arrays of processors that specialized in the floating point calculations that are central to scientific and technical computing. Acceleration in the HPC landscape today is diverse, with offerings based on FPGAs (often with extensive software support, as in the Convey Computer approach), the Cell Broadband Engine developed by IBM and its partners, and, most commonly, GPUs made by NVIDIA and AMD/ATI.
Here today, here tomorrow
By far the most common accelerator used in HPC today is the GPU, adapted from its origins in specialized graphics processing to support general scientific and engineering calculations. Because of their high degree of specialization for a specific subset of intense numerical calculations, GPUs, and accelerators in general, can deliver dramatically more computation for a fixed energy budget. For this reason many experts believe that the power-constrained exascale systems targeted for the end of this decade will rely heavily on specialized computational units, moving heterogeneity from an "add-on" role to center stage in large-scale system design.
Medium-sized deployments of GPU-accelerated HPC systems are showing up today, and multi-petaflops behemoths supporting national extreme-scale computing programs are expected to enter service over the next six to twelve months.
Heterogeneous Computing at SC10
This is the perfect time for SC10 to turn its focus to the renewed concept of heterogeneous computing and acceleration, and those who want to learn more will find a rich selection of papers, talks, and learning opportunities in New Orleans this year.
Teach a man to fish…
SC10 starts off the week with a significant focus on heterogeneous computing. Sunday and Monday offer tutorials on CUDA and on OpenCL, an emerging standard that promises to bridge the gap in language semantics for managing applications targeted at both the CPUs and GPUs in a heterogeneous computer. Attendees may also want to dive deep into one area of heterogeneous computing through the 4th International Workshop on High-Performance Reconfigurable Computing Technology & Applications (HPRCTA'10), held in conjunction with SC10.
Panels, papers, and prizes
The rest of the technical program offers an amazing variety of papers, talks, and sessions that touch on topics central to heterogeneous computing today and tomorrow. It is not possible to cover them all without replicating a substantial portion of the program guide, but here are just a few highlights that may be of special interest.
There are two Masterworks sessions that should be especially interesting to those with their eyes and activities turned toward an exascale future. Steve Wallach will be giving a talk Wednesday morning about the complex issues that software designers will have to address in putting exascale computers to use on problems of practical interest. Later that morning, Wen-mei W. Hwu will focus on higher-level programming models for the heterogeneous computing systems expected to be at the heart of exascale machines at the end of the decade.
If you like your technical information to come with the possibility of a trophy, you'll want to drop in and give a listen to the Gordon Bell Prize finalist lecture, "Petascale Direct Numerical Simulation of Blood Flow on 200K Cores and Heterogeneous Architectures." This talk is especially relevant to those wanting to dive into practical GPU-based heterogeneous computation today, as it covers results on hybrid GPU-CPU machines (including the new NVIDIA Fermi architecture).
The papers sessions throughout the week include many items of potential interest. For example, “Optimal Utilization of Heterogeneous Resources for Biomolecular Simulations” should also appeal to those working on large GPU-accelerated computations today. The authors will present a parametric study of the value of various approaches to dividing computational work between the CPU and GPU computing elements.
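One simple baseline for such CPU/GPU work division (a generic static-partitioning heuristic for illustration, not the method the paper's authors present) is to split the workload in proportion to each device's measured throughput, so both sides finish at roughly the same time:

```python
# Static CPU/GPU work partitioning: give each device a share of the n work
# items proportional to its throughput (items/second), so that the time
# items / rate is roughly equal on both sides.
def split_work(n, cpu_rate, gpu_rate):
    """Return (cpu_items, gpu_items) for an n-item workload."""
    gpu_items = round(n * gpu_rate / (cpu_rate + gpu_rate))
    return n - gpu_items, gpu_items

# Example: a GPU measured at 4x the CPU's throughput gets 4/5 of the work.
cpu_n, gpu_n = split_work(1000, cpu_rate=1.0, gpu_rate=4.0)
print(cpu_n, gpu_n)  # -> 200 800
```

In practice the interesting questions, and the subject of parametric studies like the one above, are how the optimal split shifts with problem size, data-transfer cost, and kernel mix.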
The panels are always lively at SC, and two panels at this year's event promise entertainment and insight as some of the industry's brightest voices take on large-scale heterogeneous computing. "Toward Exascale Computing with Heterogeneous Architectures" brings industry veterans Jeffrey S. Vetter, Satoshi Matsuoka, John Shalf, and Steve Wallach to the stage to talk about the significant challenges (from low programmer productivity and lack of portability to the absence of integrated tools and fragile performance stability) in the context of future exascale systems. "On The Three P's of Heterogeneous Computing: Performance, Power and Programmability" brings the insights of well-known HPC community leaders Wu Feng, Bill Dally, Tim Mattson, and others to bear on three of the most challenging aspects of computing today and tomorrow.
A great week to get caught up
The thrust areas at SC10 offer a unique opportunity to dig deep into critical issues driving the supercomputing community. Tune in to events in the Heterogeneous Computing thrust area throughout the week to be sure you are well positioned to understand the issues of today, and to plan well for the machines of tomorrow.