Supercomputers present many challenges in hardware and software environments as exascale draws closer. The transition will inevitably involve greater complexity and the need for ultra-specialized training. We caught up with Paul Messina, the Argonne Leadership Computing Facility’s Director of Science, to talk about the institution’s extreme scale training program.
insideHPC: Paul, please tell me a little about the Argonne Training Program.
Paul Messina: The Argonne Training Program on Extreme-Scale Computing (ATPESC) is an intensive, two-week program that covers most of the topics and skills needed to conduct computational science and engineering research on today’s and tomorrow’s high-end computers. Systems like Argonne’s Mira, an IBM Blue Gene/Q system with nearly a million cores, can enable breakthroughs in science, but using them productively requires expertise in computer architectures, parallel programming, mathematical software, data management and analysis, performance analysis tools, software engineering, and so on. Our training program exposes the participants to all of those topics and provides hands-on exercises for experimenting with most of them.
insideHPC: Who are your attendees? What sectors do they come from?
Paul Messina: The attendees are doctoral students, postdocs, and computational scientists who have used at least one HPC system for a reasonably complex application and are engaged in or planning to conduct computational science and engineering research on large-scale computers. Their research interests span the disciplines that benefit from HPC, such as physics, chemistry, materials science, computational fluid dynamics, climate modeling, and biology.
insideHPC: How do they ultimately benefit from such a gathering?
Paul Messina: The participants benefit in at least three ways. They obtain in-depth knowledge of several of the topics, especially programming techniques and numerical algorithms that are effective on leading-edge HPC systems. They learn what software and techniques are available for all of the topics, so that when their research requires a certain skill or software tool, they know where to look for it. And through exposure to the trends in HPC architectures and software, they will be able to follow approaches that are likely to provide performance portability over the next decade or more.
insideHPC: Is there a particular focus at the core of the program?
Paul Messina: Avoiding painting oneself into a software corner is a focus of the program — or, put less casually, performance portability. It is easy to write software targeted at the platform one is currently using, exploiting its special features at all levels of the code. However, every three to five years there are substantial changes in HPC architectures. By designing the software architecture so that features specific to a given platform are used only at a low level and are easily identifiable, one greatly reduces the effort required to transition to future architectures. Using software components written by world experts who track the evolution of supercomputers and optimize their packages for the new systems also reduces the effort needed to achieve performance portability.
insideHPC: Is there a subject presented here that gets you especially excited?
Paul Messina: The mathematical algorithms and software elements of the curriculum – I include mesh generation in this category – are particularly exciting to me, because there have been major advances in parallel algorithms, often orders of magnitude faster than the prior state of the art, and – I confess – because early in my career that was one of my research interests. We are fortunate to feature world experts as lecturers on these topics, as indeed is the case for all the topics our program covers.
insideHPC: I noticed there is a Big Data angle on the agenda. How do you see Big Data fitting in with extreme scale computing, especially with regard to the future?
Paul Messina: Big Data aspects will be intertwined with extreme-scale computing in several ways. The ever-growing volumes of data that physical experiments are producing, and the complexity of the phenomena studied, are such that extreme-scale computers are required for their analysis. Massive sensor fields are gathering data into large databases, and in some cases near real-time analysis is needed, e.g., for disaster response. In addition, big computer simulations produce large outputs, even today sometimes measured in petabytes, and archives of those results are valuable for future analyses and validations. In short, big compute and big data often go together.