This series is about the men and women who are changing the way the HPC community develops, deploys, and operates the supercomputers we build on behalf of scientists and engineers around the world.
As part of the team that explored the Beowulf computing concept, Thomas Sterling has already revolutionized the high performance computing community once. But the author of six books and a raft of journal and conference publications isn’t ready to leave the hard work of change to someone else. His sights are set now on changing the way we use and build the supercomputers of today and the exascale monsters of tomorrow.
insideHPC: You have such a rich history in this community and have been involved in so many milestone activities. What would you call out as one or two of the high points of your career, the things of which you are most proud?
Thomas Sterling: In the broadest sense, I am most proud — or perhaps I should just say “grateful” — for being allowed to continuously stay in and contribute to the field of HPC for three decades since my graduate work at MIT. But if I had to pick two high points they would be Beowulf and ParalleX; the former more than a decade ago and the latter of current active engagement.
Beowulf was an experiment that explored the potential of low cost hardware and software in ensembles to perform real world science computation at unprecedented performance to cost. It was one of several projects in what has become known as “clusters” but I think it had significant impact on the community. Today on the Top-500 list 1) the most widely used processor architecture is x86, 2) the most widely used network is Ethernet, 3) the most widely used OS is Linux, 4) the most widely used system configuration is the commodity cluster, and 5) the most widely used programming model is distributed memory message passing. Beowulf was the first research project to implement and explore this synthesis of elements, many in their inchoate phase. We recognized that community education was as critical as technology in accomplishing this paradigm shift and therefore through a series of tutorials and books we provided easy access to this approach. Admittedly I had no idea that it would dominate HPC. In some sense I was just lucky to be at the right place at the right time with a budget, a need, and a good idea.
ParalleX reflects the next HPC paradigm shift or, more accurately, is an exploratory project on key system elements, from semantics through functionality to structures and mechanisms, that in synthesis are catalyzing the sixth phase of HPC.
insideHPC: As the co-author of six books (an accomplishment in its own right), would you talk a little bit about your commitment to education: how we can both attract the next generation of HPC professionals into the community, and provide them with the experience-based training that they will need to be successful?
Sterling: As a faculty member in a computer science department at a major state university, I have become keenly aware of the challenge of education to attract and train the next generation to the field of HPC.
I teach an introductory course on HPC at the first-year graduate and senior undergraduate level (LSU CSC-7600). This course is a bit different from your conventional parallel programming course because it provides strong cross-cutting themes of parallelism, performance, and system structures that determine the effectiveness of scalability on real-world systems. The students learn how to program in three different modalities (capability, cooperative, capacity) and corresponding programming interfaces (OpenMP, MPI, Condor) along with related system architectures (SMP, clusters, farms/clouds).
But such courses are not readily available at all of the thousands of colleges across the country. I hate the fact that a young man or woman is robbed of the choice of personal goals and opportunities simply due to their demographics and socio-economic circumstances. To counter this truly unfortunate aspect of the American condition, I have been working, perhaps inadequately, in the realm of distance learning, exporting the course in real-time in high definition to a few remote campuses. My external partner in this has been Professor Amy Apon at University of Arkansas and my colleagues at LSU have been Chirag Dekate and Hartmut Kaiser (among others), in combination with a staff of technology types led by Ravi Parachuri. Petre Holub at Masaryk University in the Czech Republic was the force behind the high definition stuff strongly driven by Ed Seidel and Steve Beck, directors at the Center for Computation and Technology at LSU.
My next book will be my first textbook for this course; it’s a lot harder to write a textbook than my previous efforts. Jose Munoz has been an advocate of this work, and we hope to expand this to other communities. We have run a “Beowulf Bootcamp” for two summers that involved high school students to get them excited about going to college, and therefore (hopefully) to finish high school. With a dropout rate of one third such as that in Louisiana, we need to find ways to motivate our kids to aspire and excel. We should do an entire interview just on this topic. It’s so important and so inadequately addressed at this point.
insideHPC: Anyone researching your background can’t help but notice the long list of volunteer activities through which you have selflessly served this community. Why do you do that? I know from personal experience, there is seldom any real recognition for this type of service — the reward has to be internal. What motivates you to be so active in supporting the HPC community through these activities?
Sterling: Community engagement is critical to the success of the HPC field. In a sense, any discipline that is system oriented requires a community-system approach as it must engage the diversity of talents, expertise, and resources reflected by the many subdisciplines defining the diversity of components, technologies, and methodologies integrated within the single complex system. Therefore, it is out of necessity that one participate in, and sometimes lead, community-led forums that focus on the many enabling leading-edge issues.
When my colleagues and I conducted a number of meetings and tutorials on early Beowulf cluster implementation and application, we were doing this out of a necessity for technology transfer. When we conducted a number of workshops related to Petaflops computing, then orders of magnitude away from contemporary capability, this was genuine pursuit of knowledge, perspectives, and concepts. Today, there is a dramatic surge in the domain of Exascale computing led by DOE and DARPA with strong NSF participation as well. These are important exploratory meetings both devising and guiding future work towards this challenging goal less than a decade away. I have been fortunate to be included in these initiatives.
Finally, I am honored by the number of presentations at conferences and workshops I am invited to give, and it is a pleasure to serve the community to the best of my ability in this way. In particular, I have enjoyed providing a presentation every year at the ISC in Germany summarizing the accomplishments in the field of HPC during the previous year. This June will mark the seventh such talk, and I am grateful to Hans Meuer and the other program organizers for this opportunity. I guess, to be honest, it’s part of the fun.
insideHPC: Are there any people who have been an influence on you during your years in this community?
Sterling: Nothing is more humbling than reflecting on all of the colleagues who have contributed to one’s own accomplishments, and in my case there have been, and continue to be, many. To note any would be to fail to identify so many others. But with that acknowledgement of inadequacy, allow me to recognize, in chronological order, a few who have had appreciable impact:
Bert Halstead was my doctoral thesis advisor at MIT, and it was from him that I learned the critical importance of deep thinking in the intellectual arenas of abstraction and models of computation, not just as a mental exercise but as important tools for innovation.
Jim Fischer of NASA Goddard Space Flight Center taught me the importance of enlightened but responsible management as a key element of collective achievement in HPC system advancement to serve science applications. It was he who empowered the Beowulf Project in the face of strong resistance, and prevailed.
Paul Messina, formerly of Caltech and currently at Argonne National Laboratory, has been my mentor in working within the HPC community, supporting it and being supported by it, and complementing individual accomplishment through the leveraging of group engagement. He led the CCSF initiative that in 1991 deployed the Intel Touchstone Delta at Caltech, which was the fastest open HPC system in the world at the time and the prototype of the successful line of Intel Touchstone Paragon computers of the 1990s. He and I co-authored my first two books.
Larry Bergman of the Jet Propulsion Laboratory has been among my most important collaborators over almost a decade of accomplishment, performing at different times as my boss, my program manager, and my research partner. Without Larry, a decade of accomplishment in my professional career would most likely not have occurred. It was from him that I learned my limitations as he seamlessly complemented my strengths with his own to form an effective working partnership driving pursuit of advancement in HPC.
Ed Seidel created the new LSU Center for Computation and Technology that embodied a unique melding of resources at the state and federal level enabling aggressive and innovative HPC research both in end computational science and systems technology (software and hardware). It was within the context of this environment and the opportunities that it afforded that I have been able to conduct my most recent explorations and endeavors. At LSU CCT I am fortunate to work with a small group of research scientists who are making possible this research on the new horizon of HPC: Maciej Brodowicz, Hartmut Kaiser, and Steve Brandt.
As pivotal as these people have been to me at different stages of my career, in many cases providing real role models as well, there is a group of colleagues who have contributed, and continue to contribute, both to the field of HPC and to my own work. Since they are all very well known to your readership I identify them in no particular order without explanation: Bill Gropp, Bob Lucas, Dan Reed, Guang Gao, John Salmon, Al Geist, Jack Dongarra, Bill Carlson, Kathy Yelick, Horst Simon, Almadena Chtchelkanova, Thomas Zacharia, Burton Smith, Marc Snir, Pete Beckman, Hans Meuer, Jose Munoz, Peter Kogge, Fred Johnson, George Cotter, Bill Dally, Rusty Lusk, Bill Harrod, and Paul Saylor.
Oh, yes, and of course there was Don. What can I say? Without him you would, in all likelihood, not be conducting this interview. It’s one of those strange things, a chance meeting (at MIT, he was a freshman and I a finishing doctoral student) which could easily never have occurred and yet one’s life changes. Don Becker (now CTO at Penguin) and I collaborated on a number of projects but it was he who developed the first Beowulf systems with a group of young highly motivated implementers to realize my system concept and architectural strategy.
insideHPC: What “non-HPC” hobbies or activities do you have? If you ever really have time off — how do you spend it?
Sterling: There is very little time for extra-professional pursuits but I do, when time permits, engage in three activities beyond HPC:
Sailing is the only activity that I can get involved in during which I truly forget about work. Of course I achieve this also during very scary airplane landings in thunderstorms, but this is not by choice so it doesn’t count as a hobby.
The study of history, in particular the 3rd Millennium BCE which is the late Bronze Age. I am fascinated with how small groups of people catalyzed into large aggregations of what we would recognize as civilization, enabled or driven by technological advances and ad hoc experiments in political science.
Got to love it; the brain fascinates me and for no useful purpose over the last decade I have found myself pursuing knowledge related to brain structure, function, and emergent behavior. Thinking about thinking, nature’s own recursion. I suppose Cognitive Science thinks about the process of thinking about thinking. But I’m not there yet.
insideHPC: Approximately how many conferences do you attend each year? What would you say is your percentage of travel?
Sterling: While I don’t travel anywhere near as much as Jack Dongarra, I average about two and a half trips per month although peak travel can reach four in any given month. I limit the number of general conferences to four or five a year but attend half a dozen focused workshops a year which I find far more useful and productive. Of course then there is the plethora of program and project meetings. I get much of my work done in the Admiral’s Club at DFW.
insideHPC: How do you keep up with what’s going on in the community, and what do you use as your own “HPC Crystal Ball?”
Sterling: Your electronic publication, and that of your competition, proves very useful in keeping up with the day by day incremental advances and offerings of the industrial community with some valuable information on academic near-term accomplishments as well.
At this time of phase change in the field, the rate of advancement is too fast to be adequately represented by conventional professional society journals. These are valuable for archiving, but not for timely communication. I rely instead on direct contact with contributing scientists and institutions, both at organized forums like workshops and through unstructured side-bar private communication. I am amused by the conventionality of technical program committees of even small workshops on focused topics and their frequent fear of including new work-in-progress research.
My “HPC Crystal Ball” is more of a lens focused in deep space at a narrow part of the sky rather than the total space. I exploit blinders to narrow my scope of interest. I could, of course, completely miss a major important development. But I use foundational challenges of the logic and physics of the space of concern to inform about future directions of the field. It doesn’t always work but it has provided a unique viewpoint.
insideHPC: There are people in our community that are motivated by the science and discovery that we enable others to make, and people that are motivated by the science and engineering of HPC itself. Where do you fall on that spectrum?
Sterling: Like many, the answer for me is “both” but in selective areas in either case.
On the end science and engineering side, two major classes of problems are of interest to me. The first are two examples of classical supercomputing that I feel are essential to the advancement of civilization. These are the development of controlled fusion and the control at the atomic level of molecular dynamics. In the first case, we are challenged by the sheer scale of computation required at Exascale and beyond. In the second case, we are challenged by the need for dramatic improvements in strong scaling.
The second class of problems that absolutely fascinates me is the broad family of dynamic directed graphs as applied to knowledge management including but not limited to machine intelligence. Of course I am strongly focused on your latter domain of the science and engineering of HPC itself. This is particularly the case as the leading edge of the community is now considering revolutionary hardware and software techniques to extend delivered performance in an era of flatlined clock rates and processor core design complexity. This is a very exciting time to be engaged in HPC system research.
insideHPC: What do you see as the most exciting possibility of what we can hope to accomplish over the next 5-10 years through the application of HPC?
Sterling: As excited as I am about the computers themselves, their true value lies in their role as the third pillar of science (complementing experimental observation and theoretical modeling). More than ever, civilization must rely on the strength of HPC in devising new methods to address major challenges of climate, energy, medicine, design search-space optimization, and national security applications.
But my real interest lies in a very different area of application: symbolic computing. I believe we are likely to encounter a renaissance in intelligent computing, not seen since the 1980s, because the need for intelligent systems is growing for knowledge management, data mining, real-time decision making, declarative human interfaces, robotics, target recognition, and many other problems that need to directly manipulate abstractions and their inter-relations rather than raw data. I am particularly intrigued by the new opportunities afforded for self-aware machine intelligence by the Petascale computing systems coming on line with hundreds of Terabytes of main memory and their concomitant memory bandwidth which is critical for effective symbolic computing. I believe that by the middle of the next decade (yes, 15 years from now) symbolic applications will compete with or exceed the demands for cycles of numeric intensive applications.
HAL, are you listening?
insideHPC: What do you see as the single biggest challenge we face over the next 5-10 years?
Sterling: Viva la Revolution! Or to paraphrase a worn-out expression: “It’s the execution model, stupid!”
We are at the leading edge of a phase change in HPC; the sixth by my count over a period of as many decades. As previously suggested, the last two decades have been dominated by the communicating sequential processes (CSP) model which has served well for both MPPs and commodity clusters. But with the forced reliance on multi/many core and the flirtation with GPU accelerators, the model is stretching past the yield point. A new model of computation will become essential before the end of this decade when current technology trends will demand billion-way parallelism, latency hiding for tens of thousands of cycles, global address spaces that are not statically nailed to specific physical hardware, the ability to migrate flow control as easily as data, and an increase in programmer productivity (it just shouldn’t be this hard).
For me the biggest challenge we face over the next 5-10 years is: What is the new model of computation for HPC to replace CSP? And given that answer, whatever that may prove to be (yes, I have my suspicions), how will such a model’s set of intrinsic governing principles influence the co-design of the new programming models (sorry, new languages guys, there is no way around it), runtime and operating systems, and most exciting of all, the new core architectures? HPC system development is going to be fun again.
insideHPC: Any final thoughts you would like to share with our readers?
Sterling: HPC is multi-faceted. I spend too much time out on the fringe pushing the performance limits, but have come to appreciate the challenge of strong scaling required to shorten the execution time of problems that already take far too long (many weeks and months) but cannot effectively employ anywhere near the available processing resources. This isn’t some future Exaflops problem, but real problems in AMR and molecular dynamics being worked on today. To serve these effectively will require a real change in how we build hardware core architectures because it is the inefficiencies in overhead, latency, and contention as well as poor use of available parallelism that are inhibiting better scaling for these and other current applications.
From this you may infer that no, all the focus on Exascale is not the right thing; but many of the challenges to Exaflops performance of the future are the same as some of the challenges to strong scaling of problems in the present. The current generation of Petaflops machines, with the possible exception of Roadrunner, is really the end of the classical static CSP era. Core architectures of the future will have to incorporate hardware mechanisms that recognize the shared context of thousands or millions of like-cores rather than passing off all such interchange to the I/O subsystem which is not optimized for inter-thread cross-system interoperability.
Making a machine easier to use requires a machine that is easier to use and that is not what we have been building over the last two decades. Too many people, including those in sponsoring agencies, wish the problem to be tractable through uniquely software means. It is not a software problem, at least not exclusively. It is a multi-level system problem; yes, including software but only in conjunction with an enabling hardware solution as well. The new DARPA UHPC program recognizes this truth and is pushing for a holistic perspective towards innovative solutions. A consequence of that is that productivity will be enhanced and users will find systems easier targets to employ.
I will make an outrageous prediction: Exaflops systems in 2020 will be easier to use than Petaflops machines are in 2010.