At insideHPC, we are pleased to announce that David Bader is our latest Rock Star of HPC.
While HPC tends to focus on compute-intensive problems, Big Data challenges require novel architectures for data-intensive computing. My group has been the first to parallelize and implement large-scale graph-theoretic algorithms, which are quite a challenge because of their irregular memory accesses, the little computation available to overlap with those memory references, and the fine-grain synchronization they require. In the past several years, our research has enabled social scientists to analyze some of the largest social networks, detecting communities, finding the proverbial “needle in the haystack”, and “connecting the dots” by identifying central actors hidden in these networks. As you know, data and social media are now torrential streams of information that can inform decisions related to business intelligence, market analysis, and social trends.
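To make the challenge Bader describes concrete, here is a minimal, illustrative sketch (not his group’s actual code; the tiny graph, sizes, and names are invented for the example) of one level of breadth-first frontier expansion over a graph stored in compressed sparse row form. The data-dependent indirect loads through row_ptr and col_idx are the irregular memory accesses, and the compare-and-swap used to claim each newly discovered vertex is the fine-grain synchronization he refers to; in a real parallel code the outer loop would be split across threads.

    #include <stdio.h>
    #include <stdatomic.h>

    #define NV 6
    /* Tiny hypothetical graph in compressed sparse row (CSR) form. */
    static const int row_ptr[NV + 1] = {0, 2, 4, 6, 7, 8, 8};
    static const int col_idx[8]      = {1, 2, 3, 4, 5, 0, 2, 1};

    int main(void) {
        atomic_int visited[NV] = {0};
        int frontier[NV] = {0}, next[NV];      /* current and next BFS frontiers */
        int fsize = 1, nsize = 0;              /* start the search from vertex 0 */
        visited[0] = 1;

        for (int i = 0; i < fsize; i++) {      /* parallelized across threads in a real code */
            int u = frontier[i];
            for (int e = row_ptr[u]; e < row_ptr[u + 1]; e++) {
                int v = col_idx[e];            /* irregular, data-dependent load */
                int expected = 0;              /* fine-grain synchronization: claim v exactly once */
                if (atomic_compare_exchange_strong(&visited[v], &expected, 1))
                    next[nsize++] = v;
            }
        }
        printf("discovered %d new vertices\n", nsize);
        return 0;
    }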
Our Rock Stars of HPC gallery is growing as we look to a new generation of heterogeneous computing. And when the opportunity came to us to name our first European Rock Star of HPC, one name kept coming up: Thomas Schulthess.
Thomas Schulthess is the director of the Swiss National Supercomputing Centre (CSCS) in Manno. He studied physics and earned his Ph.D. at ETH Zurich, and as CSCS director he also serves as professor of computational physics at ETH. Before returning to Switzerland, he worked for twelve years at Oak Ridge National Laboratory (ORNL) in Tennessee, a leading supercomputing and research center in the US, where from 2002 he led the “Computational Materials Science Group” of 30 co-workers.
Thomas Schulthess studied physics at ETH Zurich and earned his doctorate in 1994 with a thesis on metal alloys based on experimental data and supercomputing simulations. He subsequently continued his research in the US and has published around seventy research papers in the leading journals of his field. His present research interests are focused on the magnetic properties of metallic nano-particles (nano-magnetism). Using high-performance computing, he studies the magnetic structures of metal alloys; of particular interest are his studies of giant magnetoresistance. He is also a two-time winner of the Gordon Bell Award.
insideHPC: You were schooled as a physicist. What got you interested in high performance computing?
Thomas Schulthess: Physics being the mother of modern science, it is not at all surprising that many researchers in this field are interested in high performance computing – I am no exception. In my particular domain, condensed matter physics and material science, we have a canonical model (the many-body Schrödinger equation) that suffers from the curse of dimensionality. We are therefore constantly looking for better algorithms and more powerful computers to solve the particular problems we are investigating. This is how I got interested in ever more powerful supercomputers, and we had a wave of machines developed at ORNL that helped us tremendously. But when you look around, haven’t most serious players in HPC been trained as physicists? One could even argue that physicists involved in the Manhattan Project started HPC.
insideHPC: You have been involved in so many milestone HPC activities in this community – what would you call out as one or two of the high points of your career – some of the things of which you are most proud?
Thomas Schulthess: The end-station for computational nanoscience we developed at the Center for Nanophase Materials Sciences (CNMS) at ORNL. We invested heavily in application and algorithm development, and now we have some of the best-performing codes on petascale systems that are productive research tools in the user program of the nanocenter. Others have adopted the concept, e.g. the simulation labs in Jülich. In Switzerland we are developing version 2 of this concept, where we are pushing HPC application development out into the research groups and communities that develop the models and application codes. The response from the application community, which is now taking charge of 12 projects, makes me confident about the sustained use of supercomputers as scientific instruments.
insideHPC: As Director of the Swiss National Supercomputing Centre you must have extensive administrative responsibilities. Do you still write code?
Thomas Schulthess: The responsibilities are of course much higher than in my previous job, but I have very competent staff to manage operations and I work for an institution with a lean administration that entrusts researchers with the leadership of projects. This means I have to find time to remain active in research and train graduate students. I am expected to develop the user community and the supercomputing strategy that meets their research needs. You have to be an active researcher to be credible for this job, that’s just the way science works in Switzerland. I don’t write big codes myself anymore, but I still lead teams who do – such codes have to be implemented by professionals who are fully committed to the job.
insideHPC: What are your thoughts on how we can attract the next generation of HPC professionals into the community – and provide them with the experience-based training that they will need to be successful?
Thomas Schulthess: We have to focus on the science and engineering problems we solve, discoveries we facilitate, and technologies created with HPC. We have to push productive HPC and maintain a high standard. This will make HPC interesting and attract bright young people to the field. At the same time we have to introduce HPC training into the computational science education at universities. HPC must become part of undergraduate and graduate curricula, rather than being limited to training courses given by computer centers when researchers need access to systems. Creating highly efficient and scalable simulations requires considering HPC from the very beginning of the thought process. This has to be reflected in education.
insideHPC: You were recently quoted as saying: “Given the remarkable interest in GPU technology from the Swiss computational science community, it is essential that CSCS adopt this technology into its high-end production systems soon.” Why is it essential for an institution like yours to adopt GPUs in a big way?
Thomas Schulthess: Application developers in Switzerland and elsewhere in Europe are rapidly adopting this technology. Supercomputing has to respond to this trend! Since CSCS has established a record of early adoption of new technologies in high-end computing systems, there are high expectations for us to look at GPU technology. At the same time, it is clear that we will only introduce GPU technology into our main production line of systems if it can be used productively at scale. This is not yet the case, but I’m quite certain it will happen within a year or two.
insideHPC: What is your favorite way to spend time when you’re not working?
Thomas Schulthess: I spend all my time away from work with my family. We have two growing teenagers who are harder and harder to keep up with. I’m an outdoor person: I love skiing, hiking, sailing, and more.
insideHPC: What motivates you? What is your passion?
Thomas Schulthess: Science! I am a physicist and a researcher, and like many of my colleagues I love to create machines or systems that allow us to do new experiments and look at nature in ways we have not been able to before. In recent years I have had a lot of fun collaborating with peers from other domains to push the envelope of simulation-based science in areas outside my own.
insideHPC: You received the Gordon Bell Award for attaining the fastest performance ever in a scientific supercomputing simulation of superconductors. Was this the same record-breaking code that recently scaled to 1.84 Petaflops on the Tianhe-1A system in China?
Thomas Schulthess: No, the runs on Tianhe-1A were done with a classical molecular dynamics code. DCA++, with which we set the 1.9 Petaflops record in 2009, uses totally different quantum Monte Carlo algorithms today. The efficiency and scalability of the new code are probably higher, but most importantly, the new algorithms allow us to reach a level of precision not possible with the implementation used in 2008/9, and time to solution has improved dramatically. In my field, algorithms are still improving faster than computer architectures do. This is why we have to introduce knowledge about architecture into the communities that develop algorithms.
In this special feature written by Mike Bernhardt from The Exascale Report, we honor Dona Crawford, the first woman to grace the ranks of our Rock Stars of HPC.
I first met Dona Crawford at SC’95 in San Diego when she was the conference Deputy Program Chair and the HPC Challenge Co-chair. Two years later, Dona was one of the most visible leaders in the HPC community as the General Chair for SC’97 in San Jose.
I have worked with many top corporate and agency executives during my 23 years in the HPC community, and I have met very few community leaders with the spirit, enthusiasm, and love of life that we see in Dona Crawford. From her days as one of the original leaders of the Accelerated Strategic Computing Initiative (ASCI) program, a national effort dating back to the early 90s, to her current position as Associate Director for Computation at Lawrence Livermore National Laboratory (LLNL) where she is responsible for a staff of roughly 900, she has built a tremendous following of loyal employees and close friends. I have heard numerous colleagues refer to Dona as a true leader who inspires and motivates with vision and passion. She is admired by her employees and peers, respected by her colleagues, and loved by her friends.
It is indeed a great pleasure to acknowledge and introduce you to Dona Crawford – a true Rock Star of HPC.
INSIDEHPC: You have such a rich history in this community and have been involved in so many milestone activities – what would you call out as one or two of the high points of your career – some of the things of which you are most proud?
There are a few career milestones that come to mind.
I was one of the original leaders of the Accelerated Strategic Computing Initiative (ASCI), a national effort dating back to the early 90s to provide—at that time—teraFLOPS of computing and the associated environment for nuclear weapons scientists to use computer simulations instead of conducting underground nuclear tests to certify the safety, security and reliability of the stockpile. ASCI (now known as ASC—Advanced Simulation and Computing) signified a paradigm shift in science from test-based to modeling- and simulation-based validation.
In the early days of ASCI, few computing experts believed the program would be able to take high-performance computing from 50 gigaFLOP/s to 100 teraFLOP/s in 10 years. Nobody had broken the one-teraflop barrier at the time, so it was quite a tall order. I would walk into a room with colleagues from other institutions and many would scoff at me. It was uncomfortable, but it turns out there’s something to be said for believing in a “wild” idea and fighting for its success.
ASCI is the result of a very large team of dedicated people from DOE Headquarters, industry, academia, and the labs. ASCI brought together the computing expertise of Livermore, Los Alamos, and Sandia national laboratories and established the framework for advancing computing to where it is today at the labs, with each of the national labs working in partnership with industry to pursue different hardware approaches and applications software. ASCI changed the way the world thinks about computing, and I’m proud of the small role I had in its inception. ASC today continues to push the boundaries of computational science, demanding ever more capability in computing hardware for predictive science.
As part of ASCI, I helped establish the Academic Strategic Alliances Program (ASAP). The idea was to have leading-edge universities work on large, complex, multi-disciplinary problems to validate our simulation-based approach. The Alliances were extremely successful at accelerating new developments in simulation science and high-performance technologies for computer modeling. This type of working relationship is good for the discipline and for the HPC community, and it continues today.
From the beginning, it was clear a critical component of the initiative would be making the supercomputing resources easy to access and use from remote locations. I was part of the team that created the DisCom2 (Distance and Distributed Computing and Communication) strategy, which blended two strategic thrusts. Distance Computing extended high-performance computing to remote sites, while Distributed Computing developed an enterprise-wide integrated supercomputing environment capable of supporting DOE’s science and engineering requirements. DisCom2 took advantage of the ongoing revolution in commodity- and cluster-based high-performance computing, as well as adopting and expanding on the open software approach to cluster computing. I was very interested in seeing DisCom2 come to fruition since I was located at the smaller of the two Sandia National Laboratories’ locations at the time and was leading the network research and development activities.
In 1993, I co-founded InfoTEST (née the National Information Infrastructure Testbed), which was a partnership between academia, private industry, and the national laboratories that was a precursor to the World Wide Web and Internet commerce. I was working at Sandia-Livermore and knew that to be efficient and productive, we had to have a way to access the big computers and computing resources at the other labs (Sandia/New Mexico, Livermore, and Los Alamos). For InfoTEST, I led a group that tied together distributed computing resources throughout the country and then demonstrated the feasibility and effectiveness of national-level access. This development resulted in performance data and practical experience that was critical to the establishment of the Internet. This was also one of the early efforts in network-based computing, which proved the viability of the concept for what would become known as grid computing.
I’ve also had several “firsts” as a woman in computing and management. For instance, I noticed I’m the first insideHPC female “rock star.” I was also the first mid-level and the first top-level technical female manager at Sandia. In the early 80s, I was the first technical female staff member to reduce my workweek to spend more time with my two babies at home. I worked four days a week, and even though I was still putting in 40 hours, it was considered part time. My managers were initially reluctant — there was an underlying fear that all mothers would exercise this “reduced workweek” option if it were available, and that the workforce would therefore shrink and parts of it would become less productive. Of course, none of those things happened. I only worked that schedule for four months, but I really treasured having an extra day at home with my children.
I think we’ve made progress in gender equality in computing. To be perfectly honest, I never felt there was a “glass ceiling” that I needed to push through. I just did what I was good at, worked hard, and was rewarded. But it does not go unnoticed that I am still the only woman in the room in many meetings.
INSIDEHPC: What are your thoughts on how we can attract the next generation of HPC professionals into the community – and provide them with the experience-based training that they will need to be successful?
I think it’s a combination of marketing and education. You have to put resources—time and money—into stoking the pipeline, and you need to find a way to communicate the exciting parts of what you’re doing in a way that connects with aspiring young scientists.
It’s never too early to start talking to kids and encouraging their curiosity about science and computing. There’s a great program I’m involved with called Expanding Your Horizons that encourages young women to consider careers in science, technology, engineering, and math. More of these organized efforts directed at young people are needed.
Lawrence Livermore organizes and funds an excellent summer scholar program and postdoctoral research program that is well known in academia. Our students and postdocs interact with world-renowned scientists on new areas of research and are given access to some of the most advanced computing facilities in the world. If you take the time to show the next generation a path—one that is exciting, meaningful, and has staying power—a good number of them will follow it.
INSIDEHPC: What motivates you? What is your passion?
Hands down, my passion is helping people. Luckily, in my job I get to do that in various ways, not the least of which is the fact that computing technology can transform the way we live and help improve our relationship with mother earth.
I’m passionate about sustainability. Clean air, clean water and low-carbon emitting, sustainable energy are goals of the highest order. The computing community has a tremendous capability at its disposal. We can design a model that reflects the entire earth system, not just its individual parts. And we can present the system in a manner that is understandable and even compelling to the general public. Because the changes to the earth are playing out over decades, it’s hard to comprehend and convey the need for individual and collective change today. We as a nation can’t implement the sort of changes necessary to achieve a sustainable world if we as citizens do not clearly understand the problem.
Science diplomacy is another topic I’m passionate about. I just returned from an eye-opening trip to Saudi Arabia on behalf of CRDF Global. CRDF’s objective is to advance global peace and prosperity through international scientific and technological collaboration. In my opinion, nothing but good can come from nurturing a spirit of science and technology cooperation, supporting opportunities to strengthen research and education in universities abroad, and providing critical benefits to the global community. When researchers interact on objective topics, each subconsciously learns to understand how the other feels about subjective topics. Understanding one another at different levels is what helps promote peace. The bonus is that we can help improve our global standard of living through science and technology.
INSIDEHPC: What “non-HPC” hobbies or activities do you have? If you ever really have ‘time off’ – how do you spend it?
There is nothing I’d rather be doing than spending time with my family. I have two very successful children who have always been my top priority as well as my greatest joy and source of pride. I also love to cook for and spend time with my extended family, plus I have many good, long-lasting friendships.
Additionally, I enjoy traveling and meeting people from other cultures. In those situations, I try to be respectful of their social mores.
Because I like people, I guess I have a bit of a reputation of trying to get people to “cut loose.” The way I do that is by cutting loose myself. I won’t reveal my secrets—you’ll have to ask others in the community about some of my antics.
INSIDEHPC: Approximately how many conferences do you attend each year? What would you say is your percentage of travel?
I’m selective about the conferences I attend, but I never miss the SC (Supercomputing) conference. I’ve been involved as an organizer in some way or another since 1991 and I was the general chair in 1997. I also try to attend the International Supercomputing Conference (ISC) and the Salishan Conference on High-Speed Computing.
Travel is a big part of my job. I travel about 25% of the time. I’ve been to Washington, D.C. on 11 separate trips so far in 2010. I serve on advisory committees for the National Research Council, the National Science Foundation, and the Council on Competitiveness, and I’m on the board of directors for CRDF Global. I also serve on a number of industrial advisory committees and academic or laboratory review committees. These opportunities all require a commitment to travel.
My current travel schedule is paltry compared to 20 years ago. During the early ASCI years, when I worked at Sandia-Livermore, I was traveling 48–50 weeks a year. I was leading Sandia’s effort to consolidate its Livermore computing operations with its other location in New Mexico, so I commuted by plane every week between Livermore and Albuquerque, with frequent side trips to Washington, D.C. thrown in. For 10 years, I lived mostly in hotels. I was on a first name basis with the hotel and car rental people.
INSIDEHPC: How do you keep up with what’s going on in the community and what do you use as your own “HPC Crystal Ball?”
Mark Seager is my crystal ball.
It’s easy for me to keep up with what’s going on in HPC because I have a staff of 900 absolutely brilliant people moving in many different directions, and I get all that information filtered back to me.
INSIDEHPC: What do you see as the most exciting possibility of what we can hope to accomplish over the next 5-10 years through the innovative applications of HPC?
I’m not sure it’ll happen in 10 years, but there will come a day when all the various tools and technology platforms available—our iPads, cell phones, supercomputers, televisions—will merge into one big knowledge ecosystem. Technology will change our existence in ways that we can’t foresee today. That is exciting. With technology comes knowledge, knowledge breaks down fear, and fear is what causes trouble in the world. I think the most exciting possibility is that technology will help humans become more unified.
INSIDEHPC: What are your thoughts on HPC’s ability to address what many are referring to as “the missing middle,” which I loosely interpret as a broad spectrum of small and mid-size businesses? (Is HPC starting to reach a larger audience of people who previously did not have access to it?)
The first thing we have to do is understand the problems that small and mid-size businesses are interested in solving, and then figure out practical ways HPC can address their needs. The next step is forming partnerships to help overcome some of the barriers: make the infrastructure affordable and the codes easy to use. With our history of building out the technology focused on solving specific problems, we can and should provide a bridge to companies looking to do the same. The national labs will always need to push the tip of the pyramid in HPC, but if we don’t help build out the base, there is a danger the tip will topple. Many of the labs, Livermore among them, are working with a wider variety of industrial partners, one at a time, to do just that.
INSIDEHPC: What do you see as the single biggest challenge we face over the next 5-10 years?
Power. We could easily go about making the next big supercomputer by stringing a bunch of components together, but if the resulting machine requires 100 MW of power to operate, that’s just not a realistic option. We have to innovate to build machines that require substantially less power.
INSIDEHPC: Any closing thoughts you would like to share with the HPC community?
It has never just been about creating the computer with the most superlatives attached to it; it’s the discoveries the machines make possible. Supercomputers have become the backbone of science and technology, and the simulations performed on them will enable virtually every scientific field for decades to come.
For instance, climate modelers have said they need exascale computing capabilities to achieve high-resolution coupled earth system models at 1-km resolution. Having more knowledge about climate change and its effects earlier by even a few years may well be worth a billion dollars.
China’s new supercomputer recently took the world performance lead, and the country’s government and scientists should be applauded for this remarkable achievement. Because I work at a national security laboratory, I tend to think about future challenges in terms of national security. High performance computing and simulation are essential both for national security and for industrial competitiveness in the world economy. China’s major investments in HPC show that they recognize this truth and are willing and able to focus money, energy, and creativity in this direction. There are other similarly focused efforts in Russia, Europe, and Japan.
Technology is at an inflection point where the laws of physics are dictating that we do things differently. The underlying technologies exploited this past decade to build ever-faster supercomputers are facing the end of an era of “easy” gains. Technology will change and everything that technology touches will change.
If we cede our leadership on the hardware side, it’s very likely we’ll also eventually cede our leadership in software, components, and other critical technologies that support cost-effective and powerful server and PC markets all the way down to the cell phone. There is no doubt that these technologies provide distinct advantages to the sponsoring nation.
As a nation, we’ve given a lot away; let’s at least keep our innovation. If we’re going to continue to use HPC as an economic engine for competitiveness in the global marketplace, we need focused and consistent investments in advanced computing technology.
In the realm of Rock Stars, there are One-hit Wonders, Divas, Boy Bands, American Idols, Crazy Hearts, and Legends. Our insideHPC Rock Stars are clearly an elite group of industry luminaries and thought leaders, but even among this group, few have attained the legendary status of this month’s insideHPC Rock Star.
The old timers in the community of course know Steve Wallach. As co-founder of Convex Computer Corporation, he was well respected throughout the computational science and investment communities. His technical leadership was chronicled in Tracy Kidder’s Pulitzer Prize-winning book, “The Soul of a New Machine.” Convex was eventually acquired by Hewlett-Packard, and Steve took on the role of Chief Technology Officer of HP’s Enterprise Systems Group.
Most recently, Wallach has once again drawn the spotlight as the Chief Scientist, Co-Founder, and Director of Convey Computer Corporation.
Wallach holds 33 patents, is a member of the National Academy of Engineering and an IEEE Fellow, and was a founding member of the Presidential Information Technology Advisory Committee. He is the 2008 recipient of IEEE’s prestigious Seymour Cray Award.
It is with great pleasure that we present to you the newest Rock Star of HPC, Steve Wallach.
insideHPC: You have such a rich history in this community and have been involved in so many milestone activities – what would you call out as one or two of the high points of your career – some of the things of which you are most proud?
Steve Wallach: One high point in my career was starting Convex Computer, a company known for its “easy to use, affordable supercomputing” technology. The tagline when Convex started was: “A minicomputer version of a Cray, but program like a VAX.” At the time, very few people believed in the “program like a VAX” part. Today, the compiler technology that Convex developed, with the help of the late Ken Kennedy of Rice University, is considered standard.
The second high point is being able to give something back. I was a founding member of PITAC (Presidential Information Technology Advisory Committee). We helped to increase NSF budgets by hundreds of millions of dollars. Also, I’ve served on various government studies on high-performance computing. I spend lots of time in the greater DC area. One associate even went so far as to suggest that I get an apartment in DC, so it would be easier for me. I politely declined.
insideHPC: What are your thoughts on how we can attract the next generation of HPC professionals into the community – and provide them with the experience-based training that they will need to be successful?
Steve Wallach: That’s a tough one. Perhaps we should follow the lead of Apple’s App Store (or Android): Easily available and easy to use.
First of all, we need to increase the productivity of programmers. This generally means system manufacturers need to employ a higher level of co-design (designing hardware and software together). I believe that, as part of all major RFPs, there should be a section on programmer productivity. There’s nothing like losing a bid because of a lack of a productive software environment to bring about change. But innovations in hardware tend to come first. Then, the software is shoe-horned in to get things to work. I refer to this as “Pornographic Programming”: you cannot define it, but you know it when you see it.
insideHPC: What motivates you? What is your passion?
Steve Wallach: When someone says “that cannot be done,” my juices flow. Too many very smart people do not get a chance to really do their thing. Startups are the major places where innovation and risks take place. My latest passion is to figure out how to use Google calendar.
insideHPC: Are there any people who have been an influence on you during your years in this community?
Steve Wallach: There are three people who have had great influence on me. One person was Ken Kennedy. The high-performance computing community lost a giant and I lost a great friend. He convinced me that a compiler could be developed that could take VAX serial code and produce vector code. And he was correct. When I came up with the idea for Convey Computer, the first person I asked for advice was Ken. I flew down to Houston and spent four hours going over the concept. At the end, he said, “This can be done, but you need a world-class compiler team.” I responded, “I know where they are.” “Where?” Ken asked. I responded: “Right where I left them.”
Another influential person was Alan Deerfield of Raytheon. Alan was a pioneer in the design of DoD-specific signal processors. I learned all about FFTs, radar range gating, Kalman filtering, and more. I worked for Alan for five years in the early ‘70s. He taught me and showed me that small teams of highly motivated engineers are the most fun and accomplish the most. But you have to work hard. When motivated, working hard just comes naturally.
Lastly, Tom West of Data General showed me how to manage a group of highly motivated engineers and how to shelter the team from corporate politics. He was great at moving new products out the door, too. Tom definitely sets the gold standard for how to manage engineers. Plus, he taught me how to use high-dollar words such as “quintessential” and “canard.”
insideHPC: What “non-HPC” hobbies or activities do you have? If you ever really have ‘time off’ – how do you spend it?
Steve Wallach: I like to work out a lot – it’s kind of like training for work and keeps me mentally sharp. When I do have the time, I go to the horse track with my best buddy. I am trying to get back to my college “skill level” when shooting pool. When I can run a rack, again, I will be happy.
Also, I now have a granddaughter. Any opportunity to play with her is number one.
insideHPC: Approximately how many conferences do you attend each year? What would you say is your percentage of travel?
Steve Wallach: Let’s put it this way, in 2011, I will pass the eight-million-mile mark on American Airlines. Of course, that is not real miles traveled – that’s perhaps closer to 3.5 million real miles. And this ignores the miles on Southwest and various European and Asian airlines. My guess is that I attend, on the average, one conference a month.
insideHPC: How do you keep up with what’s going on in the community and what do you use as your own “HPC Crystal Ball?”
Steve Wallach: Attending conferences is certainly one way to see what is happening in the community. Prior to Convey, I was a contractor/consultant to Los Alamos for almost 10 years. That certainly helped me keep up. Also, I still perform due diligence for venture capitalists. Every once in a while there is an HPC type of deal, but those deals are relatively rare.
As previously described, I try to search out the HARD PROBLEMS. That is my crystal ball. And solving these problems may involve all types of technologies, hardware, software, and algorithms. I read a lot. When I find an interesting paper, I often send an email to the authors and begin a dialogue. I never know when I will use that body of knowledge.
insideHPC: What do you see as the most exciting possibility of what we can hope to accomplish over the next 5-10 years through the application of HPC?
Steve Wallach: I believe that HPC coupled with bioinformatics will lead to new ways to deal with all types of medical issues. We are already beginning to see some results. I hope one day, as described in episodes of Star Trek, we will genetically sequence a virus, take this sequence, model the behavior under certain conditions, and then synthesize a drug that hunts down the virus to destroy it.
insideHPC: What are your thoughts on HPC addressing what many are referring to as “the missing middle,” which I loosely interpret as a broad spectrum of small and mid-size businesses?
Steve Wallach: Again, as I mentioned, we need an HPC App Store. Any application that will aid small and mid-size businesses has to be easy to use and affordable. One would also expect these applications to be available in the cloud. That is already happening. However, the user interface to these applications is different for each cloud. That slows widespread adoption.
insideHPC: What do you see as the single biggest challenge we face over the next 5-10 years?
Steve Wallach: Well, the march toward exascale computing is upon us. The consensus is that getting to exascale will NOT be as straightforward as getting to petascale. I have some thoughts on this and will be presenting those thoughts at several SC10 panels. Clearly exascale will be discussed and debated.
But the absolute biggest challenge is finding a way to get more HPC performance for fewer watts into our data centers. Sure, that’s related to exascale, but it goes way beyond that. If we’re ever going to solve what we used to call the “grand challenge problems,” we need some way to overcome the laws of physics that we’re facing today. By that I mean general-purpose processors just can’t get much faster because they can’t get any hotter. Today if our HPC users say “we need more performance,” we just add another 30-kilowatt rack on the datacenter floor in an attempt to satisfy them.
The only way to do that is with heterogeneous computing, or to be more specific, with application-specific hardware. That means designing instruction sets that are absolutely specific to a particular application or class of applications. That is why technologies like FPGAs and GPGPUs are such a hot topic today—we’re all looking down the road and saying, “Where is this performance going to come from?”
And another thing, related to that, is that we do not need new languages. We need extensions to existing languages (like Fortran and C) that reflect the changes in computer architecture. In general, these are extensions that reflect the memory hierarchy within a node and the hierarchy among nodes. Please do not interpret this to mean I am against language research. I believe the results of this research can be reflected within current languages.
I do have one hot button in this area. I believe that MATLAB (MathWorks) is the easiest to use and the most productive HPC language. I use it all the time on my laptop. Users who are willing to accept a two-to-fourfold reduction in performance (relative to Fortran or C) can gain an order of magnitude more user productivity. So, a great example of application-specific computing would be to build MATLAB machines to eliminate this imbalance.
In my humble opinion, cloud computing is time-sharing in a contemporary architecture. There are many models of cloud computing, so there is no simple answer to whether cloud computing delivers real value. For many users and companies, having a shared resource and not having to deal with system administration and facilities is a very big win. In this area, cloud computing provides real value. Additionally, having access to additional resources for spikes in computing needs is a big win.
This series is about the men and women who are changing the way the HPC community develops, deploys, and operates the supercomputers we build on behalf of scientists and engineers around the world. John Shalf, this month’s HPC Rock Star, leads the Advanced Technology Group at Lawrence Berkeley National Laboratory, has authored more than 60 publications in the field of software frameworks and HPC technology, and has been recognized with three best paper awards and one R&D 100 Award.
Among the works he has co-authored are the influential “View from Berkeley” report (led by David Patterson and others), the reports of the DOE Exascale Steering Committee, and the DARPA IPTO Extreme Scale Software Challenges report that sets DARPA’s information technology research investment strategy for the next decade.
He also leads the LBNL/NERSC Green Flash project — which is developing a novel HPC system design (hardware and software) for kilometer-scale global climate modeling that is hundreds of times more energy efficient than conventional approaches — and participates in a large number of other activities that range from the DOE Exascale Steering Committee to Program Committee Chair for SC2010 Disruptive Technologies exhibit.
Shalf’s energy and dedication to HPC are helping to actively shape the future of HPC, and that’s what makes him this month’s HPC Rock Star.
insideHPC: How did you get started in HPC?
John Shalf: I spent a lot of time as a kid hanging out in the physics department and computing center at Randolph-Macon College (RMC) in Ashland, where I grew up. The professors there gave me (and other neighborhood kids) accounts on their IBM mainframe and Perkin-Elmer Unix minicomputer, and access to the supply rooms behind the classrooms, where there were hundreds of computing-technology artifacts such as 3D stacked core memories from old IBM systems and adders constructed from vacuum-tube logic. My friends and I spent a lot of time in the summers and after school in 4th through 6th grade exploring the back rooms and having the professors patiently explain what we were looking at and how it worked. We also got our first taste of the UNIX operating system and CRT terminals, albeit we learned more about playing Venture (a text video game) than programming.
When I was about 11, Dr. Maddry offered to teach me how to build a computer in exchange for my help cleaning up his lab during the summer. We actually had a race where he built a computer using a Z80 chip and I built mine using an 8080a. Both of our computers had 128 bytes (yes, bytes… not kilobytes) of memory, ran at 500 kHz (they could run faster if you turned off the fluorescent lights in the room), and were programmed using a set of DIP switches on the board. I still remember the 8212 tristate latches and the TTL discrete logic chips we needed to glue everything together. I had a blast building it, and just as much fun programming it, despite the rudimentary nature of the user interface and the low resolution of the display system (12 LEDs lined up in a row to show the data and the memory addresses). After that, I became hooked on computer architecture and machine design.
In college, I took my first HPC course and became interested in parallel computation. We got accounts on the HPC systems (Cray vector and IBM) at the NSF supercomputing centers. I was particularly fascinated with Thinking Machines systems, but also learned a lot about dataflow computing. Around this time, I also collected many old machines through surplus auctions to learn how they worked. I had quite a collection of PDP-8s and PDP-11s, and started the Society for the Preservation of Archaic Machines (SPAM). The chemistry department maintained many PDPs for their experiments, so they became a resource for manuals, circuit diagrams, advice on machine repair, and a FORTH interpreter that ran on top of RSTS.
During this time, I also discovered Ron Kriz’s vislab, where I developed an interest in computer graphics and visualization as another way to interact with the HPC community. Whereas I had been connected to computing only through my study of computer architecture and programming, the vislab and the work on programming and optimization of materials science codes for the Engineering Science and Mechanics (ESM) department opened me up to direct collaboration with science groups. It was there that I learned that the interdisciplinary collaborations in HPC are where the rubber hits the road. The fact that the pursuit of answers to scientific grand challenges requires such broad-based collaborations is what makes “supercomputing” so exciting.
insideHPC: What would you call out as one or two of the high points of your career — some of the things of which you are most proud?
Shalf: My first real job in HPC was at NCSA, where I divided my time between NCSA’s HPC consulting group (led by John Towns), Ed Seidel’s General Relativity Group, and Mike Norman’s Laboratory for Computational Astrophysics. These were the golden years for NCSA and the NSF HPC Centers program as well. NCSA Mosaic was just getting popular. I got to work on HPC codes on a variety of platforms. The LCA was developing its first AMR codes (Enzo). I got to learn how to work on virtual reality programs in the CAVE, and participated in national-scale high-performance networking test beds for the SC1995 IWAY experiment. There was such a wide variety of computer architectures — a Cray Y-MP, a Convex C3880, and a Thinking Machines CM5.
What an amazing time!
It was also a time of great transition, because it was clear that our vector machines were eventually going to be turned off and replaced by clusters of SMPs (SGIs and Convex Exemplars initially, followed by commodity clusters). It’s very similar to what is happening to the HPC community today as we transition to multicore. It was an exciting time to start in HPC. There were new languages like HPF, messaging libraries like PVM and P4, and MPI. It was unclear what path to take to re-develop codes for these emerging platforms, so we tried all of the options using toy codes. Everyone was busily creating practice codes to try out each of these emerging alternatives before re-developing their entire code base to survive this massive transition of the hardware/software ecosystem.
The first few implementations of the parallel codes worked, but revealed serious impediments to future, collaborative code development. When Ed Seidel’s group moved to the Max Planck Institute in Potsdam, Germany, Paul Walker and Juan Masso hatched a plan to create a new code infrastructure, called Cactus, to combine what we’d learned about how to parallelize the application efficiently and hide the MPI code from the application developers with clever software engineering to support collaborative, multidisciplinary code development. Cactus was so titled by Paul because it was to “solve thorny problems in General Relativity”. I had a huge amount of fun developing components for the first versions of Cactus, which is still used today (www.cactuscode.org). We had a huge sense of purpose and dedication to the development of the Cactus infrastructure — creating advanced I/O methods, solver plug-ins, remote steering/visualization interfaces, etc. I continued to work with subsequent Cactus developers (Gabrielle Allen, Tom Goodale, Erik Schnetter, and many others) for many years after leaving Max Planck to extend it for Grid computing and new computing systems. One of the first things the group did when I came to LBNL was to run the “Big Splash” calculation of inspiraling, colliding black holes on the NERSC “Seaborg” system. The calculation was ground-breaking in that it disproved a long-held model for the initial conditions of these inspiraling mergers, and its demonstration of what you could do with large-scale computing resources ultimately spawned the DOE “INCITE” program. The work with the Cactus team is one of the highlights of my career, even though there was a cast of hundreds contributing to its success.
The Green Flash project is also one of the projects that has been a lot of fun. Like Cactus, there are a large number of people working on different aspects of this multi-faceted project. I definitely love this kind of broad interdisciplinary work. We get to re-imagine computing architecture, programming models, and application design for the massively parallel chip architectures that we anticipate will be the norm by 2018. Our multi-disciplinary team is on the forefront of applying co-design processes to the development of efficient computing systems for the future. There are a lot of similarities between the move towards manycore, power-constrained architectures and the massive disruptions that occurred at the start of my career when everyone was moving from vectors to MPPs. It is exciting to have such an open slate for exploration, and a time for radical concepts in computer architecture to be reconsidered.
insideHPC: What do you see as the single biggest challenge we face (the HPC community) over the next 5-10 years?
Shalf: The move to exascale computing is the most daunting challenge that the community faces over the next decade. If we do not come up with novel solutions, then we will have to contend with a future where we must maintain our pace of scientific discovery without future improvements in computing capability.
The exascale program is not just about “exaFLOPS”; it’s about a phase transition of our entire computing industry that affects everything from cell phones to supercomputers. This is as big a deal as the conversion from vectors to MPI two decades ago. We cannot lose sight of the global nature of this disruption — this is not just about HPC. DARPA’s UHPC program strikes the right tone here. We need that next 1000x improvement for devices of all scales. Until recently we have been limited by costs and chip lithography (how many transistors we could cram onto a chip), but now hardware is constrained by power, software is constrained by programmability, and science is squeezed in between. Even if we solve those daunting challenges, science may yet be limited by our ability to assimilate results and even validate those results.
I think there is a huge problem with us conflating success in “exascale” with the idea that the best science must consume an entire exascale computing system (the same is true to some extent with our obsession with scale for “petascale.”). The best science comes in all shapes and sizes. The investment profile should be more balanced towards scientific impact (scientific merit, whether it is measured in papers or US competitiveness). There is a role for stunts to pave the way to understand how to navigate the path to the next several orders of magnitude of scaling. But the focus should definitely be more on creating a better computing environment for everyone — more programmable, better performing, and more robust.
We do have a tendency to say that the solution to all of our programmability problems is just finding the right programming model. This puts too much burden on language designers and underplays the role of basic software engineering for creating effective software development environments. Dan Reed once said that our current software practices are “pre-industrial,” where new HPC applications developers join the equivalent of a “guild” to learn how to program a particular kind of application. Languages and hardware play a role (just as the steam engine played a role in the start of the industrial revolution), but software engineering and good code structures that clearly separate the roles of CS experts from domain scientists (frameworks like Cactus, Chombo, and Vorpal) and algorithm designers are also critical areas that often get under-appreciated in the development of future apps.
insideHPC: How do you keep up with what’s going on in the community and what do you use as your own “HPC Crystal Ball?”
Shalf: For hardware design and computer science, attending many meetings to interact with the community plays an essential role in gauging the zeitgeist of the community. Given the huge amount of conflicting information, you need to talk to a lot of people to get a more statistical view of what technology paths are actually practical and what is just wishful thinking. Getting someone to talk over a beer is always more insightful for the “HPC Crystal Ball” than simply accepting their PowerPoint presentation or paper at face value. You have to constantly look at what other people are doing.
I’ve always enjoyed the SIAM PP (SIAM Conference on Parallel Processing for Scientific Computing) and SIAM CSE (SIAM Conference on Computational Science and Engineering) meetings as a great source for seeing ideas that are still “in progress.” Normally, conferences have a strict vetting process for papers. The presented work is usually thoroughly vetted and mostly complete. There is little opportunity to drastically change the direction of such work. However, the SIAM meetings support having people getting together through mini-symposiums to discuss work that is still in progress, and in some cases, is not fully baked. This is where there is a real exchange of wild ideas and new ways of thinking about solving problems. I think there is a role for both types of meetings, but I definitely see more of the pulse of the community in the SIAM mini-symposiums.
I also find that journals that are targeted more at domain scientists have a lot of information about future directions of the community. You quickly find out what is important and why. More importantly, you learn the vocabulary to actually communicate with scientists about their work.
insideHPC: What motivates you in your professional career?
Shalf: Scientists like to do things because they are interesting. Engineers like to do things that are “useful”. I’m an engineer who likes to hang out with scientists to get a bit of both the “interesting” and the “useful.” If I can do things that are both interesting AND useful, I’m very happy.
There is a recent article in Science (Vol. 329, July 16, 2010) entitled “Learning Pays Off.” It reported research showing that people who went into science because they were excited by the science, and not simply because they were good at math, were the most likely to continue in the field. This makes total sense to me. I’m just a science geek. I’m not a scientist or physicist by training, but I love to read Science and Nature from cover to cover whenever a new issue arrives. I just love to learn new things and explore. Supercomputing is a veritable smorgasbord of ideas and different science groups. The deeper I dive into my professional career, the more I learn and the more people I meet who have radically different perspectives on computing and on science. It’s so much fun to learn something new every day.
It’s also fun trying to be the man-in-the-middle to communicate between people with disparate backgrounds. Because of my diverse interests, my career has run the gamut from Electrical Engineering and computer hardware design, to code development for a scientific applications team, to computer science, and then back again to hardware design. I remember the perspective I had when I was in each of those different roles (when in EE, I thought the scientists were all just bad programmers, and when working for the apps group, I thought the hardware architects were just idiots who would not listen to the needs of the application developers). All of the interesting things happening in supercomputing are happening in the communication between these fields, and I love to be there, right in the middle. This is why co-design has become such a popular term: it’s where all of the action is today.
insideHPC: Are there any people who have been an influence on you during your years in this community?
Shalf: Many, many people. Nick Liberante, an English professor with uncompromising standards for excellence, taught me how to organize thoughts for writing, and the importance of memorization to facilitate that organization process. Ron Kriz taught me the value of persistence, of collaboration across multiple disciplines, and of being undaunted by the challenges of new and rapidly evolving technologies. Ed Seidel has had a huge influence on my career by launching me into the HPC business and teaching me how far you can push yourself if you set seemingly unrealistic stretch goals. Ed, together with Larry Smarr, Maxine Brown, and Tom Defanti, showed me that the “seemingly impossible” is within our grasp through ambitious demonstrations like the SC95 IWAY. Donna Cox taught me the magic that can result from bringing scientists and artists (seemingly disparate groups) together to create powerful communication media. Tom Defanti taught me the importance of articulating what I want to do (either by writing, or presenting to others) by saying “It’s not a waste of time if you have the right attitude. You are writing the future.” He also showed me how we can reinvent ourselves to take on new challenges as he went from CAVE VR display environments and jumped into high-performance international optical networking.
insideHPC: What type of ‘volunteer’ activities are you involved in — both professional activities within the community, and personal volunteer activities?
Shalf: I would say I’ve gotten way over committed to SC-related volunteer activities. In the past, I’ve spent some time helping with the LBL summer high-school students program. This year, I’ve gotten completely immersed in participating in the program committees and organization of HPC-related conferences. I’m on the program committees for IPDPS, ISC, ICS, and SC. It’s fun to participate in the organization and planning of so many different conferences, but it’s a lot of work. I would like to get back to working with the high school and undergraduate students to get them excited about this field.
insideHPC: How can we both attract the next generation of HPC professionals into the community, and provide them with the experience-based training that they will need to be successful?
Shalf: Well, first we should call it “supercomputing” rather than HPC if we want to attract new talent. It sounds interesting when a high-school kid says they want to work on supercomputers. If they say they want to work on High Performance Computing, they’ll have their underwear pulled up around their ears by the class bully in no time.
I ended up in this field because of the patience of a few physics professors at RMC when I was growing up. There is no degree in supercomputing (or HPC) because the field is fundamentally interdisciplinary. So you have to catch kids early to get them excited about the breadth of experiences that supercomputing can offer.
Closing Comments from John Shalf
We are back in a transition phase for our entire hardware/software ecosystem that is much like the transition we made to MPI. Times of disruption are also great times of opportunity for getting new ideas put into practice. The world is wide open with possibilities. It’s a great time to be involved in computing research.
This month’s HPC Rock Star is Marc Snir. During his time at IBM, Snir contributed to one of the most successful bespoke HPC architectures of the past decade, the IBM Blue Gene. He was also a major participant in the effort to create the most successful parallel programming interface ever: MPI. In fact, Bill Gropp, another key person in that effort, credits Snir with helping to make it all happen: “The MPI standard was the product of many good people, but without Marc, I don’t think we would have succeeded.”
Today Snir is the Michael Faiman and Saburo Muroga Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign, a department he chaired from 2001 to 2007. With a legacy of success in his portfolio, he is perhaps busier today than ever as the Associate Director for Extreme Scale Computing at NCSA, co-PI for the petascale Blue Waters system, and co-director of the Intel and Microsoft funded Universal Parallel Computing Research Center (UPCRC). Trained as a mathematician, Snir is one of the few individuals today shaping both high end supercomputing and the mass adoption of parallel programming.
Marc Snir (Erdős number 2) finished his PhD in 1979 at the Hebrew University of Jerusalem. He is a fellow of the American Association for the Advancement of Science (AAAS), the ACM, and the IEEE. In the early 1980s he worked on the NYU Ultracomputer Project. Developed at the Courant Institute of Mathematical Sciences Computer Science Department at NYU, the Ultracomputer was a MIMD, shared-memory machine whose design featured N processors, N memories, and an N log N message-passing switch between them. The switch would combine requests bound for the same memory address, and the system also included custom VLSI to interleave memory addresses among memory modules to reduce contention.
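As a toy illustration of that interleaving idea (not the Ultracomputer’s actual scheme; the module count and names here are made up for the example), low-order interleaving assigns consecutive word addresses to consecutive memory modules, so a stride-1 sweep spreads its references evenly across all modules instead of piling them onto a single one:

    #include <stdio.h>

    #define N_MODULES 8   /* hypothetical number of memory modules */

    /* Low-order interleaving: consecutive word addresses map to consecutive modules. */
    static int module_of(unsigned addr) { return addr % N_MODULES; }

    int main(void) {
        int hits[N_MODULES] = {0};
        for (unsigned addr = 0; addr < 64; addr++)   /* a stride-1 sweep over 64 words */
            hits[module_of(addr)]++;
        for (int m = 0; m < N_MODULES; m++)          /* every module ends up with the same load */
            printf("module %d: %d references\n", m, hits[m]);
        return 0;
    }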
Following his time at NYU, Snir was at the Hebrew University of Jerusalem from 1982 to 1986, when he joined IBM’s T. J. Watson Research Center. At Watson he led the Scalable Parallel Systems research group that was responsible for major contributions — especially on the software side — to the IBM SP scalable parallel system and the IBM Blue Gene system. From 2007 to 2008 he was director of the Illinois Informatics Institute, and he has over 90 publications that span the gamut from the theoretical aspects of computer science to public computing policy. Microsoft’s Dan Reed has said of Snir, “Marc has been one of the seminal thought leaders in parallel algorithms, programming models and architectures. He has brought theoretical and practical insights to all three.”
As readers of this series will know, a Rock Star is more than just the sum of accomplishments on a resume. We talked with Dr. Snir by email to get more insight into who he really is, what he thinks is important going forward, and what has made him so influential.
insideHPC: You have a long history of significant contributions to our community, notably including contributions to the development of the SP and Blue Gene. I was familiar with your MPI work, but not the SP and BG work while you were at IBM. Would you talk a little about that?
Marc Snir: The time is the late ’80s. The main offering of IBM in HPC was a mainframe plus a vector unit — not too competitive. Monty Denneau in Research had a project to build a large-scale distributed-memory system, called Vulcan, out of Intel 860 chips. His manager (Barzilai) decided to push this project as the basis for a new IBM product. This required changes in hardware (to use Power chips) — ironically, the original 860 board became the first NIC for the new machine — and also required a software plan, as well as a lot of lobbying, product planning, and so on. I got to manage the software team in Research that worked on this first incarnation of the product, developing communication libraries (pre-MPI), performance visualization tools, a parallel file system, and other aspects of the final system.
There was a lot of work to convince the powers that be in IBM to go this way, because at the time IBM mainframes were still ECL; a lot of joint work with a newly established product group to develop an architecture and a product plan; and then the first development in Research and the transfer of the code to development (Kingston and, later, Poughkeepsie). All of this work turned into the IBM SP, and the SP2 followed — it was the first real product and quickly became a strong sales driver for IBM. I continued to lead the software development in Research, where we did the first MPI prototype, created more performance tools, and did work on MPI-IO and various applications.
Blue Gene is a convoluted story (the Wikipedia entry is incorrect — I need to find time to edit it). At the time IBM had two hardware projects. One was Cyclops, headed up by Monty Denneau: a massively parallel machine with heavily multithreaded processors and all memory on chip. The other was a QCD (quantum chromodynamics) machine based on embedded PPC processors, under the direction of Alan Gara.
At the time, IBM Research was looking to push highly visible, visionary projects. I proposed to take the hardware that Monty was building, with some modifications, and use it as a system for molecular dynamics simulation (if you wish, an early version of D. Shaw’s Anton machine). IBM announced, with great fanfare, a $100M project to build this system, and called it Blue Gene.
I coordinated the work on BG and directly managed the software development. In the meantime, Al Gara worked to make his system more general (basically, adding a general routing network rather than nearest-neighbor communications) and started discussing this design with Lawrence Livermore National Lab’s Mark Seager. Seager liked it and proposed to fund the development of a prototype. At that point, the previous Blue Gene became Blue Gene C while Al Gara’s system became Blue Gene L (for Light). After a year BGC was discontinued — or, to be more accurate, heavily pared down (Monty has continued his work) — and BGL evolved into BGP, and then BGQ. I helped Al Gara with some of his design — in particular with the I/O subsystem, and with much of the software — and my team developed the original software both for Blue Gene C and for Blue Gene L.
insideHPC: Looking at your career thus far, do you have a sense that one or two accomplishments were especially significant professionally, either in terms of meeting a significant challenge or really spurring the community in a new direction?
Snir: I have had a fairly varied career. I started by doing more theoretical research. My first serious publication, in 1982, was a 55-page journal article in the Journal of Symbolic Logic on Bayesian induction and logic (probabilities over rich languages, testing, and randomness). It is still being cited by researchers in logic and philosophy. This is a long-term influence on a very small community, with (I believe) deep philosophical implications but no practical value.
Some of my early theory work was somewhat ahead of its time, and it continues to be cited long after publication. A paper on how to ensure sequential consistency in shared-memory systems (Shasha and Snir) has been the basis for significant work on the compilation of shared-memory parallel languages. I recently learned that a paper I published in 1985 on probabilistic decision trees is quite relevant to quantum computing — indeed it had some recent citations; I had to re-read it to remember what it was about. While many of my theory publications are best forgotten, some seem to have long-term value.
My applied research has been (as it should be) much more of a team effort — so whatever credit I take, I share with my partners. Pushing IBM into scalable parallel systems (as we called them, i.e., clusters) was a major achievement. Basically, we needed to conceive a complete hardware and software architecture, and execute with a new product team — essentially work in startup mode. That probably was the most intensive time in my career. Pushing Blue Gene was also quite intense. I probably wrote down half of the MPI standard — that’s another type of challenge: thinking clearly about the articulations of a large standard and convincing a large community to buy into a design. As department head in CS at U of Illinois I faced quite intensive but quite different challenges: growing a top department (from 39 to 55 faculty in 6 years), improving education, and changing the culture. Getting Blue Waters up and running at NCSA (developing the proposal, nailing down the software architecture, pushing for needed collaborations with IBM, etc.) has a similar flavor. I think that I feel the need to push large projects.
I realize that’s more than two, but I like all of them. If I have to pick, I’d pick the IBM SP product, just because it was the most intensive project, and the one that required the most “design from scratch,” with little previous experience. It also was an unqualified success.
insideHPC: If you were to answer that same question about the one or two accomplishments that mean the most to you personally, are they the same?
Snir: Well, I have very successful children and very good friendships. This means a lot, personally. But I must confess that, family and friends aside, professional achievement is what I care about.
insideHPC: You’ve spent time as a manager and department head, and time as an individual contributor. Is there one of those roles that you think fits your personal style or the kind of contributions you want to make? Asked another way, some people add the most value by doing, and others by creating an environment in which other people can do: which fits you best?
Snir: Hard choice. As a manager or department head I have done much more of the latter, creating an environment where other people can do. My individual contributions, especially in theory, are the former. I would say that I prefer the latter; doing is more of a hobby, a way not to lose contact with reality. Getting others to do is a way of achieving much more.
insideHPC: You’ve talked about the need for effective parallelism to be accessible by everyone, but some argue that parallel programming is fundamentally hard and that you can either have efficient execution or ease of expression, but not both. Do you agree? Is this purely a software and tools problem, or is there a hardware component to the answer?
Snir: Processors have become more complex over the years, and software has not been too successful in hiding this complexity: it is increasingly easy to “fall off the cliff” and see a significant performance degradation due to a seemingly innocuous change in a code. Parallelism is one additional twist to that evolution, there is no doubt about it. Small changes in a code can make the difference between fully sequential and fully parallel. Also, there is no doubt that there is a tradeoff between performance and ease of programming: people who care about the last ounce of performance (cache, memory, vector units, disk) have to work hard on single-core systems and slightly harder on multicore systems. On the other hand, parallelism can be used to accelerate easy-to-use interfaces — e.g., Matlab, or even Excel — and can be used for bleeding-edge HPC computations.
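To make that concrete, here is a minimal sketch (mine, not Snir’s) of how a seemingly innocuous change flips a loop from fully parallel to fully sequential: the first loop’s iterations are independent and can be spread across cores with OpenMP, while the second, nearly identical loop reads the element written by the previous iteration and therefore cannot be parallelized as written.

    #include <omp.h>

    #define N 1000000

    void scale(double *a, const double *b) {
        /* Fully parallel: each iteration touches only its own elements. */
        #pragma omp parallel for
        for (int i = 0; i < N; ++i)
            a[i] = 2.0 * b[i];
    }

    void running_sum(double *a, const double *b) {
        /* One small change -- reading a[i-1] -- creates a loop-carried
           dependence, so this loop must run sequentially as written. */
        for (int i = 1; i < N; ++i)
            a[i] = a[i - 1] + b[i];
    }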
The only fundamentally new thing is that application developers who want to see a uniprocessor application run faster from one microprocessor generation to the next now need to learn about parallelism. This is a new (large) population of programmers, and it is the focus of UPCRC Illinois.
insideHPC: Parallelism (of the kind exposed to developers) at much less than supercomputing scale is a relatively new thing for developers. For decades the majority of applications have been developed for desktop boxes, with very few people working on software for large scale parallel execution. Today we have parallelism even in handheld devices, and the high end community is contemplating O(1B) threads in a single job. Is there a chance that the work to develop tools for “commodity” parallel programming will make high end programming easier, or are these fundamentally different communities? If different, what are some of the essential differences?
Snir: The HPC software stack has always been developed by extending a commodity software stack: OS, compiler, etc. Now the HPC software stack will be built by extending a (low-end) parallel software stack. I am inclined to believe that this will make the extension easier. There is also much cloud computing technology to be reused, e.g., system monitoring and error handling in large systems. Not much of this has happened yet, and I expect that the effect will be relatively marginal. As for the essential differences, this reminds me of the famous, apocryphal dialogue between Fitzgerald and Hemingway:
Fitzgerald: The rich are different than you and me.
Hemingway: Yes, they have more money.
Large machines are different because they have many more threads; HPC is different from cloud computing because its applications are much more tightly coupled. Sufficient quantitative differences become qualitative at some point.
insideHPC: I have been challenged at a couple of events where I have spoken lately about the necessity of getting to exascale, and the draining effect it is having on computational funding for other projects. Is it necessary that we push on to exascale? If so, why not take a longer trajectory to get there? Why is the end of this decade inherently better for exascale than the middle of the next?
Snir: Good question, and a question one might have asked at any point in time. It is not for supercomputing aficionados to make the case for exascale in 2018 or 2030; it is up to the different application communities to make the case for the importance of getting some knowledge earlier. Having more certainty about climate change and its effects a few years earlier may be well worth a couple of billion dollars — but this is not a calculation I can make; similarly for other applications.
There is another interesting point: Moore’s law is close to an inflection. The ITRS (International Technology Roadmap for Semiconductors) predicts a slowdown (doubling every 3 years) pretty soon; nobody has a path to go beyond 8 nm. Given that 8 nm is only a few tens of silicon atoms, we may be hitting a real wall next decade. There is no technology waiting to replace CMOS the way CMOS was available to replace ECL. This will be a major game changer for the IT industry in the next decade: the game will no longer be finding applications that can leverage the advances of microelectronics, but getting more computation out of a fixed power and (soon) transistor budget. I call research on exascale “the last general rehearsal before the post-Moore era.” Exascale research will push on many of the research directions that are needed to increase “compute efficiency.” Therefore, I believe it is important to push this research now.
insideHPC: Thinking about exascale, there seems to be broad agreement that it isn’t practical to build such a system out of today’s parts because of energy impracticalities. But when it comes to programming models, some people seem to favor an incremental evolution of the same model we use today (communicating sequential processes with something like MPI), while others want to totally start over (e.g., Sterling’s ParalleX work). I’ve been personally surprised by how well the current model extended to the petascale. What are your thoughts about evolution versus revolution in exascale programming approaches?
Snir: When I was involved with MPI almost 20 years ago, I never dreamed it would be so broadly used 20 years down the road. Again, this is not a black-and-white choice: one can replace MPI with a more efficient communication layer for small, compute-intensive kernels while continuing to use it elsewhere; one can take advantage of the fact that many libraries (e.g., ScaLAPACK) hide MPI behind their own, higher-level communication layer, and re-implement that layer on another substrate; one can preprocess or compile MPI calls into lower-level, more efficient communication code. One can use PGAS languages, which are essentially syntactic sugar atop one-sided communication. We shall need to shift away from MPI for a variety of reasons that include performance (communication software overhead), an increasing mismatch between the MPI model and the evolution of modern programming languages, the difficulties of working with a hybrid model, etc. The shift can be gradual — MPI can run atop ParalleX. But we have very few proposals so far for a more advanced programming model.
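As a rough illustration of the one-sided communication substrate Snir mentions (the layer over which PGAS languages put their syntax), the sketch below uses standard MPI one-sided calls (MPI_Win_create, MPI_Put) so that rank 0 writes directly into memory exposed by rank 1, with no matching receive on the target side. It is a minimal, illustrative sketch, not drawn from any of the codes discussed here.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Each rank exposes one double through an RMA window. */
        double local = (double)rank;
        MPI_Win win;
        MPI_Win_create(&local, sizeof(double), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);
        if (rank == 0) {
            /* One-sided: rank 0 writes into rank 1's window; rank 1 posts
               no receive, it only participates in the fence. */
            double value = 42.0;
            MPI_Put(&value, 1, MPI_DOUBLE,
                    1 /* target rank */, 0 /* displacement */,
                    1, MPI_DOUBLE, win);
        }
        MPI_Win_fence(0, win);

        if (rank == 1)
            printf("rank 1 now holds %.1f\n", local);

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }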
This series is about the men and women who are changing the way the HPC community develops, deploys, and operates the supercomputers we build on behalf of scientists and engineers around the world.
As part of the team that explored the Beowulf computing concept, Thomas Sterling has already revolutionized the high performance computing community once. But the author of six books and a raft of journal and conference publications isn’t ready to leave the hard work of change to someone else. His sights are set now on changing the way we use and build the supercomputers of today and the exascale monsters of tomorrow.
insideHPC: You have such a rich history in this community and have been involved in so many milestone activities — what would you call out as one or two of the high points of your career — some of the things of which you are most proud?
Thomas Sterling: In the broadest sense, I am most proud — or perhaps I should just say “grateful” — for being allowed to continuously stay in and contribute to the field of HPC for three decades since my graduate work at MIT. But if I had to pick two high points they would be Beowulf and ParalleX; the former more than a decade ago and the latter of current active engagement.
Beowulf was an experiment that explored the potential of ensembles of low-cost hardware and software to perform real-world science computation at unprecedented performance to cost. It was one of several projects in what has become known as “clusters,” but I think it had significant impact on the community. Today on the Top500 list: 1) the most widely used processor architecture is x86, 2) the most widely used network is Ethernet, 3) the most widely used OS is Linux, 4) the most widely used system configuration is the commodity cluster, and 5) the most widely used programming model is distributed-memory message passing. Beowulf was the first research project to implement and explore this synthesis of elements, many in their inchoate phase. We recognized that community education was as critical as technology in accomplishing this paradigm shift, and therefore through a series of tutorials and books we provided easy access to this approach. Admittedly, I had no idea that it would dominate HPC. In some sense I was just lucky to be at the right place at the right time with a budget, a need, and a good idea.
ParalleX reflects the next HPC paradigm shift; more accurately, it is an exploratory project spanning key system elements, from semantics through functionality to structures and mechanisms, whose synthesis is catalyzing the 6th Phase of HPC.
insideHPC: As the co-author of six books (an accomplishment in its own right), would you talk a little bit about your commitment to education: how we can both attract the next generation of HPC professionals into the community, and provide them with the experience-based training that they will need to be successful?
Sterling: As a faculty member in a computer science department at a major state university, I have become keenly aware of the challenge of education to attract and train the next generation to the field of HPC.
I teach an introductory course on HPC at the first-year graduate and senior undergraduate level (LSU CSC-7600). This course is a bit different from your conventional parallel programming course because it provides strong cross-cutting themes of parallelism, performance, and the system structures that determine the effectiveness of scalability on real-world systems. The students learn how to program in three different modalities (capability, cooperative, capacity) and the corresponding programming interfaces (OpenMP, MPI, Condor), along with the related system architectures (SMP, clusters, farms/clouds).
But such courses are not readily available at all of the thousands of colleges across the country. I hate the fact that a young man or woman is robbed of the choice of personal goals and opportunities simply due to their demographics and socio-economic circumstances. To counter this truly unfortunate aspect of the American condition, I have been working, perhaps inadequately, in the realm of distance learning, exporting the course in real time and in high definition to a few remote campuses. My external partner in this has been Professor Amy Apon at the University of Arkansas, and my colleagues at LSU have been Chirag Dekate and Hartmut Kaiser (among others), in combination with a staff of technology types led by Ravi Parachuri. Petre Holub at Masaryk University in the Czech Republic was the force behind the high-definition work, strongly driven by Ed Seidel and Steve Beck, directors at the Center for Computation and Technology at LSU.
My next book will be my first textbook, written for this course; it’s a lot harder to write a textbook than my previous efforts. Jose Munoz has been an advocate of this work, and we hope to expand it to other communities. We have run a “Beowulf Bootcamp” for two summers that involved high-school students, to get them excited about going to college and therefore (hopefully) to finish high school. With a dropout rate of one-third, such as that in Louisiana, we need to find ways to motivate our kids to aspire and excel. We should do an entire interview just on this topic. It’s so important and so inadequately addressed at this point.
insideHPC: Anyone researching your background can’t help but notice the long list of volunteer activities through which you have selflessly served this community. Why do you do that? I know from personal experience, there is seldom any real recognition for this type of service — the reward has to be internal. What motivates you to be so active in supporting the HPC community through these activities?
Sterling: Community engagement is critical to the success of the HPC field. In a sense, any discipline that is system oriented requires a community-system approach as it must engage the diversity of talents, expertise, and resources reflected by the many sub disciplines defining the diversity of components, technologies, and methodologies integrated within the single complex system. Therefore, it is out of necessity that one participate in, and sometimes lead, community-led forums that focus on the many enabling leading-edge issues.
When my colleagues and I conducted a number of meetings and tutorials on early Beowulf cluster implementation and application, we were doing this out of a necessity for technology transfer. When we conducted a number of workshops related to Petaflops computing, then orders of magnitude away from contemporary capability, it was a genuine pursuit of knowledge, perspectives, and concepts. Today, there is a dramatic surge in the domain of Exascale computing led by DOE and DARPA, with strong NSF participation as well. These are important exploratory meetings, both devising and guiding future work towards this challenging goal less than a decade away. I have been fortunate to be included in these initiatives.
Finally, I am honored by the number of presentations at conferences and workshops I am invited to give, and it is a pleasure to serve the community to the best of my ability in this way. In particular, I have enjoyed giving a presentation every year at ISC in Germany summarizing the accomplishments in the field of HPC during the previous year. This June will mark the seventh such talk, and I am grateful to Hans Meuer and the other program organizers for this opportunity. I guess, to be honest, it’s part of the fun.
insideHPC: Are there any people who have been an influence on you during your years in this community?
Sterling: Nothing is more humbling than reflecting on all of the colleagues who have contributed to one’s own accomplishments, and in my case there have been, and continue to be, many. To note any would be to fail to identify so many others. But with that acknowledgement of inadequacy, allow me to recognize a few who have had appreciable impact, in chronological order:
Bert Halstead was my doctoral thesis advisor at MIT, and it was from him that I learned the critical importance of deep thinking in the intellectual arenas of abstraction and models of computation, not just as a mental exercise but as important tools for innovation.
Jim Fischer of NASA Goddard Space Flight Center taught me the importance of enlightened but responsible management as a key element of collective achievement in HPC system advancement to serve science applications. It was he who empowered the Beowulf Project in the face of strong resistance, and prevailed.
Paul Messina, formerly of Caltech and currently at Argonne National Laboratory, has been my mentor in working within the HPC community, supporting it and being supported by it, and complementing individual accomplishment through the leveraging of group engagement. He led the CCSF initiative that in 1991 deployed the Intel Touchstone Delta at Caltech, which was the fastest open HPC system in the world at the time and the prototype of the successful line of Intel Touchstone Paragon computers of the 1990s. He and I co-authored my first two books together.
Larry Bergman of the Jet Propulsion Laboratory has been among my most important collaborators over almost a decade of accomplishment, performing at different times as my boss, my program manager, and my research partner. Without Larry, a decade of accomplishment in my professional career would most likely not have occurred. It was from him that I learned my limitations as he seamlessly complemented my strengths with his own to form an effective working partnership driving pursuit of advancement in HPC.
Ed Seidel created the new LSU Center for Computation and Technology, which embodied a unique melding of resources at the state and federal level, enabling aggressive and innovative HPC research both in end computational science and in systems technology (software and hardware). It was within this environment, and with the opportunities it afforded, that I have been able to conduct my most recent explorations and endeavors. At LSU CCT I am fortunate to work with a small group of research scientists who are making possible this research at the new horizon of HPC: Maciej Brodowicz, Hartmut Kaiser, and Steve Brandt.
As pivotal as these people have been to me at different stages of my career, in many cases providing real role models as well, there is a group of colleagues who have contributed, and are contributing, both to the field of HPC and to my own work. Since they are all very well known to your readership, I identify them in no particular order and without explanation: Bill Gropp, Bob Lucas, Dan Reed, Guang Gao, John Salmon, Al Geist, Jack Dongarra, Bill Carlson, Kathy Yelick, Horst Simon, Almadena Chtchelkanova, Thomas Zacharia, Burton Smith, Marc Snir, Pete Beckman, Hans Meuer, Jose Munoz, Peter Kogge, Fred Johnson, George Cotter, Bill Dally, Rusty Lusk, Bill Harrod, and Paul Saylor.
Oh, yes, and of course there was Don. What can I say? Without him you would not be conducting this interview, in all likelihood. It’s one of those strange things: a chance meeting (at MIT; he was a freshman and I a finishing doctoral student) that could easily never have occurred, and yet one’s life changes. Don Becker (now CTO at Penguin) and I collaborated on a number of projects, but it was he who developed the first Beowulf systems with a group of young, highly motivated implementers to realize my system concept and architectural strategy.
insideHPC: What “non-HPC” hobbies or activities do you have? If you ever really have time off — how do you spend it?
Sterling: There is very little time for extra-professional pursuits but I do, when time permits, engage in three activities beyond HPC:
Sailing is the only activity that I can get involved in during which I truly forget about work. Of course I achieve this also during very scary airplane landings in thunderstorms, but this is not by choice so it doesn’t count as a hobby.
The study of history, in particular the 3rd millennium BCE, in the Bronze Age. I am fascinated with how small groups of people catalyzed into large aggregations of what we would recognize as civilization, enabled or driven by technological advances and ad hoc experiments in political science.
The brain: got to love it. It fascinates me, and for no useful purpose over the last decade I have found myself pursuing knowledge related to brain structure, function, and emergent behavior. Thinking about thinking, nature’s own recursion. I suppose cognitive science thinks about the process of thinking about thinking. But I’m not there yet.
insideHPC: Approximately how many conferences do you attend each year? What would you say is your percentage of travel?
Sterling: While I don’t travel anywhere near as much as Jack Dongarra, I average about two and a half trips per month, although peak travel can reach four in any given month. I limit the number of general conferences to four or five a year but attend half a dozen focused workshops a year, which I find far more useful and productive. Of course, then there is the plethora of program and project meetings. I get much of my work done in the Admirals Club at DFW.
insideHPC: How do you keep up with what’s going on in the community, and what do you use as your own “HPC Crystal Ball?”
Sterling: Your electronic publication, and that of your competition, proves very useful in keeping up with the day-to-day incremental advances and offerings of the industrial community, with some valuable information on academic near-term accomplishments as well.
At this time of phase change in the field, the rate of advancement is too fast to be adequately represented by conventional professional-society journals. These are valuable for archiving, but not for timely communication. I find that direct contact with contributing scientists and institutions, both at organized forums like workshops and through unstructured side-bar private communication, serves me best. I am amused by the conventionality of the technical program committees of even small workshops on focused topics, and their frequent fear of including new work-in-progress research.
My “HPC Crystal Ball” is more of a lens focused in deep space at a narrow part of the sky rather than the total space. I exploit blinders to narrow my scope of interest. I could, of course, completely miss a major important development. But I use foundational challenges of the logic and physics of the space of concern to inform about future directions of the field. It doesn’t always work but it has provided a unique viewpoint.
insideHPC: There are people in our community that are motivated by the science and discovery that we enable others to make, and people that are motivated by the science and engineering of HPC itself. Where do you fall on that spectrum?
Sterling: Like many, the answer for me is “both” but in selective areas in either case.
On the end science and engineering side, two major classes of problems are of interest to me. The first comprises two examples of classical supercomputing that I feel are essential to the advancement of civilization: the development of controlled fusion, and the control of molecular dynamics at the atomic level. In the first case, we are challenged by the sheer scale of computation required, at Exascale and beyond. In the second case, we are challenged by the need for dramatic improvements in strong scaling.
The second class of problems that absolutely fascinates me is the broad family of dynamic directed graphs as applied to knowledge management, including but not limited to machine intelligence. Of course, I am strongly focused on your latter domain: the science and engineering of HPC itself. This is particularly the case as the leading edge of the community is now considering revolutionary hardware and software techniques to extend delivered performance in an era of flatlined clock rates and processor core design complexity. This is a very exciting time to be engaged in HPC system research.
insideHPC: What do you see as the most exciting possibility of what we can hope to accomplish over the next 5-10 years through the application of HPC?
Sterling: As excited as I am about the computers themselves, their true value lies in their role as the third pillar of science (complementing experimental observation and theoretical modeling). More than ever, civilization must rely on the strength of HPC in devising new methods to address major challenges in climate, energy, medicine, design search-space optimization, and national security.
But my real interest lies in a very different area of application: symbolic computing. I believe we are likely to encounter a renaissance in intelligent computing, not seen since the 1980s, because the need for intelligent systems is growing for knowledge management, data mining, real-time decision making, declarative human interfaces, robotics, target recognition, and many other problems that need to directly manipulate abstractions and their inter-relations rather than raw data. I am particularly intrigued by the new opportunities afforded for self-aware machine intelligence by the Petascale computing systems coming on line with hundreds of Terabytes of main memory and their concomitant memory bandwidth which is critical for effective symbolic computing. I believe that by the middle of the next decade (yes, 15 years from now) symbolic applications will compete with or exceed the demands for cycles of numeric intensive applications.
HAL, are you listening?
insideHPC: What do you see as the single biggest challenge we face over the next 5-10 years?
Sterling: Viva la Revolution! Or to paraphrase a worn-out expression: “It’s the execution model, stupid!”
We are at the leading edge of a phase change in HPC, the sixth by my count over a period of as many decades. As previously suggested, the last two decades have been dominated by the communicating sequential processes (CSP) model, which has served well for both MPPs and commodity clusters. But with the forced reliance on multi/many-core and the flirtation with GPU accelerators, the model is stretching past the yield point. A new model of computation will become essential before the end of this decade, when current technology trends will demand billion-way parallelism, latency hiding for tens of thousands of cycles, global address spaces that are not statically nailed to specific physical hardware, the ability to migrate flow control as easily as data, and increased programmer productivity (it just shouldn’t be this hard).
For me the biggest challenge we face over the next 5-10 years is this: what is the new model of computation for HPC to replace CSP? And given that answer, whatever it may prove to be (yes, I have my suspicions), how will such a model’s set of intrinsic governing principles influence the co-design of the new programming models (sorry, guys, there is no way around new languages), the runtime and operating systems, and, most exciting of all, the new core architectures? HPC system development is going to be fun again.
insideHPC: Any final thoughts you would like to share with our readers?
Sterling: HPC is multi-faceted. I spend too much time out on the fringe pushing the performance limits, but I have come to appreciate the challenge of strong scaling required to shorten the execution time of problems that already take far too long (many weeks and months) yet cannot effectively employ anywhere near the available processing resources. This isn’t some future Exaflops problem; these are real problems in AMR and molecular dynamics being worked today. To serve them effectively will require a real change in how we build hardware core architectures, because it is the inefficiencies in overhead, latency, and contention, as well as the poor use of available parallelism, that are inhibiting better scaling for these and other current applications.
From this you may infer that, no, all the focus on Exascale is not the right thing; but many of the challenges of Exaflops performance in the future are the same as some of the challenges of strong scaling in the present. The current generation of Petaflops machines, with the possible exception of Roadrunner, is really the end of the classical static CSP era. Core architectures of the future will have to incorporate hardware mechanisms that recognize the shared context of thousands or millions of like cores, rather than passing off all such interchange to the I/O subsystem, which is not optimized for inter-thread, cross-system interoperability.
Making a machine easier to use requires a machine that is easier to use, and that is not what we have been building over the last two decades. Too many people, including those in sponsoring agencies, wish the problem to be tractable through software means alone. It is not a software problem, at least not exclusively. It is a multi-level system problem; yes, including software, but only in conjunction with an enabling hardware solution as well. The new DARPA UHPC program recognizes this truth and is pushing for a holistic perspective towards innovative solutions. A consequence of that is that productivity will be enhanced and users will find systems easier to employ.
I will make an outrageous prediction: Exaflops systems in 2020 will be easier to use than Petaflops machines are in 2010.
This series is about the men and women who are changing the way the HPC community develops, deploys, and operates the supercomputers we build on behalf of scientists and engineers around the world and Ricky Kendall, this month’s HPC Rock Star, is at the center of enabling science on the largest computing systems the world has ever seen.
Kendall is the leader of the scientific computing group at one of the nation’s leading HPC facilities, the National Center for Computational Science at Oak Ridge National Laboratory, where he and his team help users get the most out of what is today the largest supercomputer in the world. But this isn’t a theoretical task for Kendall — he comes from the large scale application development trenches himself, having been part of the team that started NWChem, one of the leading community codes for computational chemistry. Kendall’s accomplishments put him in the center of the computational community, in a role we used to call a computational engineer when I was in graduate school. As he puts it, “The chemistry community often sees me as a computer jock, and the computer science community sees me as an applications person.”
Kendall is the kind of leader that the HPC community needs most: someone committed to making sure that the systems our community builds end up helping to move the world forward.
Ricky Kendall started his career as a staff scientist at Pacific Northwest National Laboratory where he was responsible for the development of computational chemistry in support of the waste remediation activities of the Environmental Molecular Sciences Laboratory (EMSL). Part of this work included development that eventually became the community chemistry code NWChem, an application that is in wide use today for a variety of problems of interest to the science and engineering communities.
But Kendall wasn’t solely focused on computational code development. During his time at PNNL he continued to develop his desire to help prepare the next generation of computational professionals by serving as an adjunct lecturer at Washington State University and working with high school students. Kendall says that the challenges were fun and rewarding, both for him and the students. “You learn a great deal when you have to explain things so that students can understand the topic,” he says. “You also learn what you thought was true may not be quite right.”
After leaving PNNL, Kendall headed to Ames Laboratory in Iowa where he served as a computational scientist. He took the teaching bug with him when he moved, and added an adjunct associate professorship at Iowa State University to his regular duties at the lab. In addition to developing his own understanding of the field, Kendall says that he also had the sense that he was filling a real need in our community. “At WSU and Iowa State University, the courses I mostly taught involved programming. I found that programming skills are not stressed by the CS curriculum at many schools, and felt I wanted to help students get those practical skills.” He also contributed to the strength of the HPC community directly by developing an HPC course at ISU. The course was geared toward learning different parallel programming models, which he says the students found challenging and useful, and which ultimately included students from aerospace engineering, chemistry, physics, and other departments across the campus.
As he was pursuing his “regular” job and keeping up with his teaching duties, Kendall also found time to publish, and the list of his publications is impressive not only for sheer quantity but for the diversity of topics, which range from low-level performance measurement to application and algorithm development. Kendall credits this unusual diversity to values instilled by his graduate advisor. “My advisor felt that students should have skills in both applications and theory and code development,” he explains, “and I found that I really liked doing the code development in addition to the application work. I find it rewarding being able to use a code I helped develop on the applications I’m interested in, knowing that the development was driven by the needs of the application space.”
It takes a village
Today at Oak Ridge, Kendall serves as the group leader for the Scientific Computing Group, a role that he describes as “definitely on the enablement side” of the computational spectrum. “My team’s focus is to help our users get the most out of the resources we have and plan to have at the facility. I have an amazingly talented team that does this job, and we have been reasonably successful in integrating with our user community and getting codes to scale to the size of our Jaguar system.”
Kendall’s experiences with education and mentoring and with large-scale application development make him uniquely suited to helping ORNL’s computational communities make effective use of systems like Jaguar, currently ranked #1 on the Top500 list of the world’s largest supercomputers. “For most of my career,” he says, “I have sat on the fence that separates applications guys and developers. The chemistry community often sees me as a computer jock, and the computer science community sees me as an applications person.”
But this perspective is extremely useful, Kendall explains, because leadership-scale science is a multi-disciplinary effort. “Many of the most successful applications on leadership computing facilities today have multidisciplinary teams. These teams have someone that understands the theory being used, the mathematics, the algorithms, computational science at scale, programming skills and core computer science skills. All are needed to make the application work on the leadership systems and be potential candidates for future systems. The successful applications plan for change and have ways to deal with how hardware evolves.”
Two handshakes of separation
Multi-disciplinary teams of this kind are really communities, and even a quick glance at Kendall’s resume reveals a commitment to the HPC community that goes beyond teaching and education. “The best advice I got when starting down the development path,” he continues, “was to steal what you can and only write the parts you have to. I think that still holds. The trick is to make yourself aware of what others are doing and how you might leverage it.”
For Kendall a key part of being aware of what others are doing is involvement in community events like the SC conference series, for which he is serving as the Technical Program Chair for SC10. This is a huge job, and it represents a significant commitment of time and energy above and beyond one’s day job and the rest of one’s life. I asked what drives him to put so much energy into what is, essentially, an optional activity. “There are many reasons to be involved in community efforts,” Kendall explains. “One is to help spread the word about the things you are doing as a scientist and as part of an organization. Another is the networking aspects of such involvement: you are no more than two handshakes away from anyone in the HPC community, and it’s important to make those connections for yourself, your students and your organization. In terms of building an organization and keeping it healthy, recruiting staff is an incredibly time consuming and interactive task. By being involved in such efforts as the SC conference you get a good feel of the overall community and help your recruiting efforts. You also learn what others are doing and can potentially leverage other activities in the community with your own scientific missions and goals. These kinds of grass roots connections can lead to collaborative efforts and new areas of research.”
As an educator, community leader, and technologist, Kendall has already helped move the HPC community through many transition points. What does he see as our next significant challenges? “Software is one of the biggest challenges we face,” he explains. “Exascale software is likely not going to look the way applications look today. We are at a turning point, and where we go next is an open question. In general, though, to get to the exaflops scale we are going to have to focus more on programming in the node. The path forward here is getting more powerful nodes, and lots of them. This means that as a community we will have to deal with multiple levels of concurrency and make that all work. This means that we will have to realistically bring together some of the old vector techniques, invent new many-core techniques, and utilize the scale of the nodes, all at the same time. There is no free lunch here, and there needs to be a lot of diversity available to the community to try different techniques and algorithms.”
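A hedged sketch of what those multiple levels of concurrency can look like in practice: MPI ranks across nodes, OpenMP threads within the node, and the compiler’s vector units within each thread. The decomposition and names below are illustrative assumptions, not taken from any ORNL code.

    #include <mpi.h>
    #include <omp.h>

    /* Node-local piece of a distributed dot product: OpenMP threads share
       the node, and the inner loop is a candidate for compiler vectorization. */
    static double local_dot(const double *x, const double *y, int n) {
        double sum = 0.0;
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; ++i)
            sum += x[i] * y[i];
        return sum;
    }

    double distributed_dot(const double *x, const double *y, int n_local) {
        /* Outermost level: message passing combines the per-node partial sums. */
        double partial = local_dot(x, y, n_local);
        double total = 0.0;
        MPI_Allreduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
        return total;
    }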
Getting this kind of diversity into the efforts we pursue on the way to exascale is going to mean adding room in the process for failure, with many incremental steps and missteps on the way to the final destination. “I often describe scaling codes to large core counts as playing ‘whack-a-mole,’ because you find and eliminate a bottleneck to scaling and something else pops its head up. The path to exascale is going to be a multidimensional whack-a-mole with really ugly moles! It’s going to be a lot of work, but there will likely be some fun rolled in along the way as well.”
As long as I’m useful
Kendall describes his role today as a “glue person” helping to join applications and computer scientists on teams that do some of the most advanced computational simulations in the world. This is a role that Kendall relishes, incorporating staff development and mentoring along with a deep understanding of technology and applications domains. “I decided to take the job at ORNL to help build the leadership computing facility and my team along with the rest of the division and our sponsors have been able to deliver on that front. We have the #1 system on the Top500 list, and we were able to work with our users who got 3 applications doing science at above 1 Petaflops of performance. I enjoy the enabler role and will continue in that vein as long as I’m useful.”
Bill Kramer has spent his career finding, catalyzing, and managing change in HPC. Early in his career he helped field the first production Unix-based supercomputer, and he has continued to design and commission some of the most innovative and successful computers of the past twenty years: he has fielded twenty large supercomputers, seven of which have been in the top 5 of the Top500. Kramer’s career choices have always drawn him to our community’s leading organizations, places that were changing something fundamental about what it means to be a supercomputer center. But he isn’t about change just for the sake of change: for Kramer it is a way to make sure that he stays fresh and does the best job he can for the people he is leading and for the people who use his systems.
He is the kind of leader that the HPC community, and just about everyone else, needs more of: someone focused on service to a community he believes in and on getting the job done for the benefit of all.
Today Bill Kramer is the deputy project director and co-principal investigator for the Blue Waters project at the National Center for Supercomputing Applications (NCSA), at the University of Illinois in Urbana-Champaign. This is ground zero for the first sustained PFLOPS (10+ PFLOPS peak) supercomputing center dedicated to diverse science and engineering; but it’s not really about the computer. Over the past several years Bill and his team have been focused on building the facility and designing a system that, when finally turned on next year, will probably be the largest system for open science in the world. But if you’ve been following what the Blue Waters team has been doing you’ll see that they have taken a radically different approach to the launch of this capability into the community.
Getting the system fielded is only the beginning of their efforts, not the end. The really innovative things the Blue Waters team is doing can be seen in their focus on training potential users, evangelizing the machine and its capabilities, and reaching out to new disciplines that should be able to benefit from the capability. In short, they are building a community around the resource: a community of users, architects, administrators, and developers that will work together and support one another once the machine is launched to, hopefully, conduct research that will change the world.
This is the perfect place for Bill Kramer. In talking with Kramer about his accomplishments, it is clear that he is one of those people who have driven their career paths with a guiding purpose. As he describes it, the common thread across all of the places he’s been in his career is that they were all setting the pace for HPC at the time.
William T. C. Kramer, PhD, started his career at the University of Delaware supporting code development for the College of Engineering. He helped develop applications and visualize datasets for the college’s various research projects on systems like DEC’s PDP-10 and VAX. Some of this work was on the systems side, working on device drivers and components of the operating system. These were the early days of Unix, and the University of Delaware was one of the early sites on the ARPANET. This put Kramer in a position to be hands-deep in the Unix kernel, making systems work with the new TCP/IP. From there he moved on to systems management, getting exposure to both the human and technical issues in running large systems for scientific users.
After a while at Delaware, Kramer started sending blind resumes out to NASA centers. “I always thought NASA was cool,” Kramer says. NASA Ames was about to field a Cray-2, the first production Unix-based supercomputer in the world, and they needed someone who knew how to run a multi-user computer system and someone with the system experience to make it all work. This was the first of the moves Kramer made into an organization undergoing change. “NASA was building a supercomputing center from the ground up, and it was a very exciting time both in terms of the organization and the technology,” he says.
In fielding the Cray-2, Kramer helped finalize several pieces of software that would eventually become staples in HPC centers around the world, from UNICOS to NQS. Eventually he moved from system engineering to development and then leadership as Ames continued to field supercomputers from Cray (the site actually tried to install an ETA-10, but ended up refusing to accept the system because it never worked). They also started experimenting with MPPs, including TMC and Intel systems, and an early IBM SP. He remembers that one of the big debates they had during his time running the high speed processing group was whether or not to allow interactive editing on the Cray. His position — in favor of interactive editing — eventually won the day, but not for the reasons you’d expect. “We argued that it made more sense in terms of the demand on system resources for users to be able to make small edits to files directly on the Crays, instead of incurring all the overhead of transferring the complete file off and then back on to the system for a small change.”
Kramer was then recruited to NERSC, another organization in the midst of tremendous change. They had just moved from Livermore to Berkeley, and they had set out to become a different kind of supercomputing center. “NERSC was focused on big science — results — rather than on just having lots of users,” he says, and that was a difference that attracted him. NERSC was also one of the first organizations to commit 100% of their production resources (“in with both feet” is how he puts it) to MPP systems in a time when vector was the norm. While at NERSC Kramer contributed to the evolution of the Cray T3E, ultimately becoming Deputy Division Director as he fielded IBM SPs and, most recently, the Cray XT4 known as Franklin, before moving on to NCSA to run the Blue Waters project.
Throughout all of these very challenging assignments, Kramer has remained dedicated to volunteer service. “These are very symbiotic commitments,” he says. “Certainly the organizations benefit, and I enjoy giving back to the community. But volunteer assignments are a great way to refresh my point of view and to develop new skills that, sometimes, end up helping out professionally.” Kramer says that a lot of what he has learned about managing people has come from experience in volunteer organizations. Over the years he has served in SCUBA organizations and volunteered in schools and community theaters. He also helped start the tutorials effort and graphics special interest group of the Digital User’s Group, and has been active in SIGGRAPH. But people are probably most familiar with his service to the SC conference series, which included a year as General Chair of the Conference in 2005 when he hosted Microsoft Chairman Bill Gates on the stage in Seattle.
“I try very hard to make sure I don’t get staid in my ideas. Volunteering is a great way to learn about yourself, and find new things you like to do that challenge you.”
Kramer is at the point in his career where he has the perspective to identify, and to be proud of, a few key accomplishments. His list is interesting as much for the kinds of items it contains as for the specific items themselves: facilitating the first ab initio turbulence simulation at NASA Ames, supporting the efforts to return to flight after the Challenger disaster, the first FAA certification of an aircraft change based solely on computation, and discoveries in the search for dark matter.
What is special about this list is that Kramer doesn’t include any of the contributions he made to machines, only to discoveries the machines made possible. Unlike many managers of supercomputing centers, including myself, Kramer has managed to stay connected to the work his machines make possible. “I have always tried to make sure I kept one technical activity to keep me connected to the work that supercomputers make possible.” This, he says, reminds him why he came to supercomputing in the first place, and makes him a better center manager.
In researching this article with Bill’s colleagues and co-workers, I continually received anecdotes of his “boundless energy” and “deep commitment” along with adjectives like “focused”, “tireless”, and “dedicated.” But how does he describe his own contribution? “I think the most value I bring is in making large, complex systems work well so that people can get something done with them.”
And that, in the end, is what an HPC rock star does.