This week the Swiss National Supercomputing Centre (CSCS) announced a significant upgrade to the Piz Daint supercomputer in Lugano. As the first Cray XC30 system with Kepler GPUs, the 750-teraflop Piz Daint is now able to run climate simulations over three times faster than on previous systems.
To learn more, I caught up with Prof. Thomas Schulthess from CSCS, one of our esteemed Rock Stars of HPC.
insideHPC: What design elements were you looking for when you chose the Cray hybrid architecture?
Thomas Schulthess: Since we co-designed the systems along with important applications from our user program, we did not know at the outset whether we would build a multi-core system or a hybrid system with GPU or MIC. The architecture therefore had to allow for flexible node design and include a scalable network with sufficient performance in terms of global and injection bandwidth – and by scalable I mean in performance as well as price. Furthermore, we needed a programming environment that is compatible with what developers have on their laptops while still providing optimal performance at scale. Cray was able to provide all of this, and they were willing to collaborate on a rigorous evaluation of different node designs and processor types. This is how we ended up with the Cray hybrid architecture.
insideHPC: CSCS was an early adopter of GPUs for scientific computing. Has that investment paid off in terms of performance on real applications and energy efficiency?
Thomas Schulthess: We adopted GPUs and MIC early on, in prototypes and for application development, and this has paid off in all science domains of our user program. Take, for example, COSMO, which implements a regional climate model used by scientists at ETH Zurich as well as the meteorological service. This code has been optimized and works very well on our current multi-core systems. The newly refactored and further optimized code will run more than a factor of three faster on the hybrid systems (using the same number of nodes), and the energy to solution will be one seventh of what the same simulations consume today on our multi-core system. It is nice to see performance going up on real problems while power consumption goes the other way.
insideHPC: What area of code development for these hybrid systems remains the most challenging?
Thomas Schulthess: Hybrid systems force application programmers to think about their algorithms and how to expose parallelism while maximizing data locality in computations. This is what we should be doing anyway as we move code development toward exascale. The challenge in my view is not in any specific area, but in making sure that the algorithmic investments in hybrid codes make their way into tools, so that they are usable in other applications that build on the same motifs. We have to get out of the unsustainable software engineering situation in many domains of HPC, where today we optimize each code individually.
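To give a concrete flavor of what "exposing parallelism while maximizing data locality" means on a GPU, here is a minimal CUDA sketch – our illustration, not CSCS or COSMO code. It shows a one-dimensional three-point stencil, a basic motif behind the finite-difference operators in atmospheric models, where each thread block stages its working set in fast on-chip shared memory so every input value is fetched from global memory only once per block:

```cuda
// Minimal sketch of a data-local GPU stencil (illustrative only, not COSMO code).
// A 1D three-point average: each block stages its inputs, plus a one-element
// halo on each side, in shared memory so that every global-memory value is
// loaded once per block rather than three times.

#include <cstdio>
#include <cuda_runtime.h>

constexpr int RADIUS = 1;    // stencil half-width
constexpr int BLOCK  = 256;  // threads per block

__global__ void stencil1d(const float *in, float *out, int n) {
    __shared__ float tile[BLOCK + 2 * RADIUS];  // block-local staging buffer

    int gidx = blockIdx.x * blockDim.x + threadIdx.x;  // global index
    int lidx = threadIdx.x + RADIUS;                   // index into the tile

    // Stage this block's working set; out-of-range values are zero-padded.
    tile[lidx] = (gidx < n) ? in[gidx] : 0.0f;
    if (threadIdx.x < RADIUS) {  // edge threads also fetch the halo cells
        int lo = gidx - RADIUS, hi = gidx + BLOCK;
        tile[lidx - RADIUS] = (lo >= 0) ? in[lo] : 0.0f;
        tile[lidx + BLOCK]  = (hi < n)  ? in[hi] : 0.0f;
    }
    __syncthreads();  // the tile is now complete for the whole block

    // The stencil itself reads only fast on-chip memory.
    if (gidx < n)
        out[gidx] = (tile[lidx - 1] + tile[lidx] + tile[lidx + 1]) / 3.0f;
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));   // unified memory keeps the demo short
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = float(i);

    stencil1d<<<(n + BLOCK - 1) / BLOCK, BLOCK>>>(in, out, n);
    cudaDeviceSynchronize();

    printf("out[1] = %.3f (expected 1.000)\n", out[1]);  // (0 + 1 + 2) / 3
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Refactoring efforts like the one Schulthess describes for COSMO aim to capture exactly this kind of pattern in reusable tools, rather than hand-tuning it anew inside each individual code.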
insideHPC: Two of the six Gordon Bell Award candidates this year are from Switzerland. Do you think this is reflective of Swiss HPC leadership?
Thomas Schulthess: The Gordon Bell Award is such a narrow window into HPC that I’m not sure it is reflective of the field or of leadership in it. The two papers you refer to represent the highest end of what can be accomplished with leadership systems at the present time. In terms of broader-impact HPC, we have been running many significant development projects in Switzerland that would not be submitted to the Gordon Bell track at the SC conference. This year’s list of Gordon Bell finalists, however, is definitely reflective of U.S. leadership in supercomputing systems, despite what the Top500 list may tell you. All six finalists used one of the three leadership systems at U.S. Department of Energy labs: Titan, Sequoia, or Mira.
insideHPC: The computer center for Henry Markram’s Blue Brain Project will be in Switzerland. Can you tell me more about where that project stands today?
Thomas Schulthess: The new supercomputer of Henry Markram’s Blue Brain Project completed acceptance at CSCS just last week. It is a Blue Gene/Q system, and we are developing new strategies for dealing with the monumental memory requirements of cellular brain simulations in a three-way collaboration between EPFL, ETH Zurich/CSCS, and the IBM Zurich Lab. The new Blue Brain supercomputer will be the development system of the Human Brain Project (HBP), which will formally start its ramp-up phase next month. The large production system of the HBP will be built and operated at the Jülich Supercomputing Centre in Germany. Barcelona Supercomputing Center and CINECA will operate HBP systems for molecular simulations and data analytics, respectively. The kickoff meeting of the HBP will be held the week of October 7 in Lausanne, Switzerland.
For more details on the Piz Daint supercomputer upgrade, read the Full Story.