Results are now in from the Extreme Scaling Workshops held recently at centres of the Gauss Centre for Supercomputing (GCS) in Germany. With 20 participating teams, the workshops were designed to improve the computational efficiency of applications by extending their parallel scalability across the hundreds of thousands of compute cores of the GCS supercomputers JUQUEEN and SuperMUC.
“More and more users jump at the opportunity to fully exploit the massive potential of our petascale HPC infrastructure by making their applications scale across major parts of our HPC systems or even the entire machines. This is what our infrastructure has been created for. Our systems’ capabilities provide our users with unique opportunities to run calculations of dimensions that could never have been tackled before,” emphasizes Professor Thomas Lippert, GCS Chairman of the Board and Director of the Jülich Supercomputing Centre (JSC). Professor Arndt Bode, Director of the Leibniz Supercomputing Centre (LRZ), adds: “Software and application developers, too, are recognizing the opportunities offered by our state-of-the-art HPC technologies and are adapting their programs and codes to fully leverage the massive compute power of our Tier-0 infrastructure. This not only results in reduced computing time but also in verifiable factors such as lowered energy consumption per compute run and consequently in reduced operational costs, aspects that must not be neglected given our continued pursuit of improved energy-efficient supercomputing.”
Following the success of previous years, the GCS centres again set the stage for their most ambitious HPC users to investigate and improve the scalability of their applications. For a limited number of days, the centres’ Tier-0 systems were freed entirely from their regular production workload and made available exclusively to the participants of the extreme scaling workshops. To enable the best possible results, the centres’ hardware and software specialists and HPC experts provided additional support. The results were worth the effort:
- VERTEX, a code from the Max Planck Institute for Astrophysics, Garching/Munich, used to simulate supernova explosions, started out on 7,360 SuperMUC compute cores at a measured 53 seconds per compute step. After successful code optimization, a compute step running in parallel on 144,000 SuperMUC compute cores took just 3 seconds, a speedup of roughly 18 times (53 s / 3 s ≈ 17.7).
- Code_Saturne of the UK’s Daresbury Laboratory, a computational fluid dynamics (CFD) tool chain for billion-cell calculations, scaled two preconditioner+solver configurations to 1.75 million threads on JUQUEEN. An older, MPI-only version of the code had scaled to 1.5 million processes; the latest version, which combines MPI and OpenMP, is twice as scalable.
- Seven-League Hydro Code of the Heidelberg Institute for Theoretical Studies (HITS), a code for multidimensional simulations of hydrodynamics in stellar evolution, was scaled beyond its previous maximum of 8 racks to all 28 racks of JUQUEEN with 1.75 million threads.
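
The VERTEX timings in the list above can be put in context with a short calculation. The following is a minimal sketch, not part of the workshop results: it assumes both runs processed the same problem size per compute step (the article does not say so), and the function name `scaling_summary` is hypothetical. Note that the measured gain combines code optimization with the larger core count, so the efficiency figure is only a rough indicator.

```python
def scaling_summary(cores_before, secs_before, cores_after, secs_after):
    """Return (core_ratio, speedup, efficiency) for two timed runs.

    Assumes both runs solve the same problem per step (strong scaling),
    which is an assumption made here, not a statement from the article.
    """
    core_ratio = cores_after / cores_before   # how many times more cores
    speedup = secs_before / secs_after        # how many times faster per step
    efficiency = speedup / core_ratio         # 1.0 would be ideal scaling
    return core_ratio, speedup, efficiency


# VERTEX figures quoted in the article: 7,360 cores at 53 s per step,
# then 144,000 cores at 3 s per step.
core_ratio, speedup, efficiency = scaling_summary(7_360, 53.0, 144_000, 3.0)

print(f"core ratio: {core_ratio:.1f}x")   # core ratio: 19.6x
print(f"speedup:    {speedup:.1f}x")      # speedup:    17.7x
print(f"efficiency: {efficiency:.0%}")    # efficiency: 90%
```

Under these assumptions, a roughly 17.7-fold speedup on about 19.6 times as many cores corresponds to a parallel efficiency of about 90 percent.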