The PRACE ISC Award goes to: Sustained Petascale Performance of Seismic Simulations with SeisSol on SuperMUC, which will be presented on Monday, June 23 at 1:15 pm.
Seismic simulations in realistic 3D Earth models require peta- or even exascale compute power to capture small-scale features of high relevance for scientific and industrial applications. In this paper, we present optimizations of SeisSol, a seismic wave propagation solver based on the Arbitrary high-order accurate DERivative (ADER) Discontinuous Galerkin method on fully adaptive, unstructured tetrahedral meshes, to run simulations under production conditions at petascale performance. Improvements cover the entire simulation chain: from an enhanced ADER time integration via highly scalable routines for mesh input up to hardware-aware optimization of the innermost sparse-/dense-matrix kernels. Strong and weak scaling studies on the SuperMUC machine demonstrated up to 90% parallel efficiency and 45% floating point peak efficiency on 147k cores. For a simulation under production conditions (10^8 grid cells, 4.8*10^10 degrees of freedom, 5 seconds simulated time), we achieved a sustained performance of 1.09 PFLOPS.
The paper was authored by Alexander Breuer, TU München; Alexander Heinecke, TU München; Sebastian Rettenberger, TU München; Michael Bader, TU München; Alice-Agnes Gabriel, LMU München; Christian Pelties, LMU München.
The Gauss Award goes to: Exascale Radio Astronomy: Can We Ride the Technology Wave? The paper will be presented on Monday, June 23 at 2:15 pm.
The Square Kilometre Array (SKA) will be the most sensitive radio telescope in the world. This unprecedented sensitivity will be achieved by combining and analyzing signals from 262,144 antennas and 350 dishes at a raw datarate of petabits per second. The processing pipeline to create useful astronomical data will require exa-operations per second, at a very limited power budget. We analyze the compute, memory and bandwidth requirements for the key algorithms used in the SKA. By studying their implementation on existing platforms, we show that most algorithms have properties that map inefficiently on current hardware, such as a low compute-bandwidth ratio and complex arithmetic. In addition, we estimate the power breakdown on GPUs and the cache behavior on CPUs, and discuss possible improvements. This work is complemented with an analysis of supercomputer trends, which demonstrates that current efforts to use commercial off-the-shelf accelerators results in a two to three times smaller improvement in compute capabilities and power efficiency than custom built machines. We conclude that waiting for new technology to arrive will not give us the instruments currently planned in 2018: one or two orders of magnitude better power efficiency and compute capabilities are required. Novel hardware and system architectures, to match the needs and features of this unique project, must be developed.
The paper was authored by Erik Vermij, IBM Research; Leandro Fiorin, IBM Research; Christoph Hagleitner, IBM Research; Koen Bertels, Delft University of Technology.