DEEP-ER Project Paves the Way to Future Supercomputers


The good folks from the DEEP-ER team in Europe report that their multi-year project has come to a successful conclusion. Funded by the European Union, DEEP-ER is the short name for Dynamical Exascale Entry Platform – Extended Reach.

“The DEEP-ER project has created far-reaching impact. Its results have led to widespread innovation and substantially reinforced the position of European industry and academia in HPC. We are more than happy that we are granted the opportunity to continue our DEEP projects journey and generalize the Cluster-Booster approach to create a truly Modular Supercomputing system,” says Prof. Dr. Thomas Lippert, Head of Jülich Supercomputing Centre and Scientific Coordinator of the DEEP-ER project.

The DEEP-ER project has successfully updated and improved the Cluster-Booster architecture first introduced by its predecessor DEEP. The next step in the DEEP projects roadmap is the generalization of this concept towards a “Modular Supercomputer Architecture”. It will include further compute modules to support workloads that emerge from the confluence of HPC and Big Data Analytics.

The results from the project include:

  • DEEP-ER's highly efficient and scalable I/O and resiliency software, tightly integrated with novel memory and storage technologies, solves long-standing problems with large HPC systems, and the highly innovative system prototype promises new levels of performance and energy efficiency. Seven important European scientific and engineering applications have validated the DEEP-ER concept. These remarkable results were achieved in a close co-design collaboration between all hardware, software and application experts in the project.
  • The DEEP-ER system prototype implements the Cluster-Booster architecture first introduced in DEEP with leading-edge CPU and interconnect technology, adds novel non-volatile storage class memory (NVM) and introduces network-attached memory devices. The Cluster part uses Intel Xeon processors, and the custom-designed Booster employs second-generation Intel Xeon Phi CPUs. The Booster integration is based on Eurotech’s Aurora line, which ensures high density and energy-efficient hot water cooling, tested for inlet water temperatures of up to 50 °C. The EXTOLL TOURMALET interconnect fabric (entirely developed in Europe) delivers leading network bandwidths and latencies.
  • An integrated parallel I/O system provides applications with a choice of highly optimized and scalable I/O interfaces: the BeeGFS parallel file system, the parallel I/O library SIONlib and the Exascale10 MPI-I/O optimizations. These fully leverage the fast local and network-attached storage and minimise traffic to the global storage system, thereby improving both I/O efficiency and scalability.
  • Resiliency is a growing concern due to the high number of components required by large HPC systems. DEEP-ER addresses this with significant improvements in creating application checkpoints and restarting an application after a system fault. The local NVM and the network-attached memory enable very fast creation of checkpoints; a redundancy scheme (“buddy checkpointing”) protects the data in case of node crashes; and a novel task-based checkpointing scheme, based on the OmpSs programming model and the resiliency-related extensions of ParaStation MPI, supports fine-grained recovery of parallel application tasks.
  • All in all, seven key European HPC codes from science and engineering have guided the R&D efforts. In a co-design cycle, their specific requirements drove the DEEP-ER system and software architecture and design, while the existing codes were modernised to significantly improve performance and scalability and adapted to the DEEP-ER I/O and resiliency interfaces. As a result, the DEEP-ER applications now enable faster scientific discovery and better engineering solutions with greatly reduced energy use and lower costs, which benefits European research and industry alike.
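
For readers unfamiliar with the “buddy checkpointing” idea mentioned in the resiliency item above, the following is a minimal sketch of the general technique in Python. It is not the DEEP-ER implementation: the `Node` class, the ring-style pairing in `buddy_of`, and the in-memory dictionaries standing in for node-local NVM are all assumptions made purely for illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical compute node; dicts stand in for node-local storage."""
    rank: int
    local: dict = field(default_factory=dict)  # this node's own checkpoints
    held: dict = field(default_factory=dict)   # copies kept for another node
    alive: bool = True


def buddy_of(rank: int, size: int) -> int:
    # Assumption: ring pairing, each rank mirrors to the next rank.
    return (rank + 1) % size


def checkpoint(nodes, rank, step, state):
    """Write the checkpoint locally and mirror it on the buddy node."""
    nodes[rank].local[step] = copy.deepcopy(state)
    partner = nodes[buddy_of(rank, len(nodes))]
    partner.held[(rank, step)] = copy.deepcopy(state)


def crash(nodes, rank):
    """Simulate a node failure: everything stored on the node is lost."""
    nodes[rank].alive = False
    nodes[rank].local.clear()
    nodes[rank].held.clear()


def restore(nodes, rank, step):
    """Prefer the local copy; fall back to the buddy's mirrored copy."""
    node = nodes[rank]
    if node.alive and step in node.local:
        return node.local[step]
    partner = nodes[buddy_of(rank, len(nodes))]
    if partner.alive and (rank, step) in partner.held:
        return partner.held[(rank, step)]
    raise RuntimeError(f"no surviving checkpoint for rank {rank}, step {step}")


if __name__ == "__main__":
    nodes = [Node(r) for r in range(4)]
    for r in range(4):
        checkpoint(nodes, r, step=10, state={"rank": r, "value": r * 2})
    crash(nodes, 2)                      # node 2 loses its local data
    print(restore(nodes, 2, step=10))    # recovered from node 3's mirror
```

In this sketch a single node failure is always recoverable because the data also lives on the buddy; if a node and its buddy fail together, `restore` raises an error, which is the case a global storage system would have to cover.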
