The Atos Perspective for Exascale – A Race Beyond 10^18

Sponsored Post

The promise of Exascale – the next big thing in HPC

The incessant pursuit of scientific and research discovery, together with the advent of new technologies such as Artificial Intelligence (AI), is driving the need for High-Performance Computing (HPC). On one hand, the data deluge is growing at an exponential rate; on the other, the complexity of that data is putting pressure on conventional digital simulation. The need for more computing power and the emergence of silicon innovation make Exascale, or 10^18 Flops, a feasible near-term target. Exascale machines will make a difference in specific areas, e.g. biological and medical research (vaccine discovery), climate and weather forecasting, and materials science for new materials addressing technology challenges (better photovoltaics, superconductors…). Exascale will allow researchers and scientists to predict complex parameters and their interactions with unprecedented detail and accuracy, enabling much higher-fidelity, more predictive simulations in less time. These simulations will generate unparalleled breakthroughs, making a profound impact on science, the economy, and society.

Exascale entails multiple challenges beyond achieving exaflop performance

  • Performance: When trendsetters noticed that AI could be run far more effectively on graphics processing units (GPUs), a dramatic rise in processor research and development was triggered. There is now a wide range of processing units on the market, e.g. IPUs (Intelligence Processing Units), TPUs (Tensor Processing Units), and FPGAs (Field Programmable Gate Arrays), all accelerating performance [1]. While these innovations are the driving force behind performance gains, they also force us to work on heterogeneous computing architectures that accommodate multi-technology processors with augmented interconnect, storage, data management, and cooling systems.
  • Artificial Intelligence: AI is on its way to becoming mainstream in applications across industry and academia. The convergence of HPC and AI brings numerous benefits. On the hardware side, GPUs and IPUs are powerful accelerators for reaching Exascale performance and already deliver unprecedented performance in many HPC centers today. On the software side, AI can be embedded in the HPC software management layer to enhance application performance. Furthermore, leveraging HPC to train ML/DL models in AI applications helps clients accelerate the simulation of complex data and reduce time-to-insight. AI also raises other issues, such as ethics, reliability, robustness, and scalability.
  • Quantum computing: Quantum is moving from hype towards reality. Almost two years ago, we thought we needed perfect qubits to carry out calculations. Because a huge number of physical qubits is needed to make one perfect qubit, and it is very difficult to integrate them on the same chip, we expected to need much more time. Since then, what is called NISQ (Noisy Intermediate-Scale Quantum) computing has emerged. We can already obtain results, especially by using hybrid algorithms in which one part is optimized for noise resistance and runs on a quantum chip, while the other part runs on a conventional chip [2]; a minimal sketch of this pattern follows this list. Quantum computing will be a paradigm shift and a powerful complement to HPC in the Exascale arena. How can we leverage qubits without compromise, and how should we architect hybrid HPC-quantum supercomputing to get the best from both domains?
  • Energy consumption and carbon footprint: Exascale also means delivering more Flops per Watt. This implies not only designing and building an optimized system architecture with cutting-edge processors and accelerators, but also bearing in mind that the software layer is critical to cluster management and application efficiency. In HPC, the energy consumed to produce a given simulation result in a reduced time is the most meaningful benchmark for a client; a worked example follows this list. Furthermore, cooling systems, which can represent a large portion of the total energy bill, are also critical to ensuring that processors and accelerators operate within their intended temperature range. Managing the cooling systems’ energy consumption is essential both to the electricity bill and to the 10^18 performance target.

A truly environment-friendly HPC also means tight control of the carbon footprint across the whole HPC life cycle, from the selection of raw materials, through manufacturing, to end-of-life management. Green HPC is a critical element in achieving Atos’ net-zero objective by 2028.

  • Security: While digital security remains high on all customers’ agendas, HPC challenges and requirements raise the security bar even higher. Not only are HPC simulations predominantly used to process sensitive or highly confidential data, often associated with government or state sovereignty, but implementing certain HPC security measures can also compromise performance, which is the fundamental value proposition of an HPC system. Thus, beyond cyberattacks, e.g. malware or ransomware threats and hardware and software vulnerabilities, HPC clients need to take more parameters into account to ensure the best trade-off between security and Exascale performance.
  • HPC-in-the-Cloud: The high-performance computing (HPC) community has generally been slower than other industry sectors to adopt the cloud, for several well-documented reasons: compromised performance due to virtualized compute instances; application availability and licensing difficulties; the complexity of setting up an HPC environment within a public cloud framework; requirements for low-latency interconnects and fast, parallel file systems; and security concerns. Today, however, many of these barriers have been removed or mitigated [3]. According to industry analysts, HPC cloud is now the fastest-growing segment of the HPC market. Organizations require more flexibility and simplicity in the way they provide end-users with HPC resources. Adopting hybrid HPC means that on-premises and cloud HPC will co-exist and complement each other in the Exascale journey; a toy placement sketch follows this list.
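
To make the hybrid quantum-classical pattern from the quantum bullet concrete, here is a minimal Python sketch, not an Atos implementation: a classical optimizer iterates over circuit parameters while a stubbed routine stands in for the noisy quantum expectation evaluation. All names and the toy cost landscape are illustrative assumptions.

```python
import numpy as np

def quantum_expectation(theta):
    """Stand-in for the quantum half of a hybrid algorithm.

    On real hardware this would prepare a parameterized circuit,
    measure it many times, and return a noisy expectation value.
    Here it is simulated classically, with added shot noise to
    mimic NISQ behavior.
    """
    ideal = np.cos(theta[0]) * np.sin(theta[1])  # toy cost landscape
    return ideal + np.random.normal(0.0, 0.02)   # NISQ-style noise

def classical_update(theta, step=0.1, eps=0.05):
    """Classical half of the loop: finite-difference gradient descent."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        shift = np.zeros_like(theta)
        shift[i] = eps
        grad[i] = (quantum_expectation(theta + shift)
                   - quantum_expectation(theta - shift)) / (2 * eps)
    return theta - step * grad

theta = np.array([1.0, 1.0])   # initial circuit parameters
for _ in range(100):           # alternate the two halves of the loop
    theta = classical_update(theta)
print("optimized parameters:", theta)
```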
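
As a worked example of the energy point, the arithmetic below relates Flops/Watt to energy-to-solution. The power and runtime figures are deliberately hypothetical, chosen only to show the shape of the calculation.

```python
# Energy-to-solution for a hypothetical Exascale run (illustrative numbers).
peak_flops = 1e18        # the 10^18 Flops Exascale target
power_mw = 20.0          # assumed system power draw, in megawatts
runtime_hours = 24.0     # assumed wall-clock time for one simulation

flops_per_watt = peak_flops / (power_mw * 1e6)  # efficiency metric
energy_mwh = power_mw * runtime_hours           # energy consumed by the run

print(f"efficiency: {flops_per_watt:.1e} Flops/Watt")
print(f"energy to solution: {energy_mwh:.0f} MWh")
# Halving the runtime at the same power halves the energy to solution,
# which is why time-to-result matters as much as raw efficiency.
```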
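
Finally, a toy sketch of the hybrid HPC idea from the last bullet: a placement function that chooses between an on-premises cluster and the cloud. The fields and thresholds are invented for illustration and do not describe any real scheduler or Atos product.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    needs_low_latency: bool    # tightly coupled MPI workload?
    input_data_tb: float       # input data already held on-premises
    onprem_queue_hours: float  # current wait in the on-prem queue

def place(job: Job) -> str:
    """Decide where a job runs in a hybrid on-prem/cloud setup."""
    if job.needs_low_latency:
        return "on-premises"   # keep latency-sensitive MPI jobs local
    if job.input_data_tb > 10:
        return "on-premises"   # avoid moving large datasets out
    if job.onprem_queue_hours > 12:
        return "cloud"         # burst out when the local queue is long
    return "on-premises"

for job in [Job("cfd-coupled", True, 2.0, 1.0),
            Job("param-sweep", False, 0.1, 36.0)]:
    print(f"{job.name}: {place(job)}")
```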

Atos – embracing Exascale, delivering beyond the “10^18”

Exascale means more than 10^18 Flops or exaflop performance. For Atos, Exascale signifies delivering reliable, accurate, federated, and secure simulations anywhere (on-premises or in the cloud) at a reduced carbon footprint, at exaflopic scale. Data is a crucial asset to society. The potential behind the data, the ultimate actionable insight generated from it, is the eventual game-changer in scientific and industrial innovations and breakthroughs.

Stay tuned for our next paper, which will articulate how Atos embraces Exascale, turning constraints into objectives to unleash the full potential of 10^18 performance!

About the Author

Agnès Boudot, SVP, Head of HPC, AI & Quantum at Atos.

[1] “An unprecedented wave of innovations in processors” – Philippe Duluc

[2] “An unprecedented wave of innovations in processors” – Philippe Duluc

[3] “Delivering on the promise of hybrid multi cloud for HPC” – Andy Grant