What is the Exascale Computing Project (ECP)?

The Exascale Computing Project (ECP) is a collaborative effort of two US Department of Energy (DOE) organizations: the Office of Science (DOE-SC) and the National Nuclear Security Administration (NNSA). As part of President Obama’s National Strategic Computing Initiative, ECP was established to develop a new class of high-performance computing systems whose power will be measured in exaflops (10^18 floating-point operations per second), or a thousand times more powerful than today’s petaflop machines. ECP’s work encompasses applications, system software, hardware technologies and architectures, and workforce development to meet the scientific and national security mission needs of DOE.

The goal of ECP is to deliver breakthrough modeling and simulation solutions that analyze more data in less time, providing insights and answers to the most critical US challenges in scientific discovery, energy assurance, economic competitiveness, and national security.

DOE formalized this long-term strategic effort under the guidance of key leaders from six of the major DOE and NNSA national laboratories: Argonne, Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, and Sandia.

In addition to implementing the National Strategic Computing Initiative, the ECP plays an important role in driving US technological competitiveness amid the convergence of HPC, big data analytics, and machine learning. ECP-funded research and development efforts will affect these topics across the spectrum of science and engineering domains and disciplines.

It is important to note that the ECP is not just a project to build extremely fast, large capacity supercomputers. The ECP addresses hardware, software, applications, platforms, and workforce development critical to the effective use of exascale computing environments.

ECP’s focus is on “capable” exascale systems. That means hardware, software, applications, platforms and facilities are co-designed and integrated to deliver sustained performance supporting key DOE missions and contributing to US economic competitiveness.

In this video from the 2016 Argonne Training Program on Extreme-Scale Computing, Paul Messina from Argonne presents: A Path to Capable Exascale Computing.

What is a ‘capable’ exascale system?

A capable exascale system is defined as a supercomputer that:

  • Solves science problems 50 times faster (or handles problems 50 times more complex) than today’s 20-petaFLOPS systems (Titan, Sequoia). (One petaFLOPS is a unit of computing speed equal to one thousand million million, or 10^15, floating-point operations per second);
  • Operates in a power envelope of 20-30 megawatts;
  • Is sufficiently resilient that user intervention due to hardware or system faults is minimized;
  • Has a software stack that meets the needs of a broad spectrum of applications and workloads.
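
The figures in this definition can be cross-checked with a little arithmetic. The sketch below is illustrative only: it treats "50 times faster" as a raw rate of floating-point operations, which is a simplifying assumption (the definition is stated in terms of application performance, not peak speed), and the function names are hypothetical. Under that assumption, 50 times a 20-petaFLOPS system is exactly one exaFLOPS, and fitting that rate into a 20-30 megawatt envelope implies roughly 33 to 50 gigaFLOPS per watt.

```python
# Unit sizes in floating-point operations per second (FLOPS).
PETA = 10**15
EXA = 10**18


def speedup_target(baseline_pflops: int, factor: int) -> int:
    """Rate in FLOPS implied by `factor` times a baseline given in petaFLOPS.

    Assumption: "faster" is read as raw FLOPS, not application time-to-solution.
    """
    return baseline_pflops * PETA * factor


def gflops_per_watt(flops: int, megawatts: float) -> float:
    """Energy efficiency needed to sustain `flops` within `megawatts` of power."""
    watts = megawatts * 1_000_000
    return flops / watts / 1e9


# 50x today's 20-petaFLOPS systems.
target = speedup_target(20, 50)

print(target == EXA)                          # True
print(gflops_per_watt(target, 20))            # 50.0
print(round(gflops_per_watt(target, 30), 1))  # 33.3
```

The power bounds show why the power envelope is part of the definition rather than an afterthought: the lower the power budget, the higher the efficiency each watt must deliver.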

The ECP’s goals are to:

  • Develop a broad set of modeling and simulation applications that meet the requirements of the scientific, engineering, and nuclear security programs of the DOE;
  • Develop a productive exascale capability in the US by no later than 2023, including the required software and hardware technologies;
  • Prepare two or more DOE Office of Science and NNSA facilities to house this capability;
  • Maximize the benefits of HPC for US economic competitiveness and scientific discovery.

How does ECP add value to what the DOE laboratories already are doing in terms of using HPC to advance scientific discovery?

The ECP leads the formalized project management and integration processes that bridge and align the resources of the DOE and NNSA laboratories, allowing them to work more effectively with industry. This includes integration with technology and system vendors and with software and application developers that goes beyond the specific needs and charters of any one laboratory. The ECP leadership team, composed of some of the most senior technology leaders of the DOE and NNSA HPC communities, is chartered with managing this complex, multi-year project. Their job is to take full advantage of existing infrastructure when feasible and to maximize project efficiency by managing resources and investments while accelerating research and development.

“The Exascale Computing Project offers a rare opportunity to advance all elements of the HPC ecosystem in unison,” ECP Director Paul Messina said. “Co-design and integration of hardware, software, applications and platforms, a strategic imperative of the ECP, is essential to deploying exascale-class systems that will meet the future requirements of the scientific communities these systems will serve.”

Why is ECP needed?

American leadership in HPC is being challenged as never before, and the stakes are high. The new computing technologies required to achieve exascale will eventually make their way into consumer products and the services that enhance US global economic competitiveness and improve our quality of life. The ECP provides a leadership team with HPC technology and complex project management expertise to ensure a coordinated, collaborative approach to defining and developing necessary future exascale ecosystems, maximizing the return on the nation’s investment in the computing that underpins scientific advancement, national security, and economic well-being.

How is the ECP funded?

DOE has a long history of supporting high-end computing system acquisitions at its national laboratories through the DOE ASCR (Advanced Scientific Computing Research) and NNSA ASC (Advanced Simulation and Computing) programs. With ECP, the DOE Office of Science and the DOE NNSA are jointly funding a coordinated multi-lab effort to avoid duplication, maximize efficiency and drive significant new efforts in terms of application readiness, hardware and software co-design, and workforce development.

What is ECP’s relationship to the National Strategic Computing Initiative?

DOE, along with the Department of Defense and the National Science Foundation, co-leads the National Strategic Computing Initiative. Within DOE, the Office of Science and the National Nuclear Security Administration execute the ECP, which is the primary DOE contribution to the NSCI.

How is the ECP structured?

The ECP is a 10-year project led by six DOE and NNSA laboratories and executed in collaboration with academia and industry. The ECP leadership team has staff from the six labs, but additional staff from most of the 17 DOE national laboratories will participate in the project.

Will the ECP lead the procurement of the nation’s first exascale supercomputers?

The procurement of future exascale-class supercomputers for the DOE and NNSA laboratories will be handled under the same base facility programs in place today, a process familiar to most HPC system and software suppliers. Prior to the procurement phase, the ECP team will help to establish the design, performance and implementation requirements of these future systems. ECP will play a key role in determining the requirements for hardware, software, applications, and facilities that will be reflected in the exascale Request for Proposal (RFP) documents.

SUMMARY

The ECP will also play a key role in helping to drive new training programs throughout the US HPC ecosystem to prepare application developers, researchers and scientists to take full advantage of future generation exascale environments.

Three aspects of the ECP’s contribution are unique: co-design that shapes both hardware and software development, a major effort to enhance application readiness, and an expansive user training effort. Together they are intended to bring the US to the forefront of the exascale computing era.
