The Strategic Foundation of Exascale Computing in the USA

Rich Brueckner, President, insideHPC

Exascale computing continues to work its way into the consciousness of wider and more diverse audiences, creating demand for more explanation of what it is and what the U.S. initiatives are that have forged the strategic foundation for making exascale in the U.S. a reality.

Two terms that often land in the middle of these conversations begging for some clarification are the Exascale Computing Initiative (ECI) and the Exascale Computing Project (ECP).

I frequently talk with colleagues and members of the High Performance Computing community, including other reporters and industry analysts, who tell me they are confused about how – and whether – ECI and ECP align. This is an interesting topic and one I believe we need to address as we move closer to the realization of the nation’s first exascale platforms.

The readers of insideHPC should be quite familiar with the ECP as we have covered the project extensively over the past several years. In addition, I think it is fair to say that most people who follow exascale are at least familiar with the ECP. However, when people hear about the ECP, they assume the project refers to everything exascale – including the standing up of the first systems. In fact, this is not the case, as I will explain in this article.

2016: The Exascale Computing Initiative (ECI)

This is where most people get confused. The ECI came into existence in 2016 as a partnership between two Department of Energy (DOE) organizations, the Office of Science and the National Nuclear Security Administration (NNSA). Scanning multiple documents and public presentations, we learn that the founding charter for the ECI was to accelerate research, development, acquisition, and deployment projects for the purpose of delivering exascale computing capabilities to the DOE labs by the early to mid-2020s.

That description in itself deserves a closer look.

The term “exascale computing capabilities” is rather vague. In the context of the ECI, it seems to refer to a full solution set – exascale systems or platforms, along with applications, software, facilities such as the DOE lab HPC centers, and even workforce development (recruiting, training, retention) – everything that needs to come together to achieve useful, productive exascale computing at the U.S. DOE labs.

Now here is what I think most of us have missed. To understand the relationship between the ECI and the ECP, we need only look at the mission components of the ECI.

The ECI is structured around four pillars:

  1. Investments to support DOE facility site preparation for pre-exascale and exascale platforms (self-explanatory: facility upgrades necessary for the future systems).
  2. Support for non-recurring engineering (NRE) activities with selected computer manufacturers to advance hardware roadmaps as necessary for delivery of exascale systems (funding hardware advancements that vendors might not have the incentive to do in this timeframe based on current market demand).
  3. The procurement of exascale-class systems (large HPC systems are procured by the labs and become assets belonging to those facilities).
  4. The Exascale Computing Project (ECP)

So, just to be clear, yes, the ECP is one of the four pillars or components of the ECI. I think for many folks, that’s clarification of a point that has typically been overlooked.

And it’s no wonder. Try searching for information on the Exascale Computing Initiative. You will find results such as a briefing from 2013, where the Exascale Computing Initiative is referenced on the last slide. We have not been able to find any public website for the ECI.

Only by drawing from various presentations and discussions have we been able to connect some of these dots. The ECI includes facility investments in site preparation and the non-recurring engineering (NRE) activities needed for delivery of early to mid-2020s exascale systems, funded by the Office of Science/ASCR (Advanced Scientific Computing Research) and National Nuclear Security Administration/ASC (Advanced Simulation and Computing) programs. These investments in facility site preparation and NRE activities are under the domain of the ECI, not the ECP.

The takeaway from this: the ECP is not responsible for the procurement and standing up of the exascale systems. That responsibility falls on the labs directly.

The Exascale Computing Project (ECP)

Beginning in FY 2016, funding from ASCR and ASC was transferred to the ECP.

The ECP funding from ASCR covers research and development activities in applications, and in partnership with NNSA/ASC, includes investments in software and hardware technology (components – not systems) and co-design as deemed necessary to develop functional, capable exascale computers.

But remember, as previously stated, the ECP is jointly funded by two sponsors – the Office of Science through the ASCR program, and the NNSA through the ASC program.

The NNSA/ASC Advanced Technology Development and Mitigation (ATDM) program also supports ECP’s development of exascale applications and, in collaboration with ASCR, provides investments in software and hardware technology and co-design activities deemed necessary for the effective use of exascale systems for national security applications.

The ECP, unlike its parent program, the ECI, has tremendous worldwide visibility. The ECP often uses the term “ecosystem” to describe its mission: “accelerating delivery of a capable exascale computing ecosystem for breakthroughs in scientific discovery, energy assurance, economic competitiveness, and national security.”

I think that resonates well with the HPC community and likely with most other audiences just starting to pay attention to HPC and wanting to learn about exascale. It makes a distinctive point.

Historically, HPC has been able to bring supercomputers to market with the ecosystem lagging behind by months and even years. In its most recent newsletter, the ECP describes the ecosystem this way:

The exascale ecosystem encompasses exascale computing systems, high-end data capabilities, efficient software at scale, libraries, tools, and other capabilities.

At the risk of oversimplifying all of this, if you think of investments in facilities and system procurements, that’s a part of the bigger picture, ECI.

If you think about the ecosystem as described above to enable functional, highly capable exascale computing in the U.S., then that’s the domain of the ECP.

If the ECP is successful in its mission, the U.S. will make a huge strategic leap forward, with fully functional, capable exascale systems and the necessary supporting ecosystem in place, enabling these new systems to be productive from day one. That will be a win for U.S. technology leadership, the U.S. industrial sector, and hopefully, the U.S. economy.

About the author:

Recently named as one of the Top 20 Big Data Influencers by Forbes Magazine, Rich Brueckner is an avid writer, publisher, and technology pundit focused on high performance computing. He acquired inside-HPC.com in 2010 and has since expanded his online publications to include inside-BigData, inside-Startups, and The Exascale Report. With over 25 years of HPC experience at Cray Research, SGI, and Sun Microsystems, Rich is known to many in the industry as “the guy in the Red Hat.”

When he’s not working, Rich keeps busy writing science fiction, cartoons, and parody films. You can check out his stories: Angels of Silence, The Observer Effect, The Three Magi of Katrina, Seven Meals from Chaos, Friends of the Fallen, The Guardian’s End, and Ghosts of the Indian Herb. He has also penned some short film scripts, including Jigsaw Falling into Place and Bardo. In non-fiction, Rich contributed the Foreword to Dark Matter, Dark Energy, Dark Gravity, a book about the Big Bang by Dr. Stephen Perrenod. He also wrote the Foreword to 72 Beautiful Galaxies, an interactive iBook by Dr. Stephen Perrenod.
