DOE Under Secretary Dabbar’s Exascale Update: Frontier to Be First, Aurora to Be Monitored

As Exascale Day (October 18 [10^18]) approaches, U.S. Department of Energy Under Secretary for Science Paul Dabbar has commented on the hottest exascale question of the day: which of the country’s first three systems will be stood up first? In a recent, far-reaching interview with us, Dabbar confirmed what has been expected for more than two months: the first U.S. exascale system will not, as planned, be the Intel-powered Aurora system at Argonne National Laboratory. It will instead be HPE-Cray’s Frontier, powered by AMD CPUs and GPUs, to be installed at Oak Ridge National Laboratory.

This has been generally anticipated since late July, when Intel disclosed that its “Ponte Vecchio” 7nm GPU, integral to the Aurora (A21) exascale system – Intel is prime contractor – will be delayed at least six months.

The under secretary’s Frontier confirmation came after an exascale status review last week in which Dabbar met with Oak Ridge Director Thomas Zacharia, Argonne Director Paul Kearns and other senior DOE and national lab scientific computing managers, along with HPE President and CEO Antonio Neri and HPE GM of HPC and Mission Critical Solutions Pete Ungaro.

Referring to the labs (ORNL, Argonne and Lawrence Livermore, which will house the HPE-Cray-AMD El Capitan system) tabbed as exascale sites, Dabbar said, “They’re getting very close to the first machines that are going to be delivered next year. And the first one is going to be at Oak Ridge.” Which means Frontier.

“We were here today to make very specific updates on both the hardware and the operating system software stack for the machines…,” said Dabbar, who brings to his work at DOE a background in the military, investment banking and the sciences, “…where they are in the development cycle, a lot of the real details around the schedule and around completion of the different components and, ultimately, delivery of those machines.”

The good news from DOE’s perspective – assuming Frontier is up and running next year – is that the U.S. will stand up an exascale system in 2021, in accordance with DOE’s schedule. The bad news is that Aurora will not be the first American exascale-class supercomputer, as originally planned. On Aurora’s status, Dabbar avoided specifics but said senior officials at DOE and at the labs are monitoring the situation.

“We are in discussions with Intel about that,” Dabbar said. “I think we’re feeling good about the overall machine. I can’t go through exactly all the different options that we’re looking at for the Argonne machine, but we have a good degree of confidence that not very long after the Oak Ridge machine – that will be delivered as part of a plan to have at least one (exascale) machine up in 2021 – but not long after that we will have the Aurora machine, the Argonne machine, also. But the details are still being identified about exactly what we’re going to go through with Intel and their microelectronics. But we have confidence that machine also will be delivered and will be delivered right behind Oak Ridge.”

Along with Frontier’s prospective delivery in 2021, another bright spot for the U.S. exascale effort is that the country’s first exascale systems will not be released into a vacuum, according to Dabbar. Citing the strategy of DOE’s Exascale Computing Project (ECP), the under secretary said the first three exaclass systems will be complemented by an applications and development ecosystem under construction for several years – in short, ECP’s notion of co-designed “capable exascale.”

“Our major partners have different components of the hardware and software stack,” said Dabbar, “…and one of the things that will be occurring with the exascale program is the Shasta software stack that is … already developed to a large degree by HPE Cray. That’s something they’re developing as part of running their system. They actually have deployed early versions of that. Some of their other machines that they have that are not exascale, so they already have the earlier versions that have been de-risked.

“A big part of the deployment … is really about the whole system,” he continued. “And so the software stack that is going to be riding on top of the hardware is going to be integrated. So a lot of the discussion today, and part of the deployment, is that layering of the operating system on top of the hardware.”

Dabbar cited another DOE bright note, one that encompasses the push for exascale: the department’s mission and work are a rare instance of bipartisan agreement among the country’s political leadership. Beyond the lack of rancor surrounding DOE activities, Dabbar pointed to budgets: “off-site spending” for the national lab complex and grants for the university system is up 32 percent over the last three and a half years, he said, and this financial support extends to other scientific areas, including NASA and NIH.

“There’s been a pretty broad consensus around the importance of discovery, around science, around the economic impact for the country, and jobs and R&D,” he said. “In general, they were under-invested in for quite a while, and things have changed in a very bipartisan manner. The President signing all-time high support budgets for discovery has been wonderful for the sector, in terms of funding… I think this broad area of DOE and the national labs have been certainly at the high end of the additional support, and our budgets have been passed by wide bipartisan support and signed by the President.”

Part of that bipartisanship is tied, of course, to support for U.S. economic and military competitiveness, which relates directly to the country’s technological standing vis-à-vis Europe, Japan and, in particular, China. Yet Dabbar discouraged the notion that the U.S. is in “a race to exascale.”

“We don’t really focus on the specifics of what is going to be deployed first and when in the world,” he said. “I don’t focus on that data. And certainly, we congratulated Riken and the Japanese for their accomplishment (Fugaku, the current top-ranked supercomputer), and we obviously work significantly with Japan on all sorts of things in the sciences…

“What we focus on is what we think we can achieve in terms of (compute) performance and the impact that performance can have on discovery,” said Dabbar. “We know, in general, the computing sector in the US and microelectronics chips, the US leads the world. So we should be number one. And if we’re not, we’re not putting the right focus by DOE and the federal government. Because our community, the private sector and academia, is driving microelectronics and architectures.”