At SC20: Intel Provides Aurora Update as Argonne Developers Use Intel Xe-HP GPUs in Lieu of ‘Ponte Vecchio’

Editor’s Note: the following story is an update of an earlier version published yesterday.

Intel and Argonne National Laboratory have announced they are using GPUs based on Intel’s Xe-HP microarchitecture and Intel oneAPI toolkits for development of scientific applications to be used on the Aurora exascale system — in anticipation of later delivery of Intel 7nm “Ponte Vecchio” GPUs, which will drive Aurora when the delayed system is deployed in 2022. The goal: use Intel’s heterogeneous computing programming environment so that scientific applications are ready for Aurora’s scale and architecture at deployment.

In an interview today, Jeff McVeigh, Intel VP/GM of data center XPU products and solutions, updated us on developments at Intel with direct bearing on Aurora, including the projected delivery of Ponte Vecchio (unchanged); on Aurora’s deployment (sooner than forecast yesterday by industry analyst firm Hyperion Research); on Intel’s “XPU” cross-architecture strategy and its impact on Aurora application development work ongoing at Argonne; and on the upcoming release of the first production version of oneAPI (next month), Intel’s cross-architecture programming model for CPUs, GPUs, FPGAs and other accelerators.

The Aurora system (A21), for which Intel is the prime contractor, had been scheduled for deployment in 2021 and was intended to be the first exascale-class system in the U.S. But in July, Intel disclosed that Ponte Vecchio would be delayed at least six months, meaning that HPE-Cray’s Frontier, powered by AMD CPUs and GPUs and designated for Oak Ridge National Laboratory, will be the country’s first exascale system.

Yesterday, at its HPC Market Update, industry analyst firm Hyperion Research said it expects Aurora to be delivered about 12 months behind schedule. But McVeigh told us today those expectations are pessimistic.

“We didn’t agree with what they said. I think they said delivery in 2022 and then it comes online in 2023,” McVeigh said. “I think that that was a little bit aggressive on their part, aggressive meaning we anticipate it being sooner.”

Regarding Ponte Vecchio, McVeigh said Intel’s shipment projections are unchanged since its statement in late July.

“We haven’t modified from that, which was sort of a window of the end of 2021 to the first half of 2022 timeframe,” said McVeigh. “I think that’s still the window that we’re describing for that work. We’ve still got work to do. Obviously, that’s more than a year from now, but we’re pretty pleased with some of the modifications we did based on the changes back in the July timeframe. So I won’t give any more public information at this stage on it, but we’re happy where that’s proceeding.”

He also did not add to Intel’s previous statements regarding the possibility of outsourcing Ponte Vecchio’s fabrication to a third party, commonly assumed to be either TSMC or Samsung.

“We’re not describing any sort of specific external partnerships,” said McVeigh, “but we have been public about the fact that we are utilizing both internal and external foundries for the product. So that’s still the case, (but) we’re still giving no…specifics on who it is.”

In its announcement yesterday, Intel said Argonne Leadership Computing Facility (ALCF) researchers are using software development platforms based on Intel Xe-HP GPUs as part of the Aurora Early Science Program and Exascale Computing Project, which are designed to prepare applications, libraries and infrastructure for Aurora. Intel and Argonne teams are working to co-design, test and validate several exascale applications.

“Having access to Intel Xe-HP GPUs and development tools enables Argonne developers to perform software optimizations across Intel CPUs and GPUs and investigate scenarios that would be difficult to replicate in software-only environments,” Intel said in its announcement. “The Xe-HP GPUs offer a development vehicle to Intel Xe-HPC GPUs (i.e., Ponte Vecchio) that will be used in the Aurora system.”

With oneAPI playing a central role, this is all part of Intel’s “XPU” heterogeneous strategy to support multiple architectures, including, for example, GPUs made by AMD and Nvidia, McVeigh said.

“Intel’s under a transformation,” he said, “we’re going from what we call a CPU-centric world to an XPU-centric world…, we really recognize that there’s just a diverse set of workloads and that one architecture is not appropriate for all of them. Oftentimes, they have workloads that have high portions of scalar code, matrix or vector code, or even spatial, and that’s really driven a lot of our acquisition activities… And we’ve been doing a lot of I would call organic development in the GPU space, to really provide a full portfolio of capabilities to our customers in the ecosystem so they can address their needs and allow them to mix and match whatever best suits their requirements.”

But McVeigh also said that using the oneAPI programming model to develop Aurora applications for Intel Xe-HP GPUs, in lieu of Ponte Vecchio Xe-HPC GPUs, will still leave programming work to be done once Ponte Vecchio and Aurora are shipped.

“A rule of thumb that we typically say is, you’re going to have a starting point of 80 percent or so of the code that will be optimized for the architecture based on a prior (architecture),” McVeigh said. “The fact that Xe-HP is based on the same architecture, it gives us that kind of confidence that that level of support will be there. Now a lot of the elements of Ponte Vecchio – it has a very large cache, it has different interconnect technology – those things will require a fair amount of work. But in terms of the base code, it should work completely.”

He said Argonne applications developers will have “quarters” (i.e., three-month periods) of optimization work to maximize performance with Ponte Vecchio. “But they’ll be able to get things running immediately based on the fact that it has the same kind of software support across (Xe-HP and Xe-HPC), but they’re really going to want to tune elements around memory access and intercommunication and all those things.”

Examples of the development work include:

  • The EXAALT project, which enables molecular dynamics at exascale for fusion and fission energy problems.
  • The QMCPACK project, which is developing Quantum Monte Carlo algorithms at exascale to improve predictions concerning complex materials.
  • The GAMESS project, which is developing ab-initio fragmentation methods to more efficiently tackle challenges in computational chemistry, such as heterogeneous catalysis problems.
  • The ExaSMR project, which is developing high-fidelity modeling capabilities at exascale for complex physical phenomena occurring within operating nuclear reactors to ultimately improve their design.
  • The HACC project, which is developing extreme-scale cosmological simulations at exascale that will allow scientists to simultaneously analyze observational data from state-of-the-art telescopes to test different theories.