In April, NERSC finalized its contract with Cray — which was acquired by HPE in September 2019 — for Perlmutter, which will provide 3-4 times the capability of NERSC’s current supercomputer, Cori.
“NERSC is excited to disclose new details about the impact of this technology on Perlmutter’s high performance computing capabilities, which are designed to enhance simulation, data processing, and machine learning applications for our diverse user community,” said Nick Wright, who leads the Advanced Technologies Group at NERSC and is the chief architect of Perlmutter.
The A100, the first GPU based on NVIDIA’s Ampere architecture, is a 7-nanometer processor with more than 54 billion transistors. It features a number of technical advances, including:
- Multi-Instance GPU (MIG) technology, a new feature that enables a single A100 GPU to be partitioned into as many as seven separate GPU instances (see the sketch after this list)
- Third-generation NVLink™ technology that enhances high-speed GPU-to-GPU interconnectivity
- Third-generation Tensor Core technology that increases throughput and efficiency
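From an application's point of view, each MIG slice made visible to a job (for example, through a `MIG-<UUID>` entry in `CUDA_VISIBLE_DEVICES`) appears as an ordinary CUDA device with a reduced share of the chip. A minimal sketch of how a program would enumerate whatever devices or slices it can see, using only the standard CUDA runtime API:

```cpp
// list_devices.cu -- enumerate visible CUDA devices. Under MIG, each
// A100 slice exposed to the process shows up here as its own device
// with a correspondingly smaller SM count and memory size.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("device %d: %s, %d SMs, %.1f GiB\n",
               i, prop.name, prop.multiProcessorCount,
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}
```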
“NVIDIA is bringing Tensor Core functionality up to double precision with its Ampere architecture,” said Jack Deslippe, group lead for NERSC’s Application Performance Group. “This is particularly exciting for HPC users because it enables key dense-linear-algebra-like routines to achieve an additional 2x in performance.” According to Deslippe, two applications currently computing at NERSC, NWChemEx and BerkeleyGW, have already prototyped use of this new functionality and are seeing close to a 2x increase in performance on Ampere over NVIDIA’s previous-generation Volta processor.
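The double-precision Tensor Cores are reached through the usual libraries rather than a new API: on A100, cuBLAS can route DGEMM through the FP64 Tensor Core path without source changes. A minimal sketch of such a call, with hypothetical matrix sizes that are not drawn from the article:

```cpp
// dgemm_a100.cu -- plain cuBLAS DGEMM; on A100, cuBLAS can dispatch
// double-precision GEMM to the FP64 Tensor Cores automatically.
// Compile with: nvcc dgemm_a100.cu -lcublas
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <vector>
#include <cstdio>

int main() {
    const int n = 4096;                      // hypothetical matrix size
    const double alpha = 1.0, beta = 0.0;
    std::vector<double> hA(n * n, 1.0), hB(n * n, 2.0), hC(n * n, 0.0);

    double *dA, *dB, *dC;
    cudaMalloc(&dA, n * n * sizeof(double));
    cudaMalloc(&dB, n * n * sizeof(double));
    cudaMalloc(&dC, n * n * sizeof(double));
    cudaMemcpy(dA, hA.data(), n * n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), n * n * sizeof(double), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);
    // C = alpha * A * B + beta * C (column-major, no transpose)
    cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                &alpha, dA, n, dB, n, &beta, dC, n);
    cudaMemcpy(hC.data(), dC, n * n * sizeof(double), cudaMemcpyDeviceToHost);
    printf("C[0] = %f\n", hC[0]);            // expect 2.0 * n

    cublasDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```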
This is the latest development in NERSC’s efforts to prepare users for the next-generation GPU processors that will be featured in the heterogeneous Perlmutter supercomputer, alongside the system’s AMD CPUs. Nearly half of the workload running at NERSC is poised to take advantage of GPU acceleration, and NERSC, HPE, and NVIDIA have been working together over the last two years to help the scientific community prepare to leverage GPUs for a broad range of research workloads.
“Using the NVIDIA Volta GPUs currently installed in NERSC’s Cori system, we’ve been adding GPU acceleration to our applications, optimizing GPU-accelerated code where it already exists, and targeting changes that take advantage of the Ampere GPU architecture,” Deslippe said.
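One common porting path, and the subject of one of the GTC talks listed below, is OpenMP target offload, in which directives mark loops for GPU execution. A minimal, hypothetical sketch of the pattern (the loop and data are illustrative, not from any NERSC application):

```cpp
// saxpy_offload.cpp -- a hypothetical loop offloaded with OpenMP target
// directives; requires an offload-capable compiler, e.g.
//   nvc++ -mp=gpu saxpy_offload.cpp
#include <cstdio>
#include <vector>

int main() {
    const int n = 1 << 20;
    std::vector<float> x(n, 1.0f), y(n, 2.0f);
    const float a = 3.0f;
    float *xp = x.data(), *yp = y.data();

    // Map x to the device, y both ways; spread iterations over GPU teams.
    #pragma omp target teams distribute parallel for \
            map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (int i = 0; i < n; ++i)
        yp[i] = a * xp[i] + yp[i];

    printf("y[0] = %f\n", y[0]);  // expect 5.0
    return 0;
}
```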
Examples of these efforts can be found in several presentations given by NERSC staff during NVIDIA’s virtual GTC 2020 conference, held March 23-26:
- Brian Friesen, Doug Jacobsen, Integrating NVIDIA Tesla V100 GPUs into a Cray System for a Diverse Simulation, Machine Learning, and Data Workload
- Charlene Yang, Mauro Del Ben, Accelerating Large-Scale GW Calculations in Material Science
- Chris Daley, Accelerating Applications for the NERSC Perlmutter Supercomputer Using OpenMP
- Debbie Bard, Doug Jacobsen, Workload Management for Complex Workflows on a GPU-Enabled Heterogeneous System
- Jack Deslippe, Jonathan Madsen, Muaaz Awan, Enabling 800 Projects for GPU-Accelerated Science on Perlmutter at NERSC
- Sam Williams, Charlene Yang, Roofline Performance Model for HPC and Deep-Learning Applications
“In addition to supporting traditional simulation codes, Perlmutter was designed from the outset to be a world-class resource for DOE’s rapidly growing experimental data analytics and learning workloads,” Wright said. “We look forward to seeing what amazing science results our users obtain on Perlmutter.”