Purdue’s ‘Anvil’ to Be Driven by Dell, AMD ‘Milan’ CPUs, Nvidia A100 Tensor Core GPUs

Another in a series of National Science Foundation supercomputing awards has been announced: a $10 million grant for a system to be housed at Purdue University that will support HPC and AI workloads and is scheduled to enter production next year.

The system, dubbed Anvil, will be built in partnership with Dell and AMD and will consist of 1,000 nodes, each with 128 cores from two 64-core AMD Epyc “Milan” third-generation 7nm CPUs, which are due out later this year. Purdue said the system will have a peak performance of 5.3 petaflops and will deliver more than 1 billion CPU core hours per year over five years to researchers within NSF’s Extreme Science and Engineering Discovery Environment (XSEDE). Anvil nodes will be interconnected with 100 Gbps Mellanox HDR InfiniBand. The system also will include 32 large-memory nodes, each with 1 TB of RAM, and 16 nodes each with four Nvidia A100 Tensor Core GPUs, which together provide 1.5 PF of single-precision performance.
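As a rough check on the core-hour figure, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope estimate, not Purdue's own accounting: it assumes 128,000 CPU cores in total (1,000 × 128), a non-leap year of 8,760 hours, and full availability. The function name and the `availability` parameter are hypothetical and used only for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored (assumption)


def core_hours_per_year(units: int, cores_per_unit: int,
                        availability: float = 1.0) -> float:
    """Estimate the CPU core hours a system can deliver in one year.

    `availability` is the assumed fraction of the year the cores are
    usable (1.0 means no downtime, which is optimistic).
    """
    return units * cores_per_unit * HOURS_PER_YEAR * availability


# Figures from the article: 1,000 units of 128 cores each.
total = core_hours_per_year(1000, 128)
print(f"{total:,.0f} core hours per year")
```

Under these assumptions the estimate comes to about 1.12 billion core hours per year, in line with the "more than 1 billion" figure; any downtime or reserved capacity would lower it.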

The system will leverage a diverse set of block and object storage technologies anchored by a 10+ PB parallel file system and boosted with more than 3 PB of flash disk, according to Purdue. Storage for active projects and archival data will be provided by Purdue’s Research Data Depot and Fortress archive.

Anvil will include features designed to broaden access, such as interactive computing and visualization capabilities, and an integrated web-based Open OnDemand gateway to Anvil’s software tools and compute nodes. A composable subsystem will enable cloud and container-based workflows to run alongside the system and will support scientific applications, including gateways, databases, high-throughput data ingestion pipelines and complex coupled modeling workflows. The system will also offer pathways to the Microsoft Azure cloud.

Anvil will be built alongside the university’s community cluster supercomputers, including the 2020 “Bell” system being built for the Purdue campus, leveraging the school’s infrastructure that includes high-capacity storage systems, high-speed networking and the ITaP (Information Technology at Purdue) staff that has deployed 14 supercomputers since 2008.

Preston Smith, executive director of research computing and a co-PI on the project, said Anvil will be optimized for traditional parallel computing for research in such areas as fluid dynamics and bioinformatics, and also for data science, AI and machine learning applications.

“Anvil also will serve as an experiential learning laboratory for students to gain real-world experience using computing for their science, and for student interns to work with the Anvil team for construction and operation. We will be training the research computing practitioners of the future,” he said.

The project is funded under NSF award number 2005632. Xiao Zhu, a computational scientist and senior research scientist for research computing, and Rajesh Kalyanam, a data scientist and software engineer and research scientist for research computing, are co-PIs on the project.

Comments

  1. AMD Epyc is just so much more advanced than Intel Xeon that it is no wonder all the supercomputers for over 2 years now have been built with them, including the one at Oak Ridge National Laboratory in TN. AMD, by way of Taiwan Semiconductor Manufacturing Company (TSMC), has shrunk its processor technology to 7 nm, which is half the size of the process Intel has been stuck on for over 5 years. I built a new AMD Ryzen 8-core system in 2019, which I am very pleased with.