Ohio Supercomputer Center Launches ‘Ascend’ HPC GPU Cluster

Ascend GPU cluster (credit: Ohio Supercomputer Center)

COLUMBUS, Ohio — The Ohio Supercomputer Center (OSC) has officially launched Ascend, its new high performance computing (HPC) cluster for artificial intelligence (AI), data analytics and machine learning.

Ascend comprises Dell PowerEdge servers with 48 AMD EPYC CPUs and 96 NVIDIA A100 80GB Tensor Core GPUs, linked by NVIDIA NVLink and interconnected by the NVIDIA Quantum 200Gb/s InfiniBand platform. The center said Ascend triples OSC’s capacity for AI, modeling and simulation and adds 2 petaflops of theoretical peak performance. The Pitzer and Owens clusters have current capabilities of 5.5 petaflops, more than 20 petabytes of disk storage capacity and more than 23 petabytes of expandable backup storage.

The center said academic and commercial clients may request time on the cluster, which complements OSC’s existing Pitzer and Owens clusters. Ascend is OSC’s first dedicated graphics processing unit (GPU) platform.

“OSC developed Ascend in response to discussions with our client community, stakeholders and vendors, who identified an immediate need for greater GPU resources to process research and simulations that rely on AI, big data and machine learning,” said David Hudak, executive director of OSC. “We are pleased to be able to offer this major new resource to the HPC community and support client advancements in academic research and commercial technologies.”

Although Ascend is smaller than the Pitzer and Owens clusters, with just 24 compute nodes, “its peak performance is on par with all of the other systems, and that’s mainly because of the performance that the GPUs add to these nodes,” said Doug Johnson, OSC associate director.
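As a rough check on those figures, the short Python sketch below works out the per-node layout and a GPU-only peak estimate. The node, CPU and GPU counts and the 2-petaflop figure come from the article. The per-GPU rate of 19.5 teraflops (NVIDIA's published peak FP64 Tensor Core throughput for the A100) is an assumption that OSC does not cite, and the helper names are hypothetical.

```python
# Figures reported in the article.
NODES = 24
CPUS = 48
GPUS = 96
QUOTED_PEAK_PFLOPS = 2.0

# Assumption (not from the article): NVIDIA's published peak FP64
# Tensor Core throughput for a single A100, in teraflops.
A100_FP64_TENSOR_TFLOPS = 19.5


def per_node(total: int, nodes: int) -> int:
    """Return the per-node count, insisting that the split is even."""
    count, remainder = divmod(total, nodes)
    if remainder:
        raise ValueError(f"{total} does not divide evenly across {nodes} nodes")
    return count


def gpu_peak_pflops(gpus: int, tflops_per_gpu: float) -> float:
    """Aggregate GPU-only peak in petaflops (1 petaflop = 1,000 teraflops)."""
    return gpus * tflops_per_gpu / 1000.0


cpus_per_node = per_node(CPUS, NODES)
gpus_per_node = per_node(GPUS, NODES)
gpu_peak = gpu_peak_pflops(GPUS, A100_FP64_TENSOR_TFLOPS)

print(f"Per node: {cpus_per_node} CPUs, {gpus_per_node} GPUs")
print(f"GPU-only peak estimate: {gpu_peak:.3f} petaflops "
      f"(article quotes about {QUOTED_PEAK_PFLOPS:g})")
```

Under that assumption the GPUs alone account for roughly 1.87 petaflops, which is consistent with the 2 petaflops OSC quotes. The CPU contribution is not estimated here.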

The Ascend cluster allows clients to quickly move data externally through 25 Gbps Ethernet network interfaces and internally using NVIDIA’s Quantum platform with in-network computing capabilities, Johnson said.

“OSC’s client services and scientific applications teams will be available to help our clients determine if their applications can make good use of the Ascend GPUs,” Johnson said. “For some applications there is a large performance benefit for using the GPUs and Ascend will make it possible for our clients to tackle some problems that can’t be solved on our current systems.”

OSC tested Ascend from late October to mid-December with a select group of clients to review the performance of the system and solicit feedback from users.

“Part of the goal of the early-user period was to get a better understanding of how the user applications make use of the GPUs that we are supporting in the system,” Johnson said. “We will continue to improve the software and management of the system as we learn more from what we encounter supporting the early users and operating the system for a longer period of time.”

Yu Su, an assistant professor in the Department of Computer Science and Engineering at The Ohio State University, used Ascend during the early testing period for several research projects in the area of natural language processing. Ascend’s NVIDIA GPUs allowed Su to use BLOOM-176B, one of the largest neural network models, for the first time on OSC’s systems.

“Multiple other projects—ranging from question answering over large knowledge bases to knowledge-based reasoning and language interfaces to relational databases—in my group have been benefiting from the NVIDIA A100 GPUs on Ascend,” Su said. “Students have reported double to triple the speed of processing compared with the A6000 GPU servers my group has.”

Bargeen Turzo, a graduate student in the Steffen Lindert computational chemistry group at Ohio State, found that Ascend processed jobs predicting protein complex structures more quickly than the Pitzer cluster did.

“For some large proteins I was not even able to get a single prediction on Pitzer after running the calculation for multiple weeks,” Turzo said. “While on Ascend the same calculation finished in 12 hours.”

Ascend follows the same pricing structure as the Owens and Pitzer clusters. More information about Ascend’s hardware, software environment, file systems, batch specifics and connection details can be found on the Ascend cluster page of the OSC site.

In addition to launching Ascend, OSC is planning a new high performance computing cluster, slated for launch in 2023, that will replace its Owens cluster. The new cluster will initially run concurrently with Owens and complement Pitzer and Ascend.

“Dell Technologies is working with the Ohio Supercomputer Center to help industry and academic researchers pioneer in their respective fields with the latest in advanced computing technology and expertise,” says Rajesh Pohani, vice president of PowerEdge, Core Compute and High Performance Computing, Dell Technologies. “Ascend’s AI capabilities, enhanced by powerful PowerEdge XE8545 servers, will complement and significantly expand the advanced computing resources essential to engineering innovation and scientific discovery that is ultimately helping to move forward human progress.”

“Modern scientific computing is pushing the boundaries of HPC and AI at scale,” said Timothy Costa, Director of HPC and Quantum product at NVIDIA. “As businesses continue to innovate and expand their portfolios, NVIDIA will empower OSC and other key researchers worldwide to advance engineering and scientific discoveries.”

“AMD EPYC processors support researchers around the world with the performance and productivity needed to answer some of science’s biggest questions,” said Brock Taylor, director of high performance computing, AMD. “We’re excited the Ascend supercomputer will help the Ohio Supercomputer Center advance their mission to advance levels of artificial intelligence, machine learning, big data and data analytics.”

source: Ohio Supercomputer Center