NVIDIA Long-Haul InfiniBand at Purdue University – Extending Accelerated Research Across Campus

Sponsored Post

By Ran Holzman

For data-driven researchers, the time spent moving data between machines in different data centers slows computation and delays results. Data center space can also be limited for many organizations, even well-established academic campuses. To address these issues, Purdue University deployed NVIDIA InfiniBand MetroX across campus in 2011, connecting remote computation clusters to remote storage facilities. This resulted in higher facilities utilization, all without increased construction or expensive building retrofit costs.

The NVIDIA MetroX long-haul system builds upon the leadership momentum established by the NVIDIA Quantum InfiniBand platform. Extending the benefits of the NVIDIA Quantum InfiniBand platform beyond local data centers and storage clusters, MetroX systems enable connectivity between data centers deployed across multiple geographically distributed sites. At Purdue, MetroX long-haul InfiniBand has allowed researchers to run more complex simulations and further advance their cutting-edge research in many areas, such as climate change, aerospace, and molecular biology.

Broadening access to advanced computing

The NVIDIA MetroX solution unifies cross-campus systems, maintaining the high-speed access researchers require to perform intricate simulations, regardless of a researcher’s physical work location. This means that data center expansion and disaster recovery sites can reap the benefits of NVIDIA’s extremely fast interconnect solutions—with no performance degradation.

Purdue University is also the home of Anvil, a powerful new supercomputer that provides advanced computing capabilities to support a wide range of computational and data-intensive research, spanning from traditional high-performance computing to modern artificial intelligence applications. Today, Purdue is planning the future of its campus supercomputing capabilities, bringing NVIDIA Quantum InfiniBand and A100 GPUs to Anvil, and even extending it to four edge labs with MetroX. This brings centralized computing to remote scientific experiments and extends it to community clusters.

Industry Partnerships to Cross-Campus Research

MetroX solutions extend NVIDIA Quantum InfiniBand capabilities to distances of up to 40 km, aggregating data and storage networking over a single, consolidated fabric. MetroX will soon expand scientific discovery already underway at Purdue in the areas of visualization, cryo-electron microscopy, digital agriculture, and even hypersonics. Additionally, MetroX will connect the new Discovery Park District to Anvil, where industry partnerships, coupled with advanced AI techniques, are helping to further advance Purdue as a national leader in campus-wide supercomputing.

More about MetroX Solutions

MetroX long-haul RDMA technology guarantees high-performance, high-volume data sharing and load balancing between distant sites, enabling data center expansion, disaster recovery, data mirroring, and long-distance connectivity. The high-bandwidth, low-latency MetroX solution also simplifies load-balancing configurations and is designed for high availability and redundancy. That is, if there’s ever a problem with one of the appliances, the other takes over.

Additionally, the MetroX GUI-based web management system provides alarms, event history, an activity log, and performance monitoring for all optical modules. For managing scale-out computing environments, the system can be coupled with NVIDIA Unified Fabric Manager (UFM®) software. UFM enables data center operators to efficiently provision, monitor, and operate the modern data center fabric.

NVIDIA MetroX is a cost-effective, low-power, easily managed, and scalable solution for running over a pre-installed fiber infrastructure. From across the campus to around the globe, NVIDIA MetroX long-haul systems extend the benefits of InfiniBand to efficiently move high volumes of data between data centers. Delivering the highest performance and lowest latency with a complete fabric management solution, MetroX solutions are well suited to data center recovery and to making better use of remote storage or compute infrastructures.

Learn more at:

NVIDIA InfiniBand Long-Haul Systems

https://www.rcac.purdue.edu/anvil

https://www.purdue.edu/newsroom/releases/2020/Q2/purdue-receives-10-million-from-national-science-foundation-for-anvil-supercomputer.html

https://discoveryparkdistrict.com/

About the Author: 

Ran Holzman, Senior Product Marketing Manager, NVIDIA

Ran Holzman is a senior product marketing manager for NVIDIA Networking, focused on high performance computing, AI, and InfiniBand technology. Since 2019, he’s also served as the switch, UFM, and Interconnect product manager. Ran holds a bachelor’s degree in Computer Engineering (B.Sc), and a master’s degree in Business Administration (MBA) from the Hebrew University of Jerusalem, Israel.