Supermicro HGX-2 Cloud Server to Sport 2 Petaflops of AI Horsepower

Today Supermicro announced that the company’s upcoming NVIDIA HGX-2 cloud server platform will be the world’s most powerful system for artificial intelligence and HPC, capable of performing at 2 PetaFLOPS.

“Supermicro’s new SuperServer based on the HGX-2 platform will deliver more than double the performance of current systems, which will help enterprises address the rapidly expanding size of AI models that sometimes require weeks to train,” said Charles Liang, president and CEO of Supermicro. “Our new HGX-2 system will enable efficient training of complex models. It combines sixteen Tesla V100 32GB SXM3 GPUs connected via NVLink and NVSwitch to work as a unified 2 PetaFlop accelerator with half a terabyte of aggregate GPU memory to deliver unmatched compute power.”
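The headline figures in the quote can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the 2 petaflops figure is a peak number split evenly across the sixteen GPUs, and it uses decimal units (1 petaflop = 1,000 teraflops). The function and constant names are made up for this example.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumptions (not from the announcement): the 2 PFLOPS figure is a peak
# number divided evenly across GPUs, and 1 PFLOPS = 1000 TFLOPS.

GPU_COUNT = 16            # sixteen Tesla V100 GPUs
MEMORY_PER_GPU_GB = 32    # 32GB per GPU
SYSTEM_PEAK_PFLOPS = 2.0  # quoted system-level figure


def aggregate_memory_gb(gpu_count, memory_per_gpu_gb):
    """Total GPU memory across the system, in GB."""
    return gpu_count * memory_per_gpu_gb


def per_gpu_tflops(system_pflops, gpu_count):
    """Per-GPU share of the system figure, in TFLOPS."""
    return system_pflops * 1000 / gpu_count


total_gb = aggregate_memory_gb(GPU_COUNT, MEMORY_PER_GPU_GB)
share = per_gpu_tflops(SYSTEM_PEAK_PFLOPS, GPU_COUNT)

print(f"Aggregate GPU memory: {total_gb} GB")       # 512 GB, about half a terabyte
print(f"Implied peak per GPU: {share:.0f} TFLOPS")  # 125 TFLOPS
```

Sixteen 32GB GPUs give 512 GB, which matches the "half a terabyte" in the quote, and 2 petaflops over sixteen GPUs implies roughly 125 teraflops per GPU.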

From natural speech by computers to autonomous vehicles, rapid progress in AI has transformed entire industries. To enable these capabilities, AI models are exploding in size. HPC applications are similarly growing in complexity as they unlock new scientific insights. Supermicro’s HGX-2 based SuperServer (SYS-9029GP-TNVRT) will provide a superset design for datacenters accelerating AI and HPC in the cloud. With fine-tuned optimizations, this SuperServer will deliver the highest compute performance and memory for rapid model training.

Supermicro GPU systems also support the ultra-efficient Tesla T4, which is designed to accelerate inference workloads in any scale-out server. The hardware-accelerated transcode engine in Tesla T4 delivers multiple HD video streams in real time and allows integrating deep learning into the video transcoding pipeline to enable a new class of smart video applications. As deep learning shapes our world like no other computing model in history, deeper and more complex neural networks are trained on exponentially larger volumes of data. To achieve responsiveness, these models are deployed on powerful Supermicro GPU servers to deliver maximum throughput for inference workloads.

With the convergence of big data analytics and machine learning, the latest NVIDIA GPU architectures, and improved machine learning algorithms, deep learning applications require the processing power of multiple GPUs that can communicate efficiently as the GPU network expands. Supermicro’s single-root GPU system allows multiple NVIDIA GPUs to communicate with minimal latency and maximum throughput, as measured by the NCCL P2PBandwidthTest.
