Today NVIDIA unveiled the NVIDIA DGX-2: the “world’s largest GPU.” Ten times faster than its predecessor, the DGX-2 is the first single server capable of delivering two petaflops (we’re talking AI Flops here, folks) of computational power. DGX-2 has the deep learning processing power of 300 servers occupying 15 racks of datacenter space, while being 60x smaller and 18x more power efficient.
“The extraordinary advances of deep learning only hint at what is still to come,” said Jensen Huang, NVIDIA founder and CEO, as he unveiled the news at GTC 2018. “Many of these advances stand on NVIDIA’s deep learning platform, which has quickly become the world’s standard. We are dramatically enhancing our platform’s performance at a pace far exceeding Moore’s law, enabling breakthroughs that will help revolutionize healthcare, transportation, scientific exploration and countless other areas.”
Combined with a fully optimized, updated suite of NVIDIA deep learning software, DGX-2 is purpose-built for data scientists pushing the outer limits of deep learning research and computing.
According to NVIDIA, the DGX-2 can train FAIRSeq, a state-of-the-art neural machine translation model, in less than two days — a 10x improvement in performance from the DGX-1 with Volta, introduced in September. NVIDIA’s new DGX-2 system reached the two petaflop milestone by drawing from a wide range of industry-leading technology advances developed by NVIDIA at all levels of the computing stack.
DGX-2 is the first system to debut NVSwitch, which enables all 16 GPUs in the system to share a unified memory space. Developers now have the deep learning training power to tackle the largest datasets and most complex deep learning models.
NVSwitch: A Revolutionary Interconnect Fabric
NVSwitch offers 5x higher bandwidth than the best PCIe switch, allowing developers to build systems with more GPUs hyperconnected to each other. It will help developers break through previous system limitations and run much larger datasets. It also opens the door to larger, more complex workloads, including model-parallel training of neural networks.
NVSwitch extends the innovations made available through NVIDIA NVLink™, the first high-speed interconnect technology developed by NVIDIA. NVSwitch allows system designers to build even more advanced systems that can flexibly connect any topology of NVLink-based GPUs.
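To build intuition for why a switch-based fabric matters, here is a minimal, purely illustrative sketch (not NVIDIA’s implementation, and the topology model is a simplification): it compares the worst-case hop count between GPU pairs on a bidirectional ring against a non-blocking crossbar switch, where every GPU pair has a direct path, as NVSwitch conceptually provides.

```python
# Illustrative sketch: worst-case hop counts for two interconnect topologies.
# This is a toy model for intuition only, not a description of NVSwitch internals.

def ring_hops(n_gpus: int, a: int, b: int) -> int:
    """Hops between GPUs a and b on a bidirectional ring of n_gpus devices."""
    d = abs(a - b)
    return min(d, n_gpus - d)  # traffic can go either way around the ring

def crossbar_hops(a: int, b: int) -> int:
    """Through a non-blocking crossbar switch, any distinct pair is one hop."""
    return 0 if a == b else 1

N = 16  # GPU count in the DGX-2
worst_ring = max(ring_hops(N, a, b) for a in range(N) for b in range(N))
worst_xbar = max(crossbar_hops(a, b) for a in range(N) for b in range(N))
print(worst_ring, worst_xbar)  # → 8 1
```

In this toy model, the worst ring path between 16 GPUs is 8 hops, while the crossbar keeps every pair one hop apart regardless of GPU count, which is one way to picture how a switch fabric lets designers “flexibly connect any topology” of GPUs.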
Advanced GPU-Accelerated Deep Learning and HPC Software Stack
The updates to NVIDIA’s deep learning and HPC software stack are available at no charge to its developer community, which now totals more than 820,000 registered users, compared with about 480,000 a year ago.
Among its updates are new versions of NVIDIA CUDA, TensorRT, NCCL and cuDNN, and a new Isaac software developer kit for robotics. Additionally, through close collaboration with leading cloud service providers, every major deep learning framework is continually optimized to take full advantage of NVIDIA’s GPU computing platform.
DGX-2, with 16 GPUs, is the top of the lineup. It joins the NVIDIA DGX-1 system, which features eight Tesla V100 GPUs, and DGX Station™, the world’s first personal deep learning supercomputer, with four Tesla V100 GPUs in a compact, deskside design. These systems enable data scientists to scale their work from the complex experiments they run at their desks to the largest deep learning problems, allowing them to do their life’s work.
Tesla V100 Gets Double the Memory
The Tesla V100 GPU, widely adopted by the world’s leading researchers, has received a 2x memory boost to handle the most memory-intensive deep learning and high performance computing workloads.
Now equipped with 32GB of memory, Tesla V100 GPUs will help data scientists train deeper and larger deep learning models that are more accurate than ever. They can also improve the performance of memory-constrained HPC applications by up to 50 percent compared with the previous 16GB version.
The Tesla V100 32GB GPU is immediately available across the complete NVIDIA DGX system portfolio.