Scaling Deep Learning for Scientific Workloads on the #1 Summit Supercomputer

Jack Wells from ORNL gave this talk at the GPU Technology Conference. “HPC centers have traditionally been configured for simulation workloads, but deep learning is increasingly being applied alongside simulation on scientific datasets. These frameworks do not always fit well with job schedulers, large parallel file systems, and MPI backends. We’ll share benchmarks comparing natively compiled frameworks versus containers on Power systems like Summit, as well as best practices for deploying deep learning frameworks and models on HPC resources for scientific workflows.”
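For readers experimenting with MPI-backed frameworks on such systems, the usual first step is binding each MPI rank to its own GPU. The sketch below shows that pattern in CUDA C++; it is an illustration under assumed tooling (an MPI compiler wrapper plus nvcc), not code from the talk.

// Minimal sketch: one-GPU-per-rank binding for MPI-backed workloads.
// Build (assumption): nvcc -ccbin mpicxx bind.cu -o bind
#include <mpi.h>
#include <cuda_runtime.h>
#include <cstdio>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank = 0, nranks = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    // Map each rank onto a device; a Summit node exposes six V100s,
    // so six ranks per node pair one-to-one with GPUs.
    int ndevices = 0;
    cudaGetDeviceCount(&ndevices);
    cudaSetDevice(rank % ndevices);

    printf("rank %d of %d bound to GPU %d\n", rank, nranks, rank % ndevices);

    MPI_Finalize();
    return 0;
}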

NVIDIA Releases CUDA 9.2 for GPU Code Acceleration

Today NVIDIA released CUDA 9.2, which includes updates to libraries, a new library for accelerating custom linear-algebra algorithms, and lower kernel launch latency. “CUDA 9 is the most powerful software platform for GPU-accelerated applications. It has been built for Volta GPUs and includes faster GPU-accelerated libraries, a new programming model for flexible thread management, and improvements to the compiler and developer tools. With CUDA 9 you can speed up your applications while making them more scalable and robust.”
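Launch latency is easy to probe empirically. The following sketch times repeated launches of an empty kernel with CUDA events; the kernel and iteration count are illustrative placeholders, not part of the release.

// Minimal sketch: averaging kernel launch overhead with CUDA events.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void empty_kernel() {}

int main() {
    const int iters = 10000;

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm up so one-time initialization cost is excluded.
    empty_kernel<<<1, 1>>>();
    cudaDeviceSynchronize();

    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i)
        empty_kernel<<<1, 1>>>();
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("average time per launch: %.3f us\n", 1000.0f * ms / iters);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}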

Gauss Centre in Germany Allocates 1 Billion Computing Core Hours for Science

“With the 19th Call for Large-Scale Projects, the GCS steering committee granted a total of more than 1 billion core hours to 17 ambitious research projects. The research teams represent a wide range of scientific disciplines, including astrophysics, atomic and nuclear physics, biology, condensed matter physics, elementary particle physics, meteorology, and scientific engineering, among others.”

New AI Performance Milestones with NVIDIA Volta GPU Tensor Cores

Over at the NVIDIA blog, Loyd Case shares recent advances that deliver dramatic GPU performance gains for the AI community. “We have achieved record-setting ResNet-50 performance for a single chip and single server with these improvements. Recently, fast.ai also announced their record-setting performance on a single cloud instance. A single V100 Tensor Core GPU achieves 1,075 images/second when training ResNet-50, a 4x performance increase compared to the previous generation Pascal GPU.”
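Applications reach Tensor Cores mostly through libraries such as cuBLAS. As a sketch of the CUDA 9-era API (the matrix size is arbitrary and inputs are zeroed for brevity), a mixed-precision GEMM can be opted into Tensor Core math like this:

// Minimal sketch: FP16-in, FP32-accumulate GEMM on Volta Tensor Cores.
// Build (assumption): nvcc gemm.cu -lcublas -o gemm
#include <cublas_v2.h>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

int main() {
    const int n = 1024;            // illustrative size, a multiple of 8
    half  *A, *B;                  // FP16 inputs
    float *C;                      // FP32 output
    cudaMalloc(&A, n * n * sizeof(half));
    cudaMalloc(&B, n * n * sizeof(half));
    cudaMalloc(&C, n * n * sizeof(float));
    cudaMemset(A, 0, n * n * sizeof(half));
    cudaMemset(B, 0, n * n * sizeof(half));

    cublasHandle_t handle;
    cublasCreate(&handle);
    // Allow cuBLAS to dispatch eligible GEMMs to Tensor Cores.
    cublasSetMathMode(handle, CUBLAS_TENSOR_OP_MATH);

    const float alpha = 1.0f, beta = 0.0f;
    // C = alpha * A * B + beta * C, accumulated in FP32.
    cublasGemmEx(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                 &alpha, A, CUDA_R_16F, n, B, CUDA_R_16F, n,
                 &beta,  C, CUDA_R_32F, n,
                 CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP);
    cudaDeviceSynchronize();

    cublasDestroy(handle);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}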

Iridis 5 Supercomputer to Simplify Use of HPC

In this special guest feature from Scientific Computing World, Robert Roe talks to Southampton University’s Oz Parchment about the decision-making behind installing the latest HPC system at the University. “Iridis 4 was based on Sandy Bridge and the current one is based on Skylake, so you can see we have jumped four generations of development. Four years is a long time in HPC performance,” stated Oz Parchment, director of i-solutions at the University of Southampton.

Video: Liqid Teams with Inspur at GTC for Composable Infrastructure

In this video from GTC 2018, Dolly Wu from Inspur and Marius Tudor from Liqid describe how the two companies are collaborating on composable infrastructure for AI and deep learning workloads. “AI and deep learning applications will determine the direction of next-generation infrastructure design, and we believe dynamically composing GPUs will be central to these emerging platforms,” said Dolly Wu, GM and VP of Inspur Systems.

NVIDIA Rolls Out GV100 “Dual-Volta” GPU for Workstations

Today NVIDIA announced the Quadro GV100 GPU. With innovative packaging, two Quadro GV100 cards can be combined into a single dual-Volta configuration, linked with NVIDIA’s new NVLink 2 interconnect. “The new AI-dedicated Tensor Cores have dramatically increased the performance of our models, and the speedier NVLink allows us to efficiently scale multi-GPU simulations.”
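On the CUDA side, two NVLink-bridged GPUs appear as peers, which lets data move card-to-card without staging through host memory. The sketch below assumes exactly two visible devices with a peer path; it is illustrative, not from the announcement.

// Minimal sketch: enabling peer access and copying GPU 0 -> GPU 1.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);
    cudaDeviceCanAccessPeer(&can10, 1, 0);
    if (!can01 || !can10) {
        printf("no peer path between GPU 0 and GPU 1\n");
        return 1;
    }

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);  // second argument (flags) must be 0
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    const size_t bytes = 1 << 20;
    void *src = NULL, *dst = NULL;
    cudaSetDevice(0); cudaMalloc(&src, bytes);
    cudaSetDevice(1); cudaMalloc(&dst, bytes);

    // With peer access enabled, this copy travels directly over NVLink.
    cudaMemcpyPeer(dst, 1, src, 0, bytes);
    cudaDeviceSynchronize();

    printf("copied %zu bytes GPU 0 -> GPU 1\n", bytes);
    cudaFree(dst);
    cudaSetDevice(0); cudaFree(src);
    return 0;
}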

TUK in Germany Installs NEC LX Supercomputer with Intel Omni-Path

NEC Deutschland GmbH has delivered an LX series supercomputer to Technische Universität Kaiserslautern (TUK), one of Germany’s leading universities of technology. “The new HPC cluster consists of 324 compute nodes totaling nearly 7,800 cores of the latest-generation Intel Skylake CPUs, and comprises a highly optimized Intel Omni-Path interconnect architecture for low-latency, high-bandwidth communication. Additional GPGPU compute nodes equipped with the latest NVIDIA Volta V100 GPUs bring the total peak performance of the HPC cluster to approximately 700 teraflops.”

Video: The Sierra Supercomputer – Science and Technology on a Mission

Adam Bertsch from LLNL gave this talk at the Stanford HPC Conference. “Our next flagship HPC system at LLNL will be called Sierra. A collaboration between multiple government and industry partners, Sierra and its sister system Summit at ORNL will pave the way towards exascale computing architectures and predictive capability.”

Inside SATURNV – Insights from NVIDIA’s Deep Learning Supercomputer

Phil Rogers from NVIDIA gave this talk at SC17. “In this talk, we describe the architecture of SATURNV and how we use it every day at NVIDIA to run our deep learning workloads for both production and research use cases. We explore how the NVIDIA GPU Cloud software is used to manage and schedule work on SATURNV, and how it gives us the agility to rapidly respond to business-critical projects. We also present some of the results of our research in operating this unique GPU-accelerated data center.”