In this video from the 2016 Blue Waters Symposium, Carl Pearson and Simon Garcia De Gonzalo from the University of Illinois present: GPU Performance Nuggets. “In this talk, we introduce a pair of Nvidia performance tools available on Blue Waters. We discuss what the GPU memory hierarchy provides for your application. We then present a case study that explores if memory hierarchy optimization can go too far.”
This week Nvidia CEO Jen-Hsun Huang hand-delivered one of the company’s new DGX-1 Machine Learning supercomputers to the OpenAI non-profit in San Francisco. “The DGX-1 is a huge advance,” OpenAI Research Scientist Ilya Sutskever said. “It will allow us to explore problems that were completely unexplored before, and it will allow us to achieve levels of performance that weren’t achievable.”
Altair’s new Data Center GPU Management Tool is now available to Nvidia HPC customers. With the wide adoption of Graphics Processing Units, customers addressing vital work in fields including artificial intelligence, deep learning, self-driving cars, and virtual reality now have the ability to improve the speed and reliability of their computations through a new technology collaboration with Altair to integrate PBS Professional.
Deep learning solutions are typically part of a broader high performance analytics function in for-profit enterprises, with a requirement to deliver a fusion of business and data requirements. In addition to supporting large-scale deployments, industrial solutions typically require portability, support for a range of development environments, and ease of use.
“Few fields are moving faster right now than deep learning,” writes Buck. “Today’s neural networks are 6x deeper and more powerful than just a few years ago. There are new techniques in multi-GPU scaling that offer even faster training performance. In addition, our architecture and software have improved neural network training time by over 10x in a year by moving from Kepler to Maxwell to today’s latest Pascal-based systems, like the DGX-1 with eight Tesla P100 GPUs. So it’s understandable that newcomers to the field may not be aware of all the developments that have been taking place in both hardware and software.”
Nvidia is expanding its popular GPU Technology Conference to eight cities worldwide. “We’re broadening the reach of GTC with a series of conferences in eight cities across four continents, bringing the latest industry trends to major technology centers around the globe. Beijing, Taipei, Amsterdam, Melbourne, Tokyo, Seoul, Washington, and Mumbai will all host GTCs. Each will showcase technology from NVIDIA and our partners across the fields of deep learning, autonomous driving and virtual reality. Several events in the series will also feature keynote presentations by NVIDIA CEO and co-founder Jen-Hsun Huang.”
The flagship supercomputer at the Swiss National Supercomputing Centre (CSCS), Piz Daint, named after a mountain in the Alps, currently delivers 7.8 petaflops of compute performance, or 7.8 quadrillion mathematical calculations per second. A recently announced upgrade will double its peak performance, thanks to a refresh using the latest Intel Xeon CPUs and 4,500 Nvidia Tesla P100 GPUs.
The recent introduction of new high-end processors from Intel, combined with accelerator technologies such as NVIDIA Tesla GPUs and Intel Xeon Phi, provides the raw ‘industry standard’ materials to cobble together a test platform suitable for small research projects and development. When combined with open source toolkits, some meaningful results can be achieved, but wide-scale enterprise deployment in production environments raises the infrastructure, software and support requirements to a completely different level.
Today the Green500 released its listing of the world’s most energy-efficient supercomputers. “Japan’s research institution RIKEN once again captured the top spot with its Shoubu supercomputer. With a rating of 6673.84 MFLOPS/Watt, Shoubu edged out another RIKEN system, Satsuki, the number 2 system that delivered 6195.22 MFLOPS/Watt. Both are ‘ZettaScaler’ supercomputers, employing Intel Xeon processors and PEZY-SCnp manycore accelerators.”