“The Project Olympus hyperscale GPU accelerator chassis for AI, also referred to as HGX-1, is designed to support eight of the latest “Pascal” generation NVIDIA GPUs and NVIDIA’s NVLink high speed multi-GPU interconnect technology, and provides high bandwidth interconnectivity for up to 32 GPUs by connecting four HGX-1 together. The HGX-1 AI accelerator provides extreme performance scalability to meet the demanding requirements of fast growing machine learning workloads, and its unique design allows it to be easily adopted into existing datacenters around the world.”
In this podcast, the Radio Free HPC team looks at a set of IT and Science stories. Microsoft Azure is making a big move to GPUs and the OCP Platform as part of their Project Olympus. Meanwhile, Huawei is gaining market share in the server market and IBM is bringing storage to the atomic level.
Today, Microsoft, NVIDIA, and Ingrasys announced a new industry standard design to accelerate Artificial Intelligence in the next generation cloud. “Powered by eight NVIDIA Tesla P100 GPUs in each chassis, HGX-1 features an innovative switching design based on NVIDIA NVLink interconnect technology and the PCIe standard, enabling a CPU to dynamically connect to any number of GPUs. This allows cloud service providers that standardize on the HGX-1 infrastructure to offer customers a range of CPU and GPU machine instance configurations.”
“Available on GitHub as Open Source, the Batch Shipyard toolkit enables easy deployment of batch-style Dockerized workloads to Azure Batch compute pools. Azure Batch enables you to run parallel jobs in the cloud without having to manage the infrastructure. It’s ideal for parametric sweeps, Deep Learning training with NVIDIA GPUs, and simulations using MPI and InfiniBand.”
“Run your Windows and Linux HPC applications using high performance A8 and A9 compute instances on Azure, and take advantage of a backend network with MPI latency under 3 microseconds and non-blocking 32 Gbps throughput. This backend network includes remote direct memory access (RDMA) technology on Windows and Linux that enables parallel applications to scale to thousands of cores. Azure provides you with high memory and HPC-class CPUs to help you get results fast. Scale up and down based upon what you need and pay only for what you use to reduce costs.”
Today the PASC17 Conference announced that Matthias Troyer from Microsoft Research will give this year’s public lecture on the topic “Towards Quantum High Performance Computing.” The event will take place June 26-28 in Lugano, Switzerland.
Today Cray announced the results of a deep learning collaboration with Microsoft and CSCS designed to expand the horizons of running deep learning algorithms at scale using the power of Cray supercomputers. “Cray’s proficiency in performance analysis and profiling, combined with the unique architecture of the XC systems, allowed us to bring deep learning problems to our Piz Daint system and scale them in a way that nobody else has,” said Prof. Dr. Thomas C. Schulthess, director of the Swiss National Supercomputing Centre (CSCS). “What is most exciting is that our researchers and scientists will now be able to use our existing Cray XC supercomputer to take on a new class of deep learning problems that were previously infeasible.”
Today Microsoft released an updated version of Microsoft Cognitive Toolkit, a system for deep learning that is used to speed advances in areas such as speech and image recognition and search relevance on CPUs and NVIDIA GPUs. “We’ve taken it from a research tool to something that works in a production setting,” said Frank Seide, a principal researcher at Microsoft Artificial Intelligence and Research and a key architect of Microsoft Cognitive Toolkit.
In this video from the Microsoft Ignite Conference, Tejas Karmarkar describes how to run your HPC Simulations on Microsoft Azure – with UberCloud container technology. “High performance computing applications are some of the most challenging to run in the cloud due to requirements that can include fast processors, low-latency networking, parallel file systems, GPUs, and Linux. We show you how to run these engineering, research and scientific workloads in Microsoft Azure with performance equivalent to on-premises. We use customer case studies to illustrate the basic architecture and alternatives to help you get started with HPC in Azure.”
“We are still in the first minutes of the first day of the Intelligence revolution. In this keynote, Dr. Joseph Sirosh will present 5 solutions (and their implementations) that the intelligent cloud delivers. Sirosh shares five cloud AI patterns that his team presented at the Summit. These five patterns are really about ways to bring data and learning together in cloud services, to infuse intelligence.”