The flagship supercomputer at the Swiss National Supercomputing Centre (CSCS), Piz Daint, named after a mountain in the Alps, currently delivers 7.8 petaflops of compute performance, or 7.8 quadrillion mathematical calculations per second. A recently announced upgrade will double its peak performance, thanks to a refresh using the latest Intel Xeon CPUs and 4,500 Nvidia Tesla P100 GPUs.
“Fujitsu Laboratories has newly developed parallelization technology to efficiently share data between machines, and applied it to Caffe, an open source deep learning framework widely used around the world. Fujitsu Laboratories evaluated the technology on AlexNet, where it was confirmed to have achieved learning speeds with 16 and 64 GPUs that are 14.7 and 27 times faster, respectively, than a single GPU. These are the world’s fastest processing speeds(2), representing an improvement in learning speeds of 46% for 16 GPUs and 71% for 64 GPUs.”
Advancements in video technology have steadily pushed applications like video editing, video rendering, and video storage into the High Performance Computing world. There are many different video editing programs that can cut, trim, re-sequence, and add sound, transitions, and special effects to video. But with the introduction of 4K/8K video, a simple laptop is no longer powerful enough on its own, especially for online editing.
AMD’s motivation for developing these open-source GPU tools is to remove the added complexity that proprietary programming frameworks bring to GPU application development. “If successful, these tools – or similar versions – could help to democratize GPU application development, removing the need for proprietary frameworks, which then makes the HPC accelerator market much more competitive for smaller players. For example, HPC users could potentially use these tools to convert CUDA code into C++ and then run it on an Intel Xeon Phi co-processor.”
Steve Oberlin from Nvidia presented this talk at The Digital Future conference. “Oberlin will discuss machine learning and neural networks, explore a few advanced applications based on deep learning algorithms, discuss the foundation and architecture of representative algorithms, and illustrate the pivotal role GPU acceleration is playing in this exciting and rapidly expanding field.”
In this special guest feature, Rob Farber writes that a study by the Kyoto University Graduate School of Medicine shows that code modernization can help Intel Xeon processors outperform GPUs on machine learning code. “The Kyoto results demonstrate that modern multicore processing technology now matches or exceeds GPU machine-learning performance, but equivalently optimized software is required to perform a fair benchmark comparison. For historical reasons, many software packages like Theano lacked optimized multicore code, as all the open source effort had been put into optimizing the GPU code paths.”
In this video from ISC 2016, Greg Schmidt from Hewlett Packard Enterprise describes the new Apollo 6500 server. With up to eight high performance NVIDIA GPUs designed for maximum transfer bandwidth, the HPE Apollo 6500 is purpose-built for HPC and deep learning applications. Its high ratio of GPUs to CPUs, dense 4U form factor and efficient design enable organizations to run deep learning recommendation algorithms faster and more efficiently, significantly reducing model training time and accelerating the delivery of real-time results, all while controlling costs.
Today One Stop Systems (OSS) announced that it has completed a merger with Magma, with OSS as the surviving entity. Both companies are market leaders in PCIe expansion technology used to create high-end compute accelerators and flash storage arrays. Together they become a dominant technology leader in PCIe expansion appliances.