Oakforest-PACS: Overview of the Fastest Supercomputer in Japan

Prof. Taisuke Boku from the University of Tsukuba & JCAHPC presented this talk at the DDN User Group at SC16. “Thanks to DDN’s IME Burst Buffer, researchers using Oakforest-PACS at the Joint Center for Advanced High Performance Computing (JCAHPC) are able to improve modeling of fundamental physical systems and advance understanding of requirements for Exascale-level systems architectures. With DDN’s advanced technology, JCAHPC has achieved effective I/O performance exceeding 1 TB/s with tens of thousands of processes writing to the same file.”
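
As a rough illustration of that N-processes-to-one-file write pattern (not JCAHPC’s actual code), here is a minimal sketch using collective MPI-IO writes via mpi4py; the output path and per-rank block size are assumptions:

```python
# Minimal sketch of the shared-file write pattern cited above.
# mpi4py, the file name, and the block size are assumptions.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank owns one contiguous block of the shared file.
block = np.full(1 << 20, rank, dtype=np.float64)   # ~8 MB per rank
offset = rank * block.nbytes

fh = MPI.File.Open(comm, "shared_output.dat",
                   MPI.MODE_WRONLY | MPI.MODE_CREATE)
fh.Write_at_all(offset, block)   # collective write: all ranks, one file
fh.Close()
```

Launched with, e.g., `mpiexec -n 8 python write_shared.py`; on a burst-buffer-backed file system the same pattern is what scales to the tens of thousands of writers described above.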

Altair Moves Forward with Open Source PBS Pro and PBS Cloud at SC16

In this video from SC16, Bill Nitzberg provides an update on the open source release of PBS Professional workload management software. After that, Jeremie Bourdoncle announces the new Altair PBS Cloud. “Altair is excited to announce the upcoming availability of Altair PBS Cloud, its latest appliance solution to further cloud computing for organizations. Altair PBS Cloud is the solution to build and run high-performance computing (HPC) appliances on public clouds, private clouds, and bare-metal infrastructure. Altair will release Altair PBS Cloud in the first quarter of 2017, following the conclusion of a private preview.”
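
For readers new to PBS Professional, the sketch below shows what job submission looks like; the resource selection, walltime, and application name are illustrative assumptions, not an Altair-provided example:

```python
# Hedged sketch: submitting a batch job to PBS Professional from Python.
# The directives and the application are placeholders for illustration.
import subprocess

job_script = """#!/bin/bash
#PBS -N hello_pbs
#PBS -l select=2:ncpus=16:mpiprocs=16
#PBS -l walltime=00:10:00
#PBS -j oe

cd $PBS_O_WORKDIR
mpiexec ./my_app        # hypothetical MPI application
"""

# qsub reads a job script from stdin and prints the new job's ID.
result = subprocess.run(["qsub"], input=job_script,
                        capture_output=True, text=True, check=True)
print("Submitted:", result.stdout.strip())
```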

GPUs & Deep Learning in the Spotlight for Nvidia at SC16

In this video from SC16, Roy Kim from Nvidia describes how the company is ushering in a new age of AI with accelerated computing for deep learning applications. “Deep learning is the fastest-growing field in artificial intelligence, helping computers make sense of infinite amounts of data in the form of images, sound, and text. Using multiple levels of neural networks, computers now have the capacity to see, learn, and react to complex situations as well as or better than humans. This is leading to a profoundly different way of thinking about your data, your technology, and the products and services you deliver.”
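
The “multiple levels” idea can be made concrete with a toy sketch: each layer transforms the previous layer’s output, so stacked layers build progressively higher-level features. All sizes and the random weights below are arbitrary placeholders:

```python
# Toy sketch of stacked neural-network layers (untrained, for shape only).
import numpy as np

rng = np.random.default_rng(0)

def layer(x, n_out):
    """One level: a linear map followed by a ReLU nonlinearity."""
    w = rng.standard_normal((x.shape[-1], n_out)) * 0.1
    return np.maximum(x @ w, 0.0)

x = rng.standard_normal((4, 784))   # e.g. a batch of 4 flattened images
h1 = layer(x, 256)                  # first level: low-level features
h2 = layer(h1, 64)                  # second level: higher-level features
logits = h2 @ (rng.standard_normal((64, 10)) * 0.1)   # 10 class scores
print(logits.shape)                 # (4, 10)
```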

How Researchers Will Benefit from Canada’s National Data Cyberinfrastructure

“Individual institutions or organizations will have opportunities to deploy storage locally and can federate their local repository into the national system,” says Dr. Greg Newby, Compute Canada’s Chief Technology Officer. “This provides enhanced privacy and sharing capabilities on a robust, country-wide solution with improved data security and back-up. This is a great solution to address the data explosion we are currently experiencing in Canada and globally.”

HPE Apollo 6500 for Deep Learning

“With up to eight high-performance NVIDIA GPUs designed for maximum transfer bandwidth, the HPE Apollo 6500 is purpose-built for HPC and deep learning applications. Its high ratio of GPUs to CPUs, dense 4U form factor and efficient design enable organizations to run deep learning recommendation algorithms faster and more efficiently, significantly reducing model training time and accelerating the delivery of real-time results, all while controlling costs.”
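
As a small, hedged sketch of working with such a GPU-dense node, the following enumerates the installed devices; PyCUDA is an assumption here, and any CUDA-capable toolchain would serve:

```python
# Hedged sketch: enumerating the GPUs in a dense multi-GPU node.
import pycuda.driver as drv

drv.init()
for i in range(drv.Device.count()):
    dev = drv.Device(i)
    print(f"GPU {i}: {dev.name()}, "
          f"{dev.total_memory() // (1024 ** 2)} MiB")
```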

Radio Free HPC Reviews the SC16 Student Cluster Competition Configurations & Results

In this podcast, the Radio Free HPC team reviews the results from the SC16 Student Cluster Competition. “This year, the advent of clusters with the new Nvidia Tesla P100 GPUs made a huge impact, nearly tripling the Linpack record for the competition. For the first time ever, the team that won top honors also won the award for achieving the highest performance on the Linpack benchmark application. The team “SwanGeese” is from the University of Science and Technology of China. In traditional Chinese culture, the rare Swan Goose stands for teamwork, perseverance and bravery.”

Kx Streaming Analytics Crunches 1.2 Billion NYC Taxi Data Points using Intel Xeon Phi

“The complexity and high costs of architecting and maintaining streaming analytics solutions often make it difficult to get new projects off the ground. That’s part of the reason Kx, a leading provider of high-volume, high-performance databases and real-time analytics solutions, is always interested in exploring how new technologies may help it push streaming analytics performance and efficiency boundaries. The Intel Xeon Phi processor is a case in point. At SC16 in Salt Lake City, Kx used a 1.2 billion record database of New York City taxi cab ride data to demonstrate what the Intel Xeon Phi processor could mean for distributed big data processing. And the potential cost/performance implications were quite promising.”
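
The demo itself ran on Kx’s kdb+/q; purely as an illustration of the kind of per-hour aggregation such a dataset invites, here is a sketch in Python with an invented file path and column names:

```python
# Rough illustration only: the SC16 demo used kdb+/q, not pandas.
# The file path and column names are invented for this sketch.
import pandas as pd

trips = pd.read_csv("nyc_taxi_trips.csv",
                    parse_dates=["pickup_datetime"])

# A typical query over such data: ride counts and mean fare per hour of day.
per_hour = (trips
            .groupby(trips["pickup_datetime"].dt.hour)
            .agg(rides=("fare_amount", "size"),
                 mean_fare=("fare_amount", "mean")))
print(per_hour)
```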

Lenovo Boosts Marconi Supercomputer to 6.2 Petaflops with Intel Xeon Phi

“Phase one at CINECA, an academic consortium, was completed in May 2016, coming in at 1.7 Petaflops, which at the time made it the largest Intel Omni-Path fabric system in the world. Lenovo and CINECA are pleased to announce the delivery and installation of phase two: a 3,600-node Intel Xeon Phi processor-based system interconnected with 100 Gb Intel Omni-Path fabric, delivering 6.2 Petaflops of performance.”

Intel Xeon Phi with Software Defined Visualization at SC16

“Software Defined Visualization (SDVis) is an open source initiative from Intel and industry collaborators to improve the visual fidelity, performance and efficiency of prominent visualization solutions – with a particular emphasis on supporting the rapidly growing “Big Data” usage from workstations through HPC supercomputing clusters, without the memory limitations and cost of GPU-based solutions. Existing applications can be enhanced using the high-performing parallel software rendering libraries OpenSWR, Embree, and OSPRay. At the Intel HPC Developer Conference, Amstutz introduced the initiative and its benefits, briefly described its accomplishments over the past year, and discussed the changes made to the Intel-provided libraries over that period.”
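
As a hedged example of putting SDVis to work, a Mesa build that includes the OpenSWR Gallium driver can be selected through environment variables before launching a visualization tool; the application launched below is an assumption:

```python
# Hedged sketch: steering a visualization app onto Mesa's OpenSWR software
# rasterizer for CPU-only rendering. Assumes a Mesa build with the swr
# Gallium driver; the application name is an assumption.
import os
import subprocess

env = dict(os.environ,
           GALLIUM_DRIVER="swr",        # select the OpenSWR rasterizer
           LIBGL_ALWAYS_SOFTWARE="1")   # force the software GL path
subprocess.run(["paraview"], env=env, check=True)
```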

HIP and CAFFE Porting and Profiling with AMD’s ROCm

In this video from SC16, Ben Sander from AMD presents: HIP and CAFFE Porting and Profiling with AMD’s ROCm. “We are excited to present ROCm, the first open-source HPC/Hyperscale-class platform for GPU computing that’s also programming-language independent. We are bringing the UNIX philosophy of choice, minimalism and modular software development to GPU computing. The new ROCm foundation lets you choose or even develop tools and a language runtime for your application. ROCm is built for scale; it supports multi-GPU computing within and across server nodes, with communication through RDMA.”