Lustre & Kerberos: In Theory and In Practice


“Kerberos is the most famous way to allow safe communications over a non-secure network, by providing authentication and encryption. The plain old Lustre manual has a section about Kerberized Lustre setup, and in the past, presentations have demonstrated Lustre deployments on Kerberized production clusters. But in 2013, others reported a failed attempt to kerberize Lustre 2.4 exchanges. Kerberos support has even been removed from the official, up-to-date Lustre Operations Manual. However, our work on Lustre 2.7 has improved Lustre’s Kerberos support. This presentation will show how far Lustre can go in Kerberos security, and what kind of authentication and encryption we can get to work. We will also examine the impact of various Kerberos flavors on performance.”

Video: Accelerated Quantum Chemistry with CP2K


“Learn how we achieve great GPU performance with an auto-tuned sparse matrix multiplication library, enabling quantum simulation of millions of electrons.”

Video: High Performance Computing with Python


“This talk will discuss various strategies to make a serial Python code faster, for example using libraries like NumPy, or tools like Cython which compile Python code. The talk will also discuss the available tools for running Python in parallel, focusing on the mpi4py module which implements MPI (Message Passing Interface) in Python.”
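As a minimal sketch of the vectorization strategy the talk mentions (the function names and the root-mean-square example are illustrative, not taken from the talk), here is the same serial computation written as a pure-Python loop and as a NumPy expression whose inner loop runs in compiled code:

```python
import numpy as np

def rms_loop(values):
    """Root-mean-square via a pure-Python loop: one interpreted iteration per element."""
    total = 0.0
    for v in values:
        total += v * v
    return (total / len(values)) ** 0.5

def rms_numpy(values):
    """The same computation vectorized with NumPy: the loop runs inside compiled C code."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.sqrt(np.mean(arr * arr)))

data = list(range(1, 1001))
# Both versions agree; on large inputs the NumPy version is typically
# one to two orders of magnitude faster.
assert abs(rms_loop(data) - rms_numpy(data)) < 1e-9
```

The same pattern extends to the parallel case: with mpi4py, each rank would apply `rms_numpy` to its slice of the data and the partial sums would be combined with an MPI reduction.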

Comparing OpenACC and OpenMP Performance and Programmability


“OpenACC and OpenMP provide programmers with two good options for portable, high-level parallel programming for GPUs. This talk will discuss similarities and differences between the two specifications in terms of programmability, portability, and performance.”

UPC and OpenSHMEM PGAS Models on GPU Clusters

DK Panda, Ohio State University

“Learn about extensions that enable efficient use of Partitioned Global Address Space (PGAS) Models like OpenSHMEM and UPC on supercomputing clusters with NVIDIA GPUs. PGAS models are gaining attention for providing shared memory abstractions that make it easy to develop applications with dynamic and irregular communication patterns. However, the existing UPC and OpenSHMEM standards do not allow communication calls to be made directly on GPU device memory. This talk discusses simple extensions to the OpenSHMEM and UPC models to address this issue.”

Deep Learning at Scale


“We present a state-of-the-art image recognition system, Deep Image, developed using end-to-end deep learning. The key components are a custom-built supercomputer dedicated to deep learning, a highly optimized parallel algorithm using new strategies for data partitioning and communication, larger deep neural network models, novel data augmentation approaches, and usage of multi-scale high-resolution images.”

Video: A Bioinformatics Pipeline for Analyzing Patient Tumours


In this video from WestGrid in Canada, Dr. Yussanne Ma from the Michael Smith Genome Sciences Centre describes how high performance computing supports her research group’s work, highlighting a recent project in which a bioinformatics pipeline was built for the Personalized OncoGenomics (POG) project at the BC Cancer Agency.

Video: Accelerating OpenPOWER Using NVM Express SSDs and CAPI


“We present results for a platform consisting of an NVM Express SSD, a CAPI accelerator card and a software stack running on a Power8 system. We show how the threading of the Power8 CPU can be used to move data from the SSD to the CAPI card at very high speeds and implement accelerator functions inside the CAPI card that can process the data at these speeds.”

Video: Enabling OpenACC Performance Analysis


Learn how OpenACC runtimes also expose performance-related information, revealing where your OpenACC applications are wasting clock cycles. The talk shows how profilers can connect to OpenACC applications to record how much time is spent in OpenACC regions and what device activity those regions generate.

Achieving Near-Native GPU Performance in the Cloud

John Paul Walters

“In this session we describe how GPUs can be used within virtual environments with near-native performance. We begin by showing GPU performance across four hypervisors: VMware ESXi, KVM, Xen, and LXC. After showing the performance characteristics of each platform, we extend the results to the multi-node case, with nodes interconnected by QDR InfiniBand. We demonstrate multi-node GPU performance using GPUDirect-enabled MPI, achieving efficiencies of 97-99% of a non-virtualized system.”