Improving Deep Learning scalability on HPE servers with NovuMind: GPU RDMA made easy

Bruno Monnet from HPE gave this talk at the NVIDIA GPU Technology Conference. “Deep Learning demands massive amounts of computational power. That computational power usually involves heterogeneous computation resources, e.g., GPUs and InfiniBand as installed on HPE Apollo. The NovuForce deep learning software within the Docker image has been optimized for the latest technology, such as NVIDIA Pascal GPUs and InfiniBand GPUDirect RDMA. This flexibility of the software, combined with the broad range of GPU servers in the HPE portfolio, makes for one of the most efficient and scalable solutions.”

Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems

DK Panda from Ohio State University gave this talk at the Swiss HPC Conference. “This talk will focus on challenges in designing HPC, Deep Learning, and HPC Cloud middleware for Exascale systems with millions of processors and accelerators. For the HPC domain, we will discuss the challenges in designing runtime environments for MPI+X (PGAS – OpenSHMEM/UPC/CAF/UPC++, OpenMP, and CUDA) programming models. For the Deep Learning domain, we will focus on popular Deep Learning frameworks (Caffe, CNTK, and TensorFlow) to extract performance and scalability with MVAPICH2-GDR MPI library.”

Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and More

Erez Cohen from Mellanox gave this talk at the Swiss HPC Conference. “While InfiniBand, RDMA and GPU-Direct are an HPC mainstay, these advanced networking technologies are increasingly becoming a core differentiator to the data center. In fact, within just a few short years, where once only a handful of bleeding-edge industrial leaders emulated classic HPC disciplines, today almost every commercial market is usurping HPC technologies and disciplines en masse.”

The OpenFabrics Alliance 2018 Annual Workshop Recap

“The 14th Annual OpenFabrics Alliance (OFA) Workshop, held in scenic Boulder, Colorado, recently concluded its week-long, community-wide collaboration and dialogue on OpenFabrics. As the premier means of fostering lively discussions among those who develop fabrics, deploy fabrics, and create applications that rely on fabrics, the Workshop is the ideal venue for the OpenFabrics community and networking industry at large to identify and address the wide variety of emerging industry requirements and challenges that remain.”

HPE Deploys “Genius” Supercomputer at KU Leuven

Today HPE announced a new supercomputer installation at KU Leuven, a Flemish research university consistently ranked as one of the five most innovative universities in the world. HPE collaborated with KU Leuven to develop and deploy Genius, a new supercomputer built to run artificial intelligence (AI) workloads. The system will be available to both academia and industry to build applications that drive scientific breakthroughs, economic growth and innovation in Flanders, the northern region of Belgium.

E8 Storage steps up to HPC with InfiniBand Support

Today E8 Storage announced availability of InfiniBand support for its high-performance NVMe storage solutions. The move comes as a direct response to HPC customers that wish to take advantage of the high-speed, low-latency throughput of InfiniBand for their data-hungry applications. E8 Storage support for InfiniBand will be seamless for customers, who now have the flexibility to connect via Ethernet or InfiniBand when paired with Mellanox ConnectX InfiniBand/VPI adapters. “Today we demonstrate once again that E8 Storage’s architecture can expand, evolve and always extract the full potential of flash performance,” comments Zivan Ori, co-founder and CEO of E8 Storage. “Partnering with market leaders like Mellanox that deliver the very best network connectivity technology ensures we continue to meet and, frequently, exceed the needs of our HPC customers even in their most demanding environments.”

Ceph on the Brain: Storage and Data-Movement Supporting the Human Brain Project

Adrian Tate from Cray and Stig Telfer from StackHPC gave this talk at the 2018 Swiss HPC Conference. “This talk will describe how Cray, StackHPC and the HBP co-designed a next-generation storage system based on Ceph, exploiting complex memory hierarchies and enabling next-generation mixed workload execution. We will describe the challenges, show performance data and detail the ways that a similar storage setup may be used in HPC systems of the future.”

Scratch to Supercomputers: Bottoms-up Build of Large-scale Computational Lensing Software

Gilles Fourestey from EPFL gave this talk at the Swiss HPC Conference. “LENSTOOL is a gravitational lensing software that models mass distribution of galaxies and clusters. It is used to obtain sub-percent precision measurements of the total mass in galaxy clusters and constrain the dark matter self-interaction cross-section, a crucial ingredient to understanding its nature.”

Shifter – Docker Containers for HPC

Alberto Madonna gave this talk at the Swiss HPC Conference. “In this work we present an extension to the container runtime of Shifter that provides containerized applications with a mechanism to access GPU accelerators and specialized networking from the host system, effectively enabling performance portability of containers across HPC resources. The presented extension makes it possible to rapidly deploy high-performance software on supercomputers from containerized applications that have been developed, built, and tested in non-HPC commodity hardware, e.g. the laptop or workstation of a researcher.”

Exploiting HPC Technologies for Accelerating Big Data Processing and Associated Deep Learning

DK Panda from Ohio State University gave this talk at the Swiss HPC Conference. “This talk will provide an overview of challenges in accelerating Hadoop, Spark, and Memcached on modern HPC clusters. An overview of RDMA-based designs for Hadoop (HDFS, MapReduce, RPC and HBase), Spark, Memcached, Swift, and Kafka using native RDMA support for InfiniBand and RoCE will be presented. Enhanced designs for these components to exploit NVM-based in-memory technology and parallel file systems (such as Lustre) will also be presented.”