Agenda Posted: Exacomm 2019 Workshop at ISC High Performance

“The goal of this workshop is to bring together researchers and software/hardware designers from academia, industry and national laboratories who are involved in creating network-based computing solutions for extreme scale architectures. The objectives of this workshop will be to share the experiences of the members of this community and to learn the opportunities and challenges in the design trends for exascale communication architectures.”

HPC Breaks Through to the Cloud: Why It Matters

In this special guest feature, Scot Schultz from Mellanox writes researchers are benefitting in a big way from HPC in the Cloud. “HPC has many different advantages depending on the specific use case, but one aspect that these implementations have in common is their use of RDMA-based fabrics to improve compute performance and reduce latency.”

Faster Fabrics Running Against Limits of the Operating System, the Processor, and the I/O Bus

Christopher Lameter from Jump Trading gave this talk at the OpenFabrics Workshop in Austin. “In 2017 we got 100G fabrics, in 2018 200G fabrics and in 2019 it looks like 400G technology may be seeing a considerable amount of adoption. These bandwidth compete with and sometimes are higher than the internal bus speeds of the servers that are connected using these fabrics. I think we need to consider these developments and work on improving fabrics and the associated APIs so that ways to access these features become possible using vendor neutral APIs. It needs to be possible to code in a portable way and not to a vendor specific one.”

Accelerating TensorFlow with RDMA for High-Performance Deep Learning

Xiaoyi Lu from Ohio State University gave this talk at the 2019 OpenFabrics Workshop in Austin. “Google’s TensorFlow is one of the most popular Deep Learning (DL) frameworks. We propose a unified way of achieving high performance through enhancing the gRPC runtime with Remote Direct Memory Access (RDMA) technology on InfiniBand and RoCE. Through our proposed RDMAgRPC design, TensorFlow only needs to run over the gRPC channel and gets the optimal performance.”

Mellanox HDR 200G InfiniBand Speeds Machine Learning with NVIDIA

Today Mellanox announced that its HDR 200G InfiniBand with the “Scalable Hierarchical Aggregation and Reduction Protocol” (SHARP) technology has set new performance records, doubling deep learning operations performance. The combination of Mellanox In-Network Computing SHARP with NVIDIA 100 Tensor Core GPU technology and Collective Communications Library (NCCL) deliver leading efficiency and scalability to deep learning and artificial intelligence applications.

Video: Why InfiniBand is the Way Forward for Ai and Exascale

In this video, Gilad Shainer from the InfiniBand Trade Association describes how InfiniBand offers the optimal interconnect technology for Ai, HPC, and Exascale. “Tthrough Ai, you need the biggest pipes in order to move those giant amount of data in order to create those Ai software algorithms. That’s one thing. Latency is important because you need to drive things faster. RDMA is one of the key technology that enables to increase the efficiency of moving data, reducing CPU overhead. And by the way, now, there’s all of the Ai frameworks that exist out there, supports RDMA as a default element within the framework itself.”

How to Design Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems

DK Panda from Ohio State University gave this talk at the Stanford HPC Conference. “This talk will focus on challenges in designing HPC, Deep Learning, and HPC Cloud middleware for Exascale systems with millions of processors and accelerators. For the HPC domain, we will discuss about the challenges in designing runtime environments for MPI+X (PGAS – OpenSHMEM/UPC/CAF/UPC++, OpenMP, and CUDA) programming models taking into account support for multi-core systems (Xeon, OpenPower, and ARM), high-performance networks, GPGPUs (including GPUDirect RDMA), and energy-awareness.”

The State of High-Performance Fabrics: A Chat with the OpenFabrics Alliance

In this special guest feature, Paul Grun and Doug Ledford from the OpenFabrics Alliance describe the industry trends in the fabrics space, its state of affairs and emerging applications. “Originally, ‘high-performance fabrics’ were associated with large, exotic HPC machines. But in the modern world, these fabrics, which are based on technologies designed to improve application efficiency, performance, and scalability, are becoming more and more common in the commercial sphere because of the increasing demands being placed on commercial systems.”

Agenda Posted for ExaComm 2018 Workshop in Frankfurt

The ExaComm 2018 workshop has posted their Speaker Agenda. Held in conjunction with ISC 2018, the Fourth International Workshop on Communication Architectures for HPC, Big Data, Deep Learning and Clouds at Extreme Scale takes place June 28 in Frankfurt. ” The goal of this workshop is to bring together researchers and software/hardware designers from academia, industry and national laboratories who are involved in creating network-based computing solutions for extreme scale architectures. The objectives of this workshop will be to share the experiences of the members of this community and to learn the opportunities and challenges in the design trends for exascale communication architectures.”

Improving Deep Learning scalability on HPE servers with NovuMind: GPU RDMA made easy

Bruno Monnet from HPE gave this talk at the NVIDIA GPU Technology Conference. “Deep Learning demands massive amounts of computational power. Those computation power usually involve heterogeneous computation resources, e.g., GPUs and InfiniBand as installed on HPE Apollo. NovuForce deep learning softwares within the docker image has been optimized for the latest technology like NVIDIA Pascal GPU and infiniband GPUDirect RDMA. This flexibility of the software, combined with the broad GPU servers in HPE portfolio, makes one of the most efficient and scalable solutions.”