Call for Sessions: OpenFabrics Alliance Workshop in March

The OpenFabrics Alliance (OFA) has published a Call for Sessions for its 16th annual OFA Workshop. “The OFA Workshop 2020 Call for Sessions encourages industry experts and thought leaders to help shape this year’s discussions by presenting or leading discussions on critical high-performance networking issues. Session proposals are being solicited in any area related to high performance networks and networking software, with a special emphasis on the topics for this year’s Workshop. In keeping with the Workshop’s emphasis on collaboration, proposals for Birds of a Feather sessions and panels are particularly encouraged.”

How we built Oracle Cloud Infrastructure for HPC

In this video, Karan Batta from Oracle describes how the company built Oracle Cloud Infrastructure to deliver high performance for HPC applications. “Over the last 12 months, we have invested significantly, in both technology and partnerships, to make Oracle Cloud Infrastructure the best place to run your Big Compute and HPC workloads.”

Building Oracle Cloud Infrastructure with Bare-Metal

In this video, Taylor Newill from Oracle describes how the Oracle Cloud Infrastructure delivers high performance for HPC applications. “From the beginning, Oracle built their bare-metal cloud with a simple goal in mind: deliver the same performance in the cloud that clients are seeing on-prem.”

Designing Scalable HPC, Deep Learning, Big Data, and Cloud Middleware for Exascale Systems

DK Panda from Ohio State University gave this talk at the UK HPC Conference. “This talk will focus on challenges in designing HPC, Deep Learning, Big Data and HPC Cloud middleware for Exascale systems with millions of processors and accelerators. For the HPC domain, we will discuss the challenges in designing runtime environments for MPI+X (PGAS – OpenSHMEM/UPC/CAF/UPC++, OpenMP, and CUDA) programming models by taking into account support for multi-core systems (Xeon, ARM and OpenPower), high-performance networks, and GPGPUs (including GPUDirect RDMA).”
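
For readers unfamiliar with the MPI+X model the talk refers to, the sketch below shows the most common combination, MPI+OpenMP: OpenMP threads parallelize work within a node while MPI combines results across nodes. It is a minimal illustration of the hybrid model, not code from the talk.

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank, size;
    /* Request thread support so OpenMP threads can coexist with MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided < MPI_THREAD_FUNNELED)
        MPI_Abort(MPI_COMM_WORLD, 1);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local = 0.0;
    /* "X": OpenMP threads compute a per-rank partial result in parallel. */
    #pragma omp parallel reduction(+:local)
    local += omp_get_thread_num() + 1;

    double global = 0.0;
    /* "MPI": combine the per-rank results across the cluster. */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("ranks=%d global sum=%f\n", size, global);
    MPI_Finalize();
    return 0;
}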

Mellanox Powers Virtualized Machine Learning with VMware and NVIDIA

Today Mellanox announced that its RDMA (Remote Direct Memory Access) networking solutions for VMware vSphere enable virtualized Machine Learning solutions that achieve higher GPU utilization and efficiency. “As Moore’s Law has slowed, traditional CPU and networking technologies are no longer sufficient to support the emerging machine learning workloads,” said Kevin Deierling, vice president of marketing at Mellanox Technologies. “Using hardware compute accelerators such as NVIDIA T4 GPUs and Mellanox’s RDMA networking solutions has proven to boost application performance in virtualized deployments.”

RDMA, Scalable MPI-3 RMA, and Next-Generation Post-RDMA Interconnects

Torsten Hoefler from ETH Zurich gave this talk at the Swiss HPC Conference. “Network cards contain rather powerful processors optimized for data movement and limiting the functionality to remote direct memory access seems unnecessarily constraining. We develop sPIN, a portable programming model to offload simple packet processing functions to the network card.”
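
As a rough illustration of the sPIN idea, the sketch below shows a user-defined handler that a NIC could invoke for each arriving packet, filtering payloads before they ever cross the I/O bus. The handler name, types, and return convention here are hypothetical stand-ins, not the actual sPIN API; in the real system such handlers are compiled for and executed on the network card's processors.

#include <stdint.h>

/* Persistent per-message state kept on the NIC (hypothetical layout). */
struct pkt_state {
    uint64_t packets_seen;
    uint64_t bytes_dropped;
};

/* Hypothetical header handler: invoked once per arriving packet.
 * Returning 1 discards the payload on the NIC; returning 0 lets the
 * NIC DMA the payload into host memory as usual. */
int header_handler(const uint8_t *hdr, uint32_t hdr_len, struct pkt_state *st)
{
    st->packets_seen++;
    /* Example policy: filter on an application-level tag in the header. */
    if (hdr_len >= 4 && hdr[0] == 0x7f) {
        st->bytes_dropped += hdr_len;
        return 1; /* dropped: this packet never touches the PCIe bus */
    }
    return 0;     /* deliver to host memory */
}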

Agenda Posted: Exacomm 2019 Workshop at ISC High Performance

“The goal of this workshop is to bring together researchers and software/hardware designers from academia, industry and national laboratories who are involved in creating network-based computing solutions for extreme scale architectures. The objectives of this workshop will be to share the experiences of the members of this community and to learn the opportunities and challenges in the design trends for exascale communication architectures.”

HPC Breaks Through to the Cloud: Why It Matters

In this special guest feature, Scot Schultz from Mellanox writes that researchers are benefiting in a big way from HPC in the cloud. “HPC has many different advantages depending on the specific use case, but one aspect that these implementations have in common is their use of RDMA-based fabrics to improve compute performance and reduce latency.”
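
To make the RDMA claim concrete, here is a minimal libibverbs sketch that opens an RDMA device, allocates a protection domain, and registers a buffer so the adapter can move data in and out of it with no kernel involvement on the data path. Queue pair creation and connection setup are omitted for brevity; this assumes one RDMA-capable device is present.

#include <infiniband/verbs.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int n;
    struct ibv_device **devs = ibv_get_device_list(&n);
    if (!devs || n == 0) { fprintf(stderr, "no RDMA device found\n"); return 1; }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* Register a buffer so the HCA can DMA directly into/out of it;
     * the returned lkey/rkey authorize local and remote access. */
    size_t len = 1 << 20;
    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_WRITE);
    if (!mr) { fprintf(stderr, "registration failed\n"); return 1; }
    printf("registered %zu bytes, lkey=0x%x rkey=0x%x\n",
           len, mr->lkey, mr->rkey);

    ibv_dereg_mr(mr);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    free(buf);
    return 0;
}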

Faster Fabrics Running Against Limits of the Operating System, the Processor, and the I/O Bus

Christopher Lameter from Jump Trading gave this talk at the OpenFabrics Workshop in Austin. “In 2017 we got 100G fabrics, in 2018 200G fabrics, and in 2019 it looks like 400G technology may see a considerable amount of adoption. These bandwidths compete with, and are sometimes higher than, the internal bus speeds of the servers connected by these fabrics. I think we need to consider these developments and work on improving fabrics and the associated APIs so that access to these features becomes possible through vendor-neutral APIs. It needs to be possible to code in a portable way, not to a vendor-specific one.”
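
A quick back-of-the-envelope comparison illustrates the talk's point. The PCIe figures below are approximate usable bandwidths for a x16 slot after encoding overhead (roughly 15.75 GB/s for Gen3 and 31.5 GB/s for Gen4); they are ballpark assumptions, not measurements.

#include <stdio.h>

int main(void) {
    /* Fabric line rates from the talk, in Gb/s. */
    double fabrics_gbps[] = {100.0, 200.0, 400.0};
    /* Approximate usable PCIe x16 bandwidth in GB/s (assumed values). */
    double pcie3_x16 = 15.75, pcie4_x16 = 31.5;

    for (int i = 0; i < 3; i++) {
        double gbytes = fabrics_gbps[i] / 8.0;  /* bits to bytes */
        printf("%5.0f Gb/s fabric ~= %6.2f GB/s; PCIe3 x16 %s, PCIe4 x16 %s\n",
               fabrics_gbps[i], gbytes,
               gbytes > pcie3_x16 ? "exceeded" : "sufficient",
               gbytes > pcie4_x16 ? "exceeded" : "sufficient");
    }
    return 0;
}

Running this shows that a 400G fabric delivers about 50 GB/s, more than even a PCIe Gen4 x16 slot can carry, which is exactly the mismatch the talk highlights.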

Accelerating TensorFlow with RDMA for High-Performance Deep Learning

Xiaoyi Lu from Ohio State University gave this talk at the 2019 OpenFabrics Workshop in Austin. “Google’s TensorFlow is one of the most popular Deep Learning (DL) frameworks. We propose a unified way of achieving high performance through enhancing the gRPC runtime with Remote Direct Memory Access (RDMA) technology on InfiniBand and RoCE. Through our proposed RDMAgRPC design, TensorFlow only needs to run over the gRPC channel and achieves optimal performance.”
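
As a hedged illustration of the zero-copy idea behind such a design (not the paper's actual implementation), the sketch below posts a single RDMA WRITE that pushes a pre-registered tensor buffer directly into a peer's memory, bypassing the socket stack. It assumes the queue pair, completion queue, memory region, and the peer's remote_addr/rkey were already set up and exchanged out of band.

#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

int send_tensor(struct ibv_qp *qp, struct ibv_cq *cq, struct ibv_mr *mr,
                void *tensor_buf, uint32_t nbytes,
                uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)tensor_buf,
        .length = nbytes,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr, *bad = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.opcode     = IBV_WR_RDMA_WRITE;  /* zero-copy: no receiver CPU on the data path */
    wr.send_flags = IBV_SEND_SIGNALED;
    wr.sg_list    = &sge;
    wr.num_sge    = 1;
    wr.wr.rdma.remote_addr = remote_addr;
    wr.wr.rdma.rkey        = rkey;

    if (ibv_post_send(qp, &wr, &bad))
        return -1;

    /* Busy-poll the completion queue until the write finishes. */
    struct ibv_wc wc;
    while (ibv_poll_cq(cq, 1, &wc) == 0)
        ;
    return wc.status == IBV_WC_SUCCESS ? 0 : -1;
}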