Managing Node Configuration with 1000s of Nodes

Ira Weiny from Intel presented this talk at the OpenFabrics Workshop. “Individual node configuration when managing 1000s or 10s of thousands of nodes in a cluster can be a daunting challenge. Two key daemons are now part of the rdma-core package which aid the management of individual nodes in a large fabric: IBACM and rdma-ndd.”

Building Efficient HPC Clouds with MVAPICH2 and RDMA-Hadoop over SR-IOV IB Clusters

Xiaoyi Lu from Ohio State University presented this talk at the OpenFabrics Workshop. “Single Root I/O Virtualization (SR-IOV) technology has been steadily gaining momentum for high performance interconnects such as InfiniBand. SR-IOV can deliver near native performance but lacks locality-aware communication support. This talk presents an efficient approach to building HPC clouds based on MVAPICH2 and RDMA-Hadoop with SR-IOV.”

Experiences with NVMe over Fabrics

“Using RDMA, NVMe over Fabrics (NVMe-oF) provides the high bandwidth and low-latency characteristics of NVMe to remote devices. Moreover, these performance traits are delivered with negligible CPU overhead as the bulk of the data transfer is conducted by RDMA. In this session, we present an overview of NVMe-oF and its implementation in Linux. We point out the main design choices and evaluate NVMe-oF performance for both InfiniBand and RoCE fabrics.”

Video: RDMA on ARM

Pavel Shamis from ARM Research presented this talk at the OpenFabrics Workshop. “With the emerging availability of server platforms based on the ARM CPU architecture, it is important to understand how ARM integrates with the RDMA hardware and software ecosystem. In this talk, we will give an overview of the ARM architecture and system software stack. We will discuss how ARM CPUs interact with network devices and accelerators. In addition, we will share our experience in enabling the RDMA software stack (OFED/MOFED Verbs) and one-sided communication libraries (Open UCX, OpenSHMEM/SHMEM) on ARM and share preliminary evaluation results.”

Designing HPC & Deep Learning Middleware for Exascale Systems

DK Panda from Ohio State University presented this deck at the 2017 HPC Advisory Council Stanford Conference. “This talk will focus on challenges in designing runtime environments for exascale systems with millions of processors and accelerators to support various programming models. We will focus on MPI, PGAS (OpenSHMEM, CAF, UPC and UPC++) and Hybrid MPI+PGAS programming models by taking into account support for multi-core, high-performance networks, accelerators (GPGPUs and Intel MIC), virtualization technologies (KVM, Docker, and Singularity), and energy-awareness. Features and sample performance numbers from the MVAPICH2 libraries will be presented.”

GIGABYTE Selects Cavium QLogic FastLinQ Ethernet Solutions

“GIGABYTE servers – across standard, Open Compute Platform (OCP) and rack scale form factors – deliver exceptional value, performance and scalability for multi-tenant cloud and virtualized enterprise datacenters,” said Etay Lee, GM of GIGABYTE Technology’s Server Division. “The addition of QLogic 10GbE and 25GbE FastLinQ Ethernet NICs in OCP and Standard form factors will enable delivery on all of the tenets of open standards, while enabling key virtualization technologies like SR-IOV and full offloads for overlay networks using VxLAN, NVGRE and GENEVE.”

OpenFabrics Alliance Workshop 2017 – Call for Sessions Open

Each year the OpenFabrics Alliance (OFA) hosts a workshop devoted to advancing the state of the art in networking. “One secret to the enduring success of the workshop is the OFA’s emphasis on hosting an interactive, community-driven event. To continue that trend, we are once again reaching out to the community to create a rich program that addresses topics important to the networking industry. We’re looking for proposals for workshop sessions.”

Mellanox Ethernet Accelerates Baidu Machine Learning

Today Mellanox announced that Spectrum Ethernet switches and ConnectX-4 100Gb/s Ethernet adapters have been selected by Baidu, the leading Chinese-language Internet search provider, for Baidu’s Machine Learning platforms. The need for higher data speed and the most efficient data movement placed Spectrum and RDMA-enabled ConnectX-4 adapters as key components to enable world leading machine learning […]

Video: Azure High Performance Computing

“Run your Windows and Linux HPC applications using high performance A8 and A9 compute instances on Azure, and take advantage of a backend network with MPI latency under 3 microseconds and non-blocking 32 Gbps throughput. This backend network includes remote direct memory access (RDMA) technology on Windows and Linux that enables parallel applications to scale to thousands of cores. Azure provides you with high memory and HPC-class CPUs to help you get results fast. Scale up and down based upon what you need and pay only for what you use to reduce costs.”

Call for Papers: International Workshop on High-Performance Big Data Computing (HPBDC)

The 3rd annual International Workshop on High-Performance Big Data Computing (HPBDC) has issued its Call for Papers. Featuring a keynote by Prof. Satoshi Matsuoka from Tokyo Institute of Technology, the event takes place May 29, 2017 in Orlando, FL.