Designing HPC & Deep Learning Middleware for Exascale Systems

DK Panda from Ohio State University presented this deck at the 2017 HPC Advisory Council Stanford Conference. “This talk will focus on challenges in designing runtime environments for exascale systems with millions of processors and accelerators to support various programming models. We will focus on MPI, PGAS (OpenSHMEM, CAF, UPC and UPC++) and Hybrid MPI+PGAS programming models by taking into account support for multi-core, high-performance networks, accelerators (GPGPUs and Intel MIC), virtualization technologies (KVM, Docker, and Singularity), and energy-awareness. Features and sample performance numbers from the MVAPICH2 libraries will be presented.”

GIGABYTE Selects Cavium QLogic FastLinQ Ethernet Solutions

“GIGABYTE servers – across standard, Open Compute Platform (OCP) and rack scale form factors – deliver exceptional value, performance and scalability for multi-tenant cloud and virtualized enterprise datacenters,” said Etay Lee, GM of GIGABYTE Technology’s Server Division. “The addition of QLogic 10GbE and 25GbE FastLinQ Ethernet NICs in OCP and Standard form factors will enable delivery on all of the tenets of open standards, while enabling key virtualization technologies like SR-IOV and full offloads for overlay networks using VxLAN, NVGRE and GENEVE.”

OpenFabrics Alliance Workshop 2017 – Call for Sessions Open

Each year the OpenFabrics Alliance (OFA) hosts a workshop devoted to advancing the state of the art in networking. “One secret to the enduring success of the workshop is the OFA’s emphasis on hosting an interactive, community-driven event. To continue that trend, we are once again reaching out to the community to create a rich program that addresses topics important to the networking industry. We’re looking for proposals for workshop sessions.”

Mellanox Ethernet Accelerates Baidu Machine Learning

Today Mellanox announced that its Spectrum Ethernet switches and ConnectX-4 100Gb/s Ethernet adapters have been selected by Baidu, the leading Chinese-language Internet search provider, for Baidu’s machine learning platforms. The need for higher data speeds and the most efficient data movement made Spectrum switches and RDMA-enabled ConnectX-4 adapters key components in enabling world-leading machine learning […]

Video: Azure High Performance Computing

“Run your Windows and Linux HPC applications using high performance A8 and A9 compute instances on Azure, and take advantage of a backend network with MPI latency under 3 microseconds and non-blocking 32 Gbps throughput. This backend network includes remote direct memory access (RDMA) technology on Windows and Linux that enables parallel applications to scale to thousands of cores. Azure provides you with high memory and HPC-class CPUs to help you get results fast. Scale up and down based upon what you need and pay only for what you use to reduce costs.”
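The quoted latency figure comes from ping-pong style microbenchmarks (such as the OSU osu_latency test), which bounce a small message between two endpoints and halve the average round-trip time. Running a real MPI/RDMA benchmark requires the Azure A8/A9 fabric, but the measurement pattern itself can be sketched with a plain loopback TCP socket; this is a hypothetical illustration of the methodology, not Azure's tooling or an MPI benchmark:

```python
# Ping-pong latency microbenchmark sketch (hypothetical, loopback TCP).
# MPI latency tests use the same pattern: send a small message, wait for
# the echo, and report half the average round-trip time as one-way latency.
import socket
import threading
import time

def _echo_server(server_sock):
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1)
            if not data:
                break
            conn.sendall(data)  # echo the byte straight back

def measure_latency(iters=1000):
    server = socket.socket()
    server.bind(("127.0.0.1", 0))   # OS assigns a free port
    server.listen(1)
    port = server.getsockname()[1]
    threading.Thread(target=_echo_server, args=(server,), daemon=True).start()

    client = socket.socket()
    client.connect(("127.0.0.1", port))
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # no batching

    payload = b"x"
    start = time.perf_counter()
    for _ in range(iters):
        client.sendall(payload)
        assert client.recv(1) == payload  # block until the echo returns
    elapsed = time.perf_counter() - start
    client.close()

    # One-way latency estimate: half the average round-trip time.
    return elapsed / iters / 2

if __name__ == "__main__":
    print(f"loopback one-way latency: {measure_latency() * 1e6:.1f} us")
```

Loopback TCP latency on a typical machine lands in the tens of microseconds; RDMA-backed MPI on Azure's backend network reaches under 3 microseconds by bypassing the kernel network stack entirely.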

Call for Papers: International Workshop on High-Performance Big Data Computing (HPBDC)

The 3rd annual International Workshop on High-Performance Big Data Computing (HPBDC) has issued its Call for Papers. Featuring a keynote by Prof. Satoshi Matsuoka from Tokyo Institute of Technology, the event takes place May 29, 2017 in Orlando, FL.

HIP and CAFFE Porting and Profiling with AMD’s ROCm

In this video from SC16, Ben Sander from AMD presents: HIP and CAFFE Porting and Profiling with AMD’s ROCm. “We are excited to present ROCm, the first open-source HPC/Hyperscale-class platform for GPU computing that’s also programming-language independent. We are bringing the UNIX philosophy of choice, minimalism and modular software development to GPU computing. The new ROCm foundation lets you choose or even develop tools and a language runtime for your application. ROCm is built for scale; it supports multi-GPU computing in and out of server-node communication through RDMA.”

InfiniBand: When State-of-the-Art becomes State-of-the-Smart

Scot Schultz from Mellanox writes that the company is moving the industry toward a world-class offload network architecture that will pave the way to Exascale. “Mellanox, alongside many industry thought leaders, is at the forefront of advancing the Co-Design approach. The key value and core goal is to strive for more CPU offload capabilities and acceleration techniques while maintaining forward and backward compatibility of new and existing infrastructures; and the result is nothing less than the world’s most advanced interconnect, which continues to yield the most powerful and efficient supercomputers ever deployed.”

Mellanox Shipping ConnectX-5 Adapter

“We are pleased to start shipping the ConnectX-5, the industry’s most advanced network adapter, to our key partners and customers, allowing them to leverage our smart network architecture to overcome performance limitations and to gain a competitive advantage,” said Eyal Waldman, Mellanox president and CEO. “ConnectX-5 enables our customers and partners to achieve higher performance, scalability and efficiency of their InfiniBand or Ethernet server and storage platforms. Our interconnect solutions, when combined with Intel, IBM, NVIDIA or ARM CPUs, allow users across the world to achieve significantly better returns on investment from their IT infrastructure.”

Supermicro Rolls Out New Servers with Tesla P100 GPUs

“Our high-performance computing solutions enable deep learning, engineering, and scientific fields to scale out their compute clusters to accelerate their most demanding workloads and achieve the fastest time-to-results with maximum performance per watt, per square foot, and per dollar,” said Charles Liang, President and CEO of Supermicro. “With our latest innovations incorporating the new NVIDIA P100 processors in performance- and density-optimized 1U and 4U architectures with NVLink, our customers can accelerate their applications and innovations to address the most complex real-world problems.”