Checkpointing the Un-checkpointable: MANA and the Split-Process Approach

Gene Cooperman from Northeastern University gave this talk at the MVAPICH User Group. “This talk presents an efficient, new software architecture: split processes. The ‘MANA for MPI’ software demonstrates this split-process architecture. The MPI application code resides in ‘upper-half memory’, and the MPI/network libraries reside in ‘lower-half memory’.”
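
A minimal, hypothetical C sketch of the split-process idea is shown below (it is not the actual MANA code): the application in the upper half reaches MPI only through a small function-pointer table, while the lower half is the only code that touches the real MPI/network library, so the two halves can occupy separate memory regions.

/* Hypothetical sketch of the split-process idea; not the MANA implementation. */
#include <mpi.h>
#include <stdio.h>

/* Table of entry points exported by the lower half (names are illustrative). */
struct lower_half_api {
    int (*bcast)(void *buf, int count, MPI_Datatype dt, int root, MPI_Comm comm);
};

/* "Lower half": the only code that calls the real MPI/network library. */
static int lh_bcast(void *buf, int count, MPI_Datatype dt, int root, MPI_Comm comm)
{
    return MPI_Bcast(buf, count, dt, root, comm);
}

static struct lower_half_api lower = { .bcast = lh_bcast };

/* "Upper half": application code that knows only the table, not the library. */
int main(int argc, char **argv)
{
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        value = 42;
    lower.bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* indirect call into the lower half */
    printf("rank %d received %d\n", rank, value);
    MPI_Finalize();
    return 0;
}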

Video: Three Perspectives on Message Passing

Robert Harrison from Brookhaven gave this talk at the MVAPICH User Group. “MADNESS, TESSE/EPEXA, and MolSSI are three quite different large and long-lived projects that provide different perspectives and driving needs for the future of message passing. All three of these projects employ MPI and have a vested interest in computation at all scales, spanning the classroom to future exascale systems.”

Benchmarking MPI Applications in Singularity Containers on Traditional HPC and Cloud Infrastructures

Andrei Plamada from ETH Zurich gave this talk at the hpc-ch forum on Cloud and Containers. “Singularity is a container solution that promises to both integrate MPI applications seamlessly and run containers without privilege escalation. These benefits make Singularity a potentially good candidate for the scientific high-performance computing community. However, the performance overhead introduced by Singularity is unclear. In this work we will analyze the overhead and the user experience on both traditional HPC and cloud infrastructures.”
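
A simple MPI ping-pong latency microbenchmark, written in C, is the kind of measurement such a comparison rests on (this snippet is illustrative and not taken from the talk): the same binary is timed on bare metal and again inside a container, and the difference exposes any overhead.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int iters = 1000;
    char byte = 0;
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {            /* rank 0 sends, then waits for the echo */
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {     /* rank 1 echoes the message back */
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("one-way latency: %.3f us\n", (t1 - t0) / (2.0 * iters) * 1e6);

    MPI_Finalize();
    return 0;
}

One might run it natively with "mpirun -np 2 ./pingpong" and then inside a container with "mpirun -np 2 singularity exec app.sif ./pingpong" (app.sif is a placeholder image name) and compare the reported latencies.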

Converging Workflows Pushing Converged Software onto HPC Platforms

Are we witnessing the convergence of HPC, big data analytics, and AI? Once, these were separate domains, each with its own system architecture and software stack, but the data deluge is driving their convergence. Traditional big science HPC is looking more like big data analytics and AI, while analytics and AI are taking on the flavor of HPC.

Scalable and Distributed DNN Training on Modern HPC Systems

DK Panda from Ohio State University gave this talk at the Swiss HPC Conference. “We will provide an overview of interesting trends in DNN design and how cutting-edge hardware architectures are playing a key role in moving the field forward. We will also present an overview of different DNN architectures and DL frameworks. Most DL frameworks started with a single-node/single-GPU design.”
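
To make the communication pattern behind multi-node training concrete, here is an illustrative C/MPI sketch (not code from the talk) of the core step in data-parallel DNN training: each rank computes gradients on its own shard of the mini-batch, and an allreduce averages them so every rank applies the same update.

#include <mpi.h>
#include <stdio.h>

#define NPARAMS 4   /* stand-in for the model's parameter count */

int main(int argc, char **argv)
{
    int rank, size;
    float grad[NPARAMS];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Pretend each rank computed gradients on its own mini-batch shard. */
    for (int i = 0; i < NPARAMS; i++)
        grad[i] = (float)(rank + 1);

    /* Sum the gradients across ranks in place, then divide to average. */
    MPI_Allreduce(MPI_IN_PLACE, grad, NPARAMS, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);
    for (int i = 0; i < NPARAMS; i++)
        grad[i] /= (float)size;

    if (rank == 0)
        printf("averaged gradient[0] = %f\n", grad[0]);

    MPI_Finalize();
    return 0;
}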

Call for Papers: EuroMPI Conference in Zurich

The EuroMPI conference has issued its Call for Papers. The event takes place September 10-13 in Zurich, Switzerland. “Since 1994, the EuroMPI conference has been the preeminent meeting for users, developers and researchers to interact and discuss new developments and applications of message-passing parallel computing, in particular in and related to the Message Passing Interface (MPI). This includes parallel programming interfaces, libraries and languages, architectures, networks, algorithms, tools, applications, and High Performance Computing with particular focus on quality, portability, performance and scalability.”

How Mellanox SHARP Technology Speeds AI Workloads

Mellanox Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) technology improves the performance of MPI operations by offloading collective operations from the CPU to the switch network, eliminating the need to send data multiple times between endpoints. This innovative approach decreases the amount of data traversing the network as aggregation nodes are reached and dramatically reduces MPI operation time. Implementing collective communication algorithms in the network also has additional benefits, such as freeing up valuable CPU resources for computation rather than using them to process communication.
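
The application side of such a collective does not change when the reduction is offloaded to the switches; the sketch below (illustrative only, not Mellanox code) uses a non-blocking MPI allreduce to show the compute/communication overlap that in-network aggregation is meant to widen.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double local, global = 0.0;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    local = rank + 1.0;

    /* Start the reduction; the network (or the switches, with offload) carries it out. */
    MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD, &req);

    /* ... the CPU is free to do useful computation here while the collective progresses ... */

    MPI_Wait(&req, MPI_STATUS_IGNORE);
    if (rank == 0)
        printf("sum across ranks = %f\n", global);

    MPI_Finalize();
    return 0;
}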

Interview: The Importance of the Message Passing Interface to Supercomputing

In this video, Mike Bernhardt from the Exascale Computing Project catches up with ORNL’s David Bernholdt at SC18. They discuss the conference, his career, the evolution and significance of the Message Passing Interface (MPI) in parallel computing, and how ECP has influenced his team’s efforts.

Podcast: Evolving MPI for Exascale Applications

In this episode of Let’s Talk Exascale, Pavan Balaji and Ken Raffenetti describe their efforts to help MPI, the de facto programming model for parallel computing, run as efficiently as possible on exascale systems. “We need to look at a lot of key technical challenges, like performance and scalability, when we go up to this scale of machines. Performance is one of the biggest things that people look at. Aspects with respect to heterogeneity become important.”

Amazon and Libfabric: A Case Study in Flexible HPC Infrastructure

Brian Barrett from Amazon gave this talk at the 2018 OpenFabrics Workshop. “As network performance becomes a larger bottleneck in application performance, AWS is investing in improving HPC network performance. Our initial investment focused on improving performance in open source MPI implementations, with positive results. Recently, however, we have pivoted to focusing on using libfabric to improve point-to-point performance.”
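
For a flavor of what programming against libfabric looks like (this sketch is illustrative and unrelated to AWS's actual code; the version number and capability hints are assumptions), the snippet below queries the available providers, the first step before building point-to-point endpoints on top of the interface.

#include <rdma/fabric.h>
#include <stdio.h>

int main(void)
{
    struct fi_info *hints = fi_allocinfo();
    struct fi_info *info = NULL;

    hints->ep_attr->type = FI_EP_RDM;   /* reliable, connectionless endpoints */
    hints->caps = FI_MSG;               /* basic two-sided messaging */

    int ret = fi_getinfo(FI_VERSION(1, 5), NULL, NULL, 0, hints, &info);
    if (ret) {
        fprintf(stderr, "fi_getinfo failed: %d\n", ret);
        return 1;
    }

    /* List every provider/fabric pair the library can offer. */
    for (struct fi_info *cur = info; cur; cur = cur->next)
        printf("provider: %s, fabric: %s\n",
               cur->fabric_attr->prov_name, cur->fabric_attr->name);

    fi_freeinfo(info);
    fi_freeinfo(hints);
    return 0;
}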