UKRI Awards ARCHER2 Supercomputer Services Contract

UKRI has awarded contracts to run elements of the next national supercomputer, ARCHER2, which will represent a significant step forward in capability for the UK’s science community. ARCHER2 is provided by UKRI, EPCC, Cray (an HPE company) and the University of Edinburgh. “ARCHER2 will be a Cray Shasta system with an estimated peak performance of 28 PFLOP/s. The machine will have 5,848 compute nodes, each with dual AMD EPYC Zen2 (Rome) 64-core CPUs at 2.2GHz, giving 748,544 cores in total and 1.57 PBytes of total system memory.”
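
Those headline numbers are easy to sanity-check. Here is a minimal Python sketch, assuming Zen 2’s usual 16 double-precision FLOPs per core per cycle (a figure not given in the announcement):

```python
# Back-of-the-envelope check of the quoted ARCHER2 figures.
# Assumption: each Zen2 core retires 16 double-precision FLOPs per cycle
# (two 256-bit FMA pipes); this is not stated in the announcement.

nodes = 5848
cores_per_node = 2 * 64          # dual 64-core AMD EPYC Rome CPUs
clock_hz = 2.2e9                 # 2.2 GHz base clock
flops_per_cycle = 16             # assumed DP FLOPs/cycle for Zen2

total_cores = nodes * cores_per_node
peak_flops = total_cores * clock_hz * flops_per_cycle

print(f"total cores: {total_cores:,}")           # 748,544, as quoted
print(f"peak: {peak_flops / 1e15:.1f} PFLOP/s")  # ~26.3 at base clock
```

At the 2.2 GHz base clock this works out to roughly 26 PFLOP/s, so the quoted 28 PFLOP/s estimate presumably assumes a somewhat higher sustained clock.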

Distributed HPC Applications with Unprivileged Containers

Felix Abecassis and Jonathan Calmels from NVIDIA gave this talk at FOSDEM 2020. “We will present the challenges in doing distributed deep learning training at scale on shared heterogeneous infrastructure. At NVIDIA, we use containers extensively in our GPU clusters for both HPC and deep learning applications. We love containers for how they simplify software packaging and enable reproducibility without sacrificing performance.”

What to Expect at SC19

In this special guest feature, Dr. Rosemary Francis gives her perspective on what to look for at the SC19 conference next week in Denver. “There are always many questions circling the HPC market in the run-up to Supercomputing. In 2019, the focus is even more on the cloud than in previous years. Here are a few of the topics that could occupy your coffee-queue conversations in Denver this year.”

Getting Smart About Slurm in the Cloud

This timely article from our friends over at Univa takes a look at how often the popular HPC workload manager Slurm (Simple Linux Utility for Resource Management) is used in the cloud. In a recent insideHPC survey sponsored by Univa, all Slurm users surveyed reported using public cloud services to at least some degree.

Harvard Names New Lenovo HPC Cluster after Astronomer Annie Jump Cannon

Harvard has deployed a liquid-cooled supercomputer from Lenovo at its FASRC computing center. The system, named “Cannon” in honor of astronomer Annie Jump Cannon, is a large-scale HPC cluster supporting scientific modeling and simulation for thousands of Harvard researchers. “This new cluster will have 30,000 cores of Intel 8268 “Cascade Lake” processors. Each node will have 48 cores and 192 GB of RAM.”
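
Taken at face value, those per-node specs pin down the rest of the system’s shape. A quick Python sketch (the node count and per-core memory below are derived, not quoted in the announcement):

```python
# Derived from the quoted Cannon specs: 30,000 cores total,
# 48 cores and 192 GB of RAM per node. The totals computed here
# are estimates, not figures from the announcement.
total_cores = 30_000
cores_per_node = 48
ram_per_node_gb = 192

nodes = total_cores // cores_per_node               # 625 nodes
ram_per_core_gb = ram_per_node_gb / cores_per_node  # 4 GB per core
total_ram_tb = nodes * ram_per_node_gb / 1000       # ~120 TB aggregate

print(nodes, ram_per_core_gb, total_ram_tb)
```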

Job of the Week: Software Developer/SLURM System Administrator at Adaptive Computing

Adaptive Computing is seeking a Software Developer/SLURM System Administrator in our Job of the Week. The position is located in Naples, Florida. “This is an excellent opportunity to be part of the core team in a rapidly growing company. The company enjoys a rock-solid industry reputation in HPC Workload Scheduling and Cloud Management Solutions. Adaptive Computing works with some of the largest commercial enterprises, government agencies, and academic institutions in the world.”

Univa Brings Navops Launch Multi-Cloud Automation to the Slurm Community

Today Univa announced enhancements to its powerful Navops Launch HPC cloud-automation platform to support the widely-used Slurm workload scheduler. “Univa’s support for open-source Slurm – installed at approximately 60% of the leading HPC centers – is an important milestone in the evolution of HPC cloud-automation,” said Srini Chari, founder and managing partner of Cabot Partners. “Navops Launch is one of the few HPC management tools that is both multi-cloud and multi-scheduler.”

Interview: Univa Steps Up with NAVOPS 2.0 for Moving HPC Workloads to the Cloud

Today Univa announced the newest release of its popular Navops Launch cloud-automation platform. Navops Launch 2.0 delivers new capabilities that simplify the migration of enterprise HPC workloads to the cloud while reducing costs by 30-40 percent. To learn more, we caught up with Univa CEO Gary Tyreman.

How SUSE Powers High Performance Computing

SUSE Linux Enterprise High Performance Computing provides a parallel computing platform for high-performance data analytics workloads such as artificial intelligence and machine learning. Fueled by the need for more compute power and scale, businesses around the world are recognizing that a high-performance computing infrastructure is vital to supporting the analytics applications of tomorrow.

Podcast: HPC & AI Convergence Enables AI Workload Innovation

In this Conversations in the Cloud podcast, Esther Baldwin from Intel describes how the convergence of HPC and AI is driving innovation. “On the topic of HPC & AI converged clusters, there’s a perception that if you want to do AI, you must stand up a separate cluster, which Esther notes is not true. Existing HPC customers can do AI on their existing infrastructure with solutions like HPC & AI converged clusters.”