Sign up for our newsletter and get the latest HPC news and analysis.
Send me information from insideHPC:


Articles and news on parallel programming and code modernization

Georgia Tech’s Vivek Sarkar Wins 2020 ACM-IEEE CS Ken Kennedy Award

The Association for Computing Machinery (ACM) and IEEE Computer Society (IEEE CS) have named Vivek Sarkar of Georgia Institute of Technology winner of the 2020 ACM/IEEE CS Ken Kennedy Award. Sarkar is recognized for “foundational technical contributions to the area of programmability and productivity in parallel computing, as well as leadership contributions to professional service, mentoring, […]

The Hyperion-insideHPC Interviews: Irene Qualters’ Long View of HPC, from a Start-up Called Cray to Today’s ‘No-Analog’ Research at Los Alamos

Irene Qualters, a senior-level manager at Los Alamos National Laboratory, has been at the forefront of the convergence of supercomputing and science for decades, extending back to joining Cray as one of that company’s first 100 employees. Few members of the HPC community can match her wealth of experience and wisdom regarding the future of scientific computing and its “no-analog” physics-informed AI exploration of problems confronting our planet, such as climate change.

2020 OpenFabrics Alliance Workshop – Video Gallery

Welcome to the 2020 OpenFabrics Workshop video gallery. The OpenFabrics Alliance (OFA) is focused on accelerating development of high performance fabrics. The annual OFA Workshop, held in virtual format this year, is a premier means of fostering collaboration among those who develop fabrics, deploy fabrics, and create applications that rely on fabrics. It is the […]

Google Unveils 1st Public Cloud VMs using Nvidia Ampere A100 Tensor GPUs

Google today introduced the Accelerator-Optimized VM (A2) instance family on Google Compute Engine based on the NVIDIA Ampere A100 Tensor Core GPU, launched in mid-May. Available in alpha and with up to 16 GPUs, A2 VMs are the first A100-based offering in a public cloud, according to Google. At its launch, Nvidia said the A100, built on the company’s new Ampere architecture, delivers “the greatest generational leap ever,” according to Nvidia, enhancing training and inference computing performance by 20x over its predecessors.

SeRC Turns to oneAPI Multi-Chip Programming Model for Accelerated Research

At ISC 2020 Digital, the Swedish e-Science Research Center (SeRC), Stockholm, has announced plans to use Intel’s oneAPI unified programming language by researchers conducting  massive simulations powered by CPUs and GPUs. The center said it chose the oneAPI programming model, designed to span CPUs, GPUs, FPGAs and other architectures and silicon,  to accelerate compute for research using GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics software, developed by SeRC and first released in 1991

‘Rocky Year’ – Hyperion’s HPC Market Update: COVID-19 Hits Q1 Revenues, Cloud HPC Boom, Shift in Server Vendor Standings

Instead of its usual mid-year HPC market update presented at the ISC conference in Frankfurt, industry analyst firm Hyperion Research has virtually released its latest findings – including estimates of COVID-19 ‘s impact on the industry, on growth of HPC in public clouds and a significant shift in the competitive standing among the leading HPC server vendors. Taking 2019 in total, Hyperion sized the HPC server market at $13.7 billion, record revenues

New NVIDIA DGX A100 Packs Record 5 Petaflops of AI Performance for Training, Inference, and Data Analytics

Today NVIDIA unveiled the NVIDIA DGX A100 AI system, delivering 5 petaflops of AI performance and consolidating the power and capabilities of an entire data center into a single flexible platform. “DGX A100 systems integrate eight of the new NVIDIA A100 Tensor Core GPUs, providing 320GB of memory for training the largest AI datasets, and the latest high-speed NVIDIA Mellanox HDR 200Gbps interconnects.”

Podcast: A Shift to Modern C++ Programming Models

In this Code Together podcast, Alice Chan from Intel and Hal Finkel from Argonne National Lab discuss how the industry is uniting to address the need for programming portability and performance across diverse architectures, particularly important with the rise of data-intensive workloads like artificial intelligence and machine learning. “We discuss the important shift to modern C++ programming models, and how the cross-industry oneAPI initiative, and DPC++, bring much-needed portable performance to today’s developers.”

Video: Profiling Python Workloads with Intel VTune Amplifier

Paulius Velesko from Intel gave this talk at the ALCF Many-Core Developer Sessions. “This talk covers efficient profiling techniques that can help to dramatically improve the performance of code by identifying CPU and memory bottlenecks. Efficient profiling techniques can help dramatically improve the performance of code by identifying CPU and memory bottlenecks. We will demonstrate how to profile a Python application using Intel VTune Amplifier, a full-featured profiling tool.”

Breaking Boundaries with Data Parallel C++

“There’s a new programming language in town. Called Data Parallel C++ (DPC++), it allows developers to reuse code across diverse hardware targets—CPUs and accelerators—and perform custom tuning for a specific accelerator. DPC++ is part of oneAPI—an Intel-led initiative to create a unified programming model for cross-architecture development. Based on familiar C++ and SYCL, DPC++ is an open alternative to single-architecture proprietary approaches and helps developers create solutions that better meet specialized workload requirements.”