Podcast: A Shift to Modern C++ Programming Models

In this Code Together podcast, Alice Chan from Intel and Hal Finkel from Argonne National Lab discuss how the industry is uniting to address the need for programming portability and performance across diverse architectures, particularly important with the rise of data-intensive workloads like artificial intelligence and machine learning. “We discuss the important shift to modern C++ programming models, and how the cross-industry oneAPI initiative, and DPC++, bring much-needed portable performance to today’s developers.”

How to Achieve High-Performance, Scalable and Distributed DNN Training on Modern HPC Systems

DK Panda from Ohio State University gave this talk at the Stanford HPC Conference. “This talk will focus on a range of solutions being carried out in my group to address these challenges. The solutions will include: 1) MPI-driven Deep Learning, 2) Co-designing Deep Learning Stacks with High-Performance MPI, 3) Out-of-core DNN training, and 4) Hybrid (Data and Model) parallelism. Case studies to accelerate DNN training with popular frameworks like TensorFlow, PyTorch, MXNet and Caffe on modern HPC systems will be presented.”
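For readers unfamiliar with the data-parallel pattern Panda describes, the sketch below is a minimal, hypothetical illustration rather than his group's MVAPICH2-based stack: each MPI rank computes gradients on its own shard of the data, and the gradients are averaged with an allreduce before every weight update, here using mpi4py and NumPy on a toy linear model.

```python
# Minimal sketch of MPI-driven data parallelism: each rank computes
# gradients on its own data shard, then gradients are averaged with
# MPI Allreduce before every weight update.
# Requires mpi4py and NumPy; run e.g. `mpirun -np 4 python dp_sketch.py`.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

rng = np.random.default_rng(seed=rank)          # each rank sees different data
w = np.zeros(128)                               # replicated model weights
lr = 0.01

for step in range(100):
    x = rng.standard_normal((32, 128))          # this rank's mini-batch shard
    y = rng.standard_normal(32)
    grad = x.T @ (x @ w - y) / len(y)           # local gradient (linear model)

    avg_grad = np.empty_like(grad)
    comm.Allreduce(grad, avg_grad, op=MPI.SUM)  # sum gradients across ranks
    avg_grad /= size                            # average them

    w -= lr * avg_grad                          # identical update on every rank
```

The same allreduce-of-gradients idea underlies MPI-backed data parallelism in frameworks such as TensorFlow and PyTorch; model parallelism and out-of-core training, also covered in the talk, partition the network itself rather than the data.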

Appentra raises €1.8M for its Parallelware Analyzer software

Today Appentra announced that the company has raised €1.8M in new funding in a round led by Armilar Venture Partners and K Fund. Appentra is a global deep-tech company that delivers products based on its Parallelware technology, a unique approach to static code analysis specialized in parallelism. The company's aim is to make parallel programming easier, enabling everyone to make the best use of parallel computing hardware, from the multi-cores in a laptop to the fastest supercomputers. “During the last months we have been working hand in hand with the new investors, and we are proud to say that they will definitely bring in a strong expertise as international VCs specialized in B2B software companies, which will help us to fully realize Appentra’s vision.”

Video: Preparing to program Aurora at Exascale – Early experiences and future directions

Hal Finkel from Argonne gave this talk at IWOCL / SYCLcon 2020. “Argonne National Laboratory’s Leadership Computing Facility will be home to Aurora, our first exascale supercomputer. This presentation will summarize the experiences of our team as we prepare for Aurora, exploring how to port applications to Aurora’s architecture and programming models, and distilling the challenges and best practices we’ve developed to date.”

Podcast: Spack Helps Automate Deployment of Supercomputer Software

In this Let’s Talk Exascale podcast, Todd Gamblin from LLNL describes how the Spack flexible package manager helps automate the deployment of software on supercomputer systems. “After many hours building software on Lawrence Livermore’s supercomputers, in 2013 Todd Gamblin created the first prototype of a package manager he named Spack (Supercomputer PACKage manager). The tool caught on, and development became a grassroots effort as colleagues began to use the tool.”
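To give a flavor of what Spack automates, here is a minimal, hypothetical package recipe written against recent Spack versions; the package name, URL, checksum, and CMake option are placeholders, not a real Spack package.

```python
# Hypothetical Spack recipe (package.py) for an imaginary "mylib"
# library -- name, URL, and checksum are placeholders.
from spack.package import *


class Mylib(CMakePackage):
    """Example MPI-enabled library packaged for Spack."""

    homepage = "https://example.org/mylib"
    url = "https://example.org/mylib/mylib-1.0.0.tar.gz"

    version("1.0.0", sha256="0000000000000000000000000000000000000000000000000000000000000000")

    variant("cuda", default=False, description="Build with CUDA support")

    depends_on("mpi")
    depends_on("cuda", when="+cuda")

    def cmake_args(self):
        # Translate the Spack spec into CMake options.
        return [self.define_from_variant("ENABLE_CUDA", "cuda")]
```

With a recipe like this in a Spack repository, a command such as `spack install mylib +cuda` would build the library and its MPI and CUDA dependencies from source, which is the kind of deployment work Gamblin describes automating on LLNL's systems.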

Khronos Group Releases OpenCL 3.0

Today, the Khronos Group consortium released the OpenCL 3.0 Provisional Specifications. OpenCL 3.0 realigns the OpenCL roadmap to enable developer-requested functionality to be broadly deployed by hardware vendors, and it significantly increases deployment flexibility by empowering conformant OpenCL implementations to focus on functionality relevant to their target markets. “Many of our customers want a GPU programming language that runs on all devices, and with growing deployment in edge computing and mobile, this need is increasing,” said Vincent Hindriksen, founder and CEO of Stream HPC. “OpenCL is the only solution for accessing diverse silicon acceleration and many key software stacks use OpenCL/SPIR-V as a backend. We are very happy that OpenCL 3.0 will drive even wider industry adoption, as it reassures our customers that their past and future investments in OpenCL are justified.”
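As a reminder of what the OpenCL host API looks like in practice, the snippet below is a small sketch using the PyOpenCL bindings (an illustration, not part of the OpenCL 3.0 announcement): it builds a kernel and runs a vector addition on whatever OpenCL device is available.

```python
# Minimal OpenCL host program via PyOpenCL: build a kernel and run a
# vector addition on whatever device create_some_context() picks.
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

kernel_src = """
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *c) {
    int i = get_global_id(0);
    c[i] = a[i] + b[i];
}
"""
prg = cl.Program(ctx, kernel_src).build()

a = np.random.rand(1024).astype(np.float32)
b = np.random.rand(1024).astype(np.float32)
c = np.empty_like(a)

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
c_buf = cl.Buffer(ctx, mf.WRITE_ONLY, c.nbytes)

prg.vadd(queue, a.shape, None, a_buf, b_buf, c_buf)  # launch 1024 work-items
cl.enqueue_copy(queue, c, c_buf)                      # read the result back

assert np.allclose(c, a + b)
```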

EPEEC Project Fosters Heterogeneous HPC Programming in Europe

The European Programming Environment for Programming Productivity of Heterogeneous Supercomputers (EPEEC) is a project that aims to combine European-made programming models and performance tools to help relieve the burden of targeting highly heterogeneous supercomputers. The hope is that the project will make researchers' jobs easier by letting them use large-scale HPC systems more effectively.

Job of the Week: Senior Level Unix/Linux Systems Engineer at Lockheed Martin Space Systems

Lockheed Martin Space is seeking a Senior Level Unix/Linux Systems Engineer in our Job of the Week. “The coolest jobs on this planet… or any other… are with Lockheed Martin Space. We are seeking a Multi-Level Security Subject Matter Expert for Unix/Linux/SE Linux/HPC System Engineers-Developers-Administrators and Research and Development HPC program team.”

Video: Energy Efficient Computing using Dynamic Tuning

Lubomir Riha from IT4Innovations gave this talk as part of the POP HPC webinar series. “This webinar focused on tools designed to improve the energy-efficiency of HPC applications using a methodology of dynamic tuning of HPC applications, developed under the H2020 READEX project. The READEX methodology has been designed for exploiting the dynamic behaviour of software. At design time, different runtime situations (RTS) are detected and optimized system configurations are determined. RTSs with the same configuration are grouped into scenarios, forming the tuning model. At runtime, the tuning model is used to switch system configurations dynamically.”
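The sketch below is purely conceptual and is not the READEX tool interface: it only illustrates the idea of a tuning model that maps application regions (scenarios grouped from runtime situations) to pre-tuned system configurations and switches between them at runtime. All region names, settings, and the apply_configuration helper are hypothetical.

```python
# Conceptual sketch only (not the READEX API): runtime situations found
# at design time are grouped into scenarios, each with a pre-tuned
# configuration; at runtime the tuning model selects the configuration
# for the region about to execute.

# Design-time result: scenario -> tuned system configuration (hypothetical values)
tuning_model = {
    "dense_solver":  {"core_freq_mhz": 2400, "omp_threads": 36},
    "halo_exchange": {"core_freq_mhz": 1800, "omp_threads": 8},   # memory/comm bound
    "io_phase":      {"core_freq_mhz": 1200, "omp_threads": 4},
}

def apply_configuration(cfg):
    """Placeholder: a real runtime would change frequency, thread count, etc."""
    print(f"switching to {cfg}")

def run_region(name, work):
    # Runtime: look up this region's scenario and switch before executing it.
    apply_configuration(tuning_model[name])
    work()

run_region("dense_solver", lambda: None)
run_region("halo_exchange", lambda: None)
```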

Video: Profiling Python Workloads with Intel VTune Amplifier

Paulius Velesko from Intel gave this talk at the ALCF Many-Core Developer Sessions. “This talk covers efficient profiling techniques that can help to dramatically improve the performance of code by identifying CPU and memory bottlenecks. We will demonstrate how to profile a Python application using Intel VTune Amplifier, a full-featured profiling tool.”
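As a toy illustration of the kind of workload such a profile exposes (not taken from the talk itself), the script below puts an obvious Python-level hotspot next to its vectorized NumPy equivalent; collecting a hotspots profile on it with VTune's command-line collector, for example something like `vtune -collect hotspots -- python hotspot_demo.py` (the exact command name varies with the VTune version installed), should attribute nearly all CPU time to the naive loop.

```python
# Toy workload for hotspot profiling: a naive Python loop versus a
# vectorized NumPy equivalent. A hotspots profile should show almost
# all CPU time inside naive_dot().
import numpy as np

def naive_dot(a, b):
    # Pure-Python loop: interpreter overhead dominates and shows up as a hotspot.
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

def vectorized_dot(a, b):
    # Same math pushed into optimized native code via NumPy.
    return float(np.dot(a, b))

if __name__ == "__main__":
    a = np.random.rand(2_000_000)
    b = np.random.rand(2_000_000)
    for _ in range(10):
        naive_dot(a, b)
        vectorized_dot(a, b)
```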