In this slidecast, Pavel Shamis from ORNL and Gilad Shainer from Mellanox announce UCX, the Unified Communication X framework. “UCX is a collaboration between industry, laboratories, and academia to create an open-source production grade communication framework for data centric and HPC applications.”
PNNL researchers are using supercomputers to take on two of the main challenges of exascale: energy efficiency and resiliency. Their simulations show that dynamic voltage scaling, also known as undervolting, can reduce power consumption, with existing mainstream resilience techniques applied at scale to improve system failure rates.
Today ISC announced that a research paper in the area of in-memory architecture, jointly submitted by a team of seven researchers representing the Jülich Supercomputing Centre (JSC), IBM Germany, and the IBM Watson Research Center in the US, has been selected to receive the inaugural Hans Meuer Award.
“The drive toward exascale computing, a renewed emphasis on data-centric processing, energy efficiency concerns, and the limitations of memory and I/O performance are all working to reshape High Performance Computing platforms. Many-core accelerators, flash storage, 3D memory, integrated networking, and optical interconnects are just some of the technologies propelling these future architectures. In concert with those developments, the HPC vendor landscape has been churning in response to broader market forces, and these events are going to drive some interesting changes in the coming year.”
“This talk will focus on programming models and their designs for upcoming exascale systems with millions of processors and accelerators. Current status and future trends of MPI and PGAS (UPC and OpenSHMEM) programming models will be presented. We will discuss challenges in designing runtime environments for these programming models by taking into account support for multi-core, high-performance networks, GPGPUs, Intel MIC, scalable collectives (multi-core-aware, topology-aware, and power-aware), non-blocking collectives using Offload framework, one-sided RMA operations, schemes and architectures for fault-tolerance/fault-resilience.”