Many scientific and technical applications rely on matrix multiplication somewhere in the algorithm, and therefore in the code. On today’s multicore CPUs, proper use of compiler directives can speed up matrix multiplies significantly.
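As a rough illustration of what such a directive can look like, here is a minimal sketch that uses an OpenMP `parallel for` pragma to spread the rows of the result across cores. The function name `matmul`, the row-major layout, and the choice of OpenMP are assumptions made for this example; the teaser above does not say which directives or which compiler the article covers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes C = A * B for square n-by-n matrices
 * stored in row-major order. Not taken from the referenced article. */
void matmul(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* Each outer iteration writes a distinct row of C, so the rows can be
     * computed in parallel without any synchronization. A signed loop
     * index is used for compatibility with older OpenMP versions. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        size_t row = (size_t)i * n;

        for (size_t j = 0; j < n; j++)
            c[row + j] = 0.0;

        /* i-k-j loop order walks B and C along contiguous rows. */
        for (size_t k = 0; k < n; k++) {
            double aik = a[row + k];
            for (size_t j = 0; j < n; j++)
                c[row + j] += aik * b[k * n + j];
        }
    }
}
```

With GCC or Clang the pragma takes effect only when OpenMP is enabled (for example with `-fopenmp`); otherwise the compiler ignores it and the loop runs serially with the same result. How much speedup this yields depends on matrix size, memory bandwidth, and core count.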
In this video from the 2015 HPC Advisory Council Switzerland Conference, Oded Paz presents: Special Training Session for HPC Systems Managers and Users: EDR InfiniBand, Multicast Operations (setup flow and diagnostic tools), Traffic Load Balancing, InfiniBand Quality Of Service, System Debugging, and open Q&A.
“As we see Moore’s Law alive and well, more and more parallelism is introduced into all computing platforms and on all levels of integration and programming to achieve higher performance and energy efficiency. We will discuss Multi- and Many-Core solutions for highly parallel workloads with general purpose and energy efficient technologies. We will also touch on the challenges and opportunities for parallel programming models, methodologies and software tools to achieve highly efficient and highly productive parallel applications. At the end we will take a brief look towards Exascale computing.”
“This talk will focus on programming models and their designs for upcoming exascale systems with millions of processors and accelerators. Current status and future trends of MPI and PGAS (UPC and OpenSHMEM) programming models will be presented. We will discuss challenges in designing runtime environments for these programming models by taking into account support for multi-core, high-performance networks, GPGPUs, Intel MIC, scalable collectives (multi-core-aware, topology-aware, and power-aware), non-blocking collectives using Offload framework, one-sided RMA operations, schemes and architectures for fault-tolerance/fault-resilience.”