Designating the appropriate provider for large MPI applications is critical to taking advantage of all of the compute power available. “A modern HPC system with multiple host CPUs and multiple coprocessors, such as the Intel Xeon Phi coprocessor, housed in numerous racks can be optimized for maximum application performance with intelligent thread placement.”
“This talk will introduce these three debugging techniques and provide some suggestions on selecting the optimal approach for a variety of debugging scenarios such as hangs, numerical errors, and crashes. Specific examples will be given using the TotalView debugger but the concepts covered may apply to other debugging tools such as GDB and the NVIDIA NSIGHT debugger.”
In this video, Rick Leinecker from Slashdot Media reviews the beta version of Intel Parallel Studio XE 2016. Leinecker describes several of the notable features and updates, including OpenMP enhancements, vastly improved computer vision and image processing, and the Data Analytics Acceleration Library.
Today ArrayFire announced the release of Version 3.0 of their high-speed software library for GPU computing. The new version features major changes to ArrayFire’s visualization library, a new CPU backend, and dense linear algebra for OpenCL devices. It also includes improvements across the board for ArrayFire’s OpenCL backend.
“This talk will discuss various strategies to make a serial Python code faster, for example using libraries like NumPy, or tools like Cython which compile Python code. The talk will also discuss the available tools for running Python in parallel, focusing on the mpi4py module which implements MPI (Message Passing Interface) in Python.”
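As a rough illustration of the first strategy the abstract mentions, the sketch below computes the same Euclidean norm twice: once with a plain Python loop and once with NumPy. It is not taken from the talk; the function names `norm_loop` and `norm_numpy` and the sample data are hypothetical and chosen only for this example.

```python
import math

import numpy as np


def norm_loop(values):
    """Euclidean norm computed with an interpreted Python loop."""
    total = 0.0
    for v in values:
        total += v * v
    return math.sqrt(total)


def norm_numpy(values):
    """Same computation, with the loop pushed into NumPy's compiled code."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.sqrt(np.dot(arr, arr)))


# Hypothetical sample data, used only to show that both versions agree.
data = [3.0, 4.0, 12.0]
loop_result = norm_loop(data)
numpy_result = norm_numpy(data)
```

On a three-element list the two versions are indistinguishable in speed; any benefit from the NumPy version would be expected only on large arrays, where the per-element work runs in compiled code rather than in the interpreter.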