The performance of distributed memory MPI applications on the latest highly parallel multi-core processors often turns out to be lower than expected. Which is why hybrid applications using OpenMP multithreading on each node and MPI across nodes in a cluster are becoming more common. This sponsored post from Intel, written by Richard Friedman, depicts how to boost performance for hybrid applications with multiple endpoints in the Intel MPI Library.