Heterogeneous MPI Application Optimization


“Two components of ITAC, the Intel Trace Collector and the Intel Trace Analyzer, can be used to understand the performance and bottlenecks of a Monte Carlo simulation. When the strike prices are distributed across both the Intel Xeon cores and the Intel Xeon Phi coprocessor, the efficiency was about 79%, as the coprocessors can calculate the results much faster than the main CPU cores.”
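
The efficiency figure in the teaser reflects load imbalance: if work is split evenly, faster workers finish early and sit idle. A minimal sketch of that arithmetic follows, in plain Python with no MPI. The worker speeds and item counts are hypothetical and are not taken from the article, and the two function names are made up for illustration.

```python
def parallel_efficiency(work_counts, speeds):
    """Fraction of total worker time spent busy.

    work_counts[i] is the number of items given to worker i and
    speeds[i] is its (hypothetical) throughput in items per second.
    """
    times = [count / speed for count, speed in zip(work_counts, speeds)]
    makespan = max(times)  # everyone waits for the slowest worker
    return sum(times) / (len(times) * makespan)


def split_proportional(total, speeds):
    """Split `total` items in proportion to worker speed (largest remainder)."""
    total_speed = sum(speeds)
    shares = [total * s / total_speed for s in speeds]
    counts = [int(x) for x in shares]
    leftover = total - sum(counts)
    # Hand any remaining items to the workers with the largest fractional share.
    order = sorted(range(len(speeds)),
                   key=lambda i: shares[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical setup: one slow worker and one worker four times faster.
speeds = [1.0, 4.0]
even = [100, 100]
balanced = split_proportional(200, speeds)

print(parallel_efficiency(even, speeds))      # even split leaves the fast worker idle
print(parallel_efficiency(balanced, speeds))  # speed-weighted split
```

With an even split the fast worker waits on the slow one, so efficiency drops; weighting the split by speed removes the idle time in this idealized model.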

New Intel® Omni-Path White Paper Details Technology Improvements

Rob Farber

The Intel Omni-Path Architecture (Intel® OPA) white paper walks through the many improvements that Intel OPA technology provides to the HPC community. In particular, HPC readers will appreciate how collective operations can be optimized based on message size, communicator size, and topology using the point-to-point send and receive primitives.
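
As one illustration of a collective built from point-to-point sends, the sketch below generates the send schedule for a binomial-tree broadcast from rank 0, a textbook pattern commonly used for small messages. It is plain Python rather than MPI code, it is not taken from the white paper, and it does not describe how Intel OPA implements collectives; the function name is hypothetical.

```python
def binomial_broadcast_schedule(n_ranks):
    """Return the rounds of (source, destination) sends for a broadcast from rank 0.

    In each round, every rank that already holds the data sends it to one
    rank that does not, so the number of holders doubles per round.
    """
    rounds = []
    have = 1  # ranks 0 .. have-1 currently hold the data
    while have < n_ranks:
        sends = [(src, src + have) for src in range(have) if src + have < n_ranks]
        rounds.append(sends)
        have *= 2
    return rounds


for step, sends in enumerate(binomial_broadcast_schedule(8)):
    print(step, sends)
```

For 8 ranks this takes 3 rounds, compared with 7 sequential sends if the root contacted every rank itself. For large messages, libraries often switch to other patterns, which is where tuning by message size comes in.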

Titan Supercomputer Powers the Future of Forecasting


Knowing how the weather will behave in the near future is indispensable for countless human endeavors. Now, researchers at ECMWF are leveraging the computational power of the Titan supercomputer at Oak Ridge to improve weather forecasting.

Altair Launches PBS Pro 13

Today Altair announced the general availability of PBS Professional 13.0, the latest version of the market-leading software product for high-performance computing workload management and job scheduling on clusters and supercomputers.

Computing With MPI in Heterogeneous Environments


Designating the appropriate provider for large MPI applications is critical to taking advantage of all of the compute power available. “A modern HPC system with multiple host CPUs and multiple coprocessors, such as the Intel Xeon Phi coprocessor, housed in numerous racks can be optimized for maximum application performance with intelligent thread placement.”

Concurrent Kernel Offloading


“Combining a host CPU such as an Intel Xeon with a dedicated coprocessor such as the Intel Xeon Phi coprocessor has been shown in many cases to improve application performance significantly. When the datasets are large enough, it makes sense to offload as much of the workload as possible. But is this the case when the potential offload datasets are not as large?”

Numerical Optimization for Deep Learning


“With the advent of massively parallel computing coprocessors, numerical optimization for deep-learning disciplines is now possible. Complex real-time pattern recognition, for example, which can be used for self-driving cars and augmented reality, can be developed and high performance achieved with the use of specialized, highly tuned libraries. By just using the Message Passing Interface (MPI) API, very high performance can be attained on hundreds to thousands of Intel Xeon Phi processors.”

RCE Podcast: Jonathan Dursi on “HPC is dying, and MPI is killing it”

Jonathan Dursi

In this RCE Podcast, Brock Palen and Jeff Squyres speak with Jonathan Dursi about his recent editorial entitled “HPC is dying, and MPI is killing it.” The article drew a lot of attention and sparked good discussion in our community.

Interview: AutoTune – Automated Optimization and Tuning

Michael Gerndt, Technische Universität München

The main goal of AutoTune is the automatic optimization of HPC applications, targeting both performance and energy efficiency. In this interview, Michael Gerndt from the Technische Universität München tells us more about the project.

Come to Portland! MPI 3.1 is Just Around the Corner

Jeff Squyres

Over at Cisco’s High Performance Computing Networking Blog, Jeff Squyres writes that MPI 3.1 is coming soon.