Sign up for our newsletter and get the latest HPC news and analysis.
Send me information from insideHPC:

Can FPGAs Help You?

FPGAs will become increasing important for organizations that have a wide range of applications that can benefit from performance increases. Rather than a brute force method to increasing performance in a data center by purchasing and maintaining racks of hardware and associated costs, FPGAs may be able to equal and exceed the performance of additional servers, while reducing costs as well.

SC16 to Showcase Latest Advances in HPC

SC16 returns to Salt Lake City on Nov. 13-18. The Six-day supercomputing event features internationally-known expert speakers, cutting-edge workshops and sessions, a non-stop student competition, the world’s largest supercomputing exhibition,panel discussions and much more. “No other annual event showcases the revolutionary advances and possibilities of high performance computing than the annual ACM/IEEE International Conference for High Performance Computing, Networking, Data Storage Analysis. From the impact of HPC on the future of medicine, to its transformative power in developing countries and “smart cities.” SC is the premiere venue for presenting leading-edge HPC research.”

Accelerating Finite Element Analysis with Intel Xeon Phi

With the introduction of the Intel Scalable System Framework, the Intel Xeon Phi processor can speed up Finite Element Analysis significantly. Using highly tuned math libraries such as the Intel Math Kernel Library (Intel MKL), FEA applications can execute math routines in parallel on the Intel Xeon Phi processor.

Video: SIMD, Vectorization, and Performance Tuning

James Reinders presented this talk at the 2016 Argonne Training Program on Extreme-Scale Computing. Reinders is the author of multiple books on parallel programming. His most recent book, entitled Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition 2nd Edition, was co-authored by James Jeffers and Avinash Sodani.

Machine Learning and the Intel Xeon Phi Processor

“With up to 72 processing cores, the Intel Xeon Phi processor x200 can accelerate applications tremendously. Each core contains two Advanced Vector Extensions, which speeds up the floating point performance. This is important for machine learning applications which in many cases use the Fused Multiply-Add (FMA) instruction.”

Taming Heterogeneity in HPC – The DEEP-ER take

Norbert Eicker from the Jülich Supercomputing Centre presented this talk at the SAI Computing Conference in London. “The ultimate goal is to reduce the burden on the application developers. To this end DEEP/-ER provides a well-accustomed programming environment that saves application developers from some of the tedious and often costly code modernization work. Confining this work to code-annotation as proposed by DEEP/-ER is a major advancement.”

Intel Xeon Phi Coprocessor Design

“The major functionality of the Intel Xeon Phi coprocessor is a chip that does the heavy computation. The current version utilizes up to 16 channels of GDDR5 memory. An interesting notes is that up to 32 memory devices can be used, by using both sides of the motherboard to hold the memory. This doubles the effective memory availability as compared to more conventional designs.”

Intel Xeon Phi Coprocessor Architecture

“High performance systems now typically a host processor and a coprocessor. The role of the coprocessor is to provide the developer and the user the ability to significantly speed up simulations if the algorithm that is used can run with a high degree of parallelization and can take advantage of an SIMD architecture. The Intel Xeon Phi coprocessor is an example of a coprocessor that is used in many HPC systems today.”

Using Libraries in Offload Mode

The ability to develop applications independent of the hardware availability at run time is a very important concept that enables developers to take advantage of the latest and greatest processing and coprocessing power. Without having to make run time checks on hardware availability is critical to a smooth running HPC environment.

Offloading vs Native Execution on Intel Xeon Phi Coprocessors

“Native execution is good for application that are performing operations that map to parallelism either in threads or vectors. However, running natively on the coprocessor is not ideal when the application must do a lot of I/O or runs large parts of the application in a serial mode. Offloading has its own issues. Asynchronous allocation, copies, and the deallocation of data can be performed but it complex. Another challenge with offloading is that it requires memory blocking. Overall, it is important to understand the application, the workflow within the application and how to use the Intel Xeon Phi coprocessor most effectively.”