With the introduction of the Intel Scalable System Framework, the Intel Xeon Phi processor can speed up Finite Element Analysis significantly. Using highly tuned math libraries such as the Intel Math Kernel Library (Intel MKL), FEA applications can execute math routines in parallel on the Intel Xeon Phi processor.
James Reinders presented this talk at the 2016 Argonne Training Program on Extreme-Scale Computing. Reinders is the author of multiple books on parallel programming. His most recent book, entitled Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition 2nd Edition, was co-authored by James Jeffers and Avinash Sodani.
“With up to 72 processing cores, the Intel Xeon Phi processor x200 can accelerate applications tremendously. Each core contains two Advanced Vector Extensions (AVX-512) vector units, which speed up floating-point performance. This is important for machine learning applications, which in many cases use the Fused Multiply-Add (FMA) instruction.”
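To make the FMA remark concrete, here is a minimal sketch in C of the kind of loop it refers to. The function name dot is hypothetical and does not come from the talk. A dot product, which sits at the core of many machine learning kernels, is a chain of multiply-adds, and on hardware with FMA support a compiler may (depending on its flags) fuse each step into one instruction and vectorize the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration: dot product of two vectors.
 * Each iteration has the shape acc = x[i] * y[i] + acc, which is the
 * multiply-add pattern that an FMA instruction computes in one step. */
double dot(const double *x, const double *y, size_t n)
{
    /* Precondition: non-empty inputs must point at valid arrays. */
    assert(n == 0 || (x != NULL && y != NULL));

    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        acc = x[i] * y[i] + acc;  /* one multiply-add per element */
    }
    return acc;
}
```

Whether the compiler actually emits fused instructions depends on the target architecture and the optimization and floating-point options in use, so this shows the pattern only and guarantees nothing about the generated code.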
Norbert Eicker from the Jülich Supercomputing Centre presented this talk at the SAI Computing Conference in London. “The ultimate goal is to reduce the burden on the application developers. To this end DEEP/-ER provides a well-accustomed programming environment that saves application developers from some of the tedious and often costly code modernization work. Confining this work to code-annotation as proposed by DEEP/-ER is a major advancement.”
“At the core of the Intel Xeon Phi coprocessor is a chip that does the heavy computation. The current version utilizes up to 16 channels of GDDR5 memory. An interesting note is that up to 32 memory devices can be used, by using both sides of the motherboard to hold the memory. This doubles the effective memory availability compared with more conventional designs.”
“High performance systems now typically include a host processor and a coprocessor. The role of the coprocessor is to give the developer and the user the ability to significantly speed up simulations if the algorithm that is used can run with a high degree of parallelization and can take advantage of a SIMD architecture. The Intel Xeon Phi coprocessor is an example of a coprocessor that is used in many HPC systems today.”
The ability to develop applications independent of hardware availability at run time is an important concept that lets developers take advantage of the latest processing and coprocessing power. Not having to make run-time checks on hardware availability is critical to a smooth-running HPC environment.
“Native execution is good for applications that perform operations that map to parallelism either in threads or vectors. However, running natively on the coprocessor is not ideal when the application must do a lot of I/O or runs large parts of the application in a serial mode. Offloading has its own issues. Asynchronous allocation, copies, and the deallocation of data can be performed, but it is complex. Another challenge with offloading is that it requires memory blocking. Overall, it is important to understand the application, the workflow within the application and how to use the Intel Xeon Phi coprocessor most effectively.”
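The memory blocking mentioned above can be sketched in plain C. This is an illustration under stated assumptions, not the offload mechanism itself: no real offload API is used, the staging buffer merely stands in for the coprocessor's limited memory, and the names scale_kernel and scale_blocked are hypothetical. The point is the shape of the loop: copy a block in, compute on it, copy the result out, and repeat.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a kernel that would run on the coprocessor. */
static void scale_kernel(double *buf, size_t count, double factor)
{
    for (size_t i = 0; i < count; ++i) {
        buf[i] *= factor;
    }
}

/* Scale n values in blocks of at most `block` elements, staging each
 * block through a small buffer (a stand-in for device memory).
 * Returns the number of blocks processed; 0 if n is 0 or the staging
 * allocation fails. */
size_t scale_blocked(const double *in, double *out, size_t n,
                     size_t block, double factor)
{
    assert(block > 0);

    double *staging = malloc(block * sizeof *staging);
    if (staging == NULL) {
        return 0;
    }

    size_t blocks = 0;
    for (size_t start = 0; start < n; start += block) {
        /* The last block may be shorter than the others. */
        size_t count = (n - start < block) ? (n - start) : block;

        memcpy(staging, in + start, count * sizeof *staging);   /* copy in  */
        scale_kernel(staging, count, factor);                   /* compute  */
        memcpy(out + start, staging, count * sizeof *staging);  /* copy out */
        ++blocks;
    }

    free(staging);
    return blocks;
}
```

In a real offload setting the copies and the kernel would go through the offload runtime and could overlap asynchronously, which is where the complexity described in the quote comes from. The block size would be chosen to fit the memory available on the coprocessor.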
For decades, Intel has been enabling insight and discovery through its technologies and contributions to parallel computing and High Performance Computing (HPC). Central to the company’s most recent work in HPC is a new design philosophy for clusters and supercomputers called Intel® Scalable System Framework (Intel® SSF), an approach designed to enable sustained, balanced performance as the community pushes towards the Exascale era.
In this video from ISC 2016, Steve Branton from Asetek describes the company’s innovative liquid cooling solutions for HPC. “Because liquid is 4,000 times better at storing and transferring heat than air, Asetek’s solutions provide immediate and measurable benefits to large and small data centers alike. RackCDU D2C is a “free cooling” solution that captures between 60% and 80% of server heat, reducing data center cooling cost by over 50% and allowing 2.5x-5x increases in data center server density. D2C removes heat from CPUs, GPUs, and memory modules within servers using water as hot as 40°C (104°F), eliminating the need for chilling to cool these components.”