Applications that use 3D Finite Difference (3DFD) calculations are numerically intensive and can be optimized quite heavily to take advantage of accelerators that are available in today’s systems. The performance of an implementation can and should be optimized using numerical stencils. Choices made when designing and implementing algorithms can affect the Arithmetic Intensity (AI), which measures how efficient an implementation is by comparing the flops it performs with the memory it accesses.
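To make the idea of Arithmetic Intensity concrete, here is a minimal sketch in Python. It is not taken from the article: the 7-point stencil, the coefficient names `c0` and `c1`, the helper names, and the 4-byte single-precision values are all assumptions chosen for illustration. It applies the stencil to a small grid and estimates AI as flops per byte moved under two hypothetical memory-traffic scenarios.

```python
def stencil_7pt(u, n, c0, c1):
    """Apply a 7-point stencil to the interior of an n*n*n grid.

    The grid is stored as a flat list; boundary values are copied unchanged.
    The stencil shape and coefficients are illustrative assumptions.
    """
    def idx(i, j, k):
        return (i * n + j) * n + k

    out = list(u)
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                # Sum of the six face neighbors: 5 additions.
                neighbors = (u[idx(i - 1, j, k)] + u[idx(i + 1, j, k)]
                             + u[idx(i, j - 1, k)] + u[idx(i, j + 1, k)]
                             + u[idx(i, j, k - 1)] + u[idx(i, j, k + 1)])
                # 2 multiplications and 1 addition: 8 flops per point in total.
                out[idx(i, j, k)] = c0 * u[idx(i, j, k)] + c1 * neighbors
    return out


def arithmetic_intensity(flops_per_point, reads_per_point, writes_per_point,
                         bytes_per_value=4):
    """Estimate arithmetic intensity in flops per byte of memory traffic."""
    bytes_moved = (reads_per_point + writes_per_point) * bytes_per_value
    return flops_per_point / bytes_moved


FLOPS_7PT = 8  # 5 adds for the neighbors, 2 multiplies, 1 final add

# Assumed worst case: every one of the 7 inputs comes from main memory.
ai_no_reuse = arithmetic_intensity(FLOPS_7PT, reads_per_point=7, writes_per_point=1)

# Assumed best case: perfect cache reuse, so each value is read once.
ai_ideal = arithmetic_intensity(FLOPS_7PT, reads_per_point=1, writes_per_point=1)
```

Under these assumptions the same stencil ranges from 0.25 to 1.0 flops per byte depending only on how well neighboring values are reused, which is why implementation choices such as blocking can change the AI without changing the arithmetic.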
“OpenCL is a fairly new programming model that is designed to help programmers get the most out of a variety of processing elements in heterogeneous environments. Many benchmarks that are available have demonstrated that excellent performance can be obtained over a wide variety of devices. Rather than lock an application into one specific accelerator, by using OpenCL, applications can be run on a number of different architectures with each showing excellent speedups over a native (host CPU) implementation.”
Today Cray announced that the Danish Meteorological Institute (DMI) has purchased a Cray XC supercomputer and a Cray Sonexion 2000 storage system. Through an arrangement with the Icelandic Meteorological Office (IMO), the system will be installed at the IMO datacenter in Reykjavik, Iceland for year-round power and cooling efficiency.
Today Colfax International announced free online workshops on parallel programming and optimization for Intel architecture, including Intel Xeon processors and Intel Xeon Phi coprocessors. “The Hands-on Workshop (HOW) series will introduce best practices to researchers and developers to efficiently extract maximum performance out of modern parallel processors, achieving shorter time to solution, higher research productivity, and future-proof design.”
The Embree kernel approach, using the Intel Xeon Phi coprocessor, is applicable to many situations. The implementation can be tuned to the hardware available, using different vector widths and workloads per ray. With a flexible toolkit for rendering, applications can take advantage of the latest hardware acceleration to achieve maximum performance.
“An expanding area of work both on the hardware front and the software side is to modify and optimize applications to run on both the host processor and a coprocessor. Many techniques to transform applications to reduce runtime have been discussed and implemented across a wide variety of applications.”