QCD Optimization on Intel Xeon Phi


Quantum chromodynamics (QCD) is the theory of the strong nuclear force. Applications based on this theory are used heavily in the study of nuclear interactions. A significant percentage of the total time on US Department of Energy computer systems at the National Energy Research Scientific Computing Center (NERSC) is spent running QCD calculations. Lattice QCD calculations can be used for some Grand Challenge problems in HPC as well as in better-known areas such as nuclear and high energy physics.

Many optimizations can be applied to a QCD-based application, and such an application can also take advantage of the Intel Xeon Phi coprocessor. With pre-fetching, SMT, and other optimizations, combined with the Intel Xeon Phi coprocessor, the performance gains were significant. In an initial single-precision test on the baseline Sandy Bridge system, the test case reached about 128 Gflops. With the Intel Xeon Phi coprocessor, performance jumped to over 320 Gflops.
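To put the two figures side by side, the short sketch below computes the implied speedup. It uses only the numbers quoted in the text. Because the baseline is stated as "about 128" and the coprocessor result as "over 320", the result is best read as an approximate lower bound, not a measured ratio.

```python
# Figures quoted in the article (single precision).
baseline_gflops = 128.0  # baseline Sandy Bridge system, "about 128 Gflops"
phi_gflops = 320.0       # with Intel Xeon Phi coprocessor, "over 320 Gflops"

# Speedup is the ratio of the new throughput to the baseline throughput.
speedup = phi_gflops / baseline_gflops

print(f"Implied speedup: at least about {speedup:.1f}x")
```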

A number of lessons can be learned from working with a complex code such as a QCD application. It is important to understand whether the application is memory bound or compute bound. Other optimizations should also be considered, such as vectorization, pre-fetching, blocking, and synchronization. It is also important to experiment with various vector lengths to determine which performs best.
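One common way to judge whether a kernel is memory bound or compute bound is a roofline-style estimate: compare the kernel's arithmetic intensity (flops per byte of memory traffic) with the machine balance (peak flops divided by peak memory bandwidth). The sketch below illustrates the idea. It is not the method used in the work described here, and every number in it (peak rate, bandwidth, kernel flop and byte counts) is a hypothetical placeholder, not a measurement of any real processor or QCD kernel.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """Flops performed per byte of memory traffic."""
    return flops / bytes_moved


def machine_balance(peak_gflops: float, peak_bandwidth_gbs: float) -> float:
    """Flops per byte the hardware can sustain at peak on both axes."""
    return peak_gflops / peak_bandwidth_gbs


def attainable_gflops(intensity: float, peak_gflops: float,
                      peak_bandwidth_gbs: float) -> float:
    """Roofline bound: limited by either compute peak or memory bandwidth."""
    return min(peak_gflops, intensity * peak_bandwidth_gbs)


def classify(intensity: float, peak_gflops: float,
             peak_bandwidth_gbs: float) -> str:
    """Memory bound if the kernel's intensity is below the machine balance."""
    if intensity < machine_balance(peak_gflops, peak_bandwidth_gbs):
        return "memory bound"
    return "compute bound"


# Hypothetical machine and kernel numbers, chosen only for illustration.
PEAK_GFLOPS = 2000.0        # assumed peak single-precision rate
PEAK_BANDWIDTH_GBS = 200.0  # assumed peak memory bandwidth in GB/s
kernel_intensity = arithmetic_intensity(flops=1000.0, bytes_moved=1000.0)

print(classify(kernel_intensity, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS))
print(attainable_gflops(kernel_intensity, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS))
```

Under this kind of estimate, a memory-bound kernel tends to benefit most from reducing or hiding memory traffic (blocking, pre-fetching), while a compute-bound kernel tends to benefit most from better vectorization.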

Source: US Department of Energy, USA; Intel, USA; Intel, India
