A coprocessor may advertise a certain number of cores for HPC processing, yet in practice still reserve cores to run an operating system or other software needed in a heterogeneous architecture. On the Intel Xeon Phi coprocessor, 60 cores are commonly used for computation, while a 61st core is available but not traditionally used as part of a simulation.
Experiments that used the 61st core for computation while running a reverse Monte Carlo ray tracing application for modeling radiative heat transfer demonstrated that the additional core improved performance, and that oversubscribing the coprocessor operating system thread did not degrade performance. The experiments were performed on a dual-socket system with 16 cores per socket. The Intel Xeon Phi coprocessor contained 61 cores and 16 GB of memory.
Several thread affinity patterns were set up to determine the best arrangement for this application. The patterns investigated were:
- Compact – bind the execution threads incrementally across logical cores and then physical cores.
- None – allow threads to run anywhere.
- Scatter – bind the threads first over the physical cores and then the logical cores.
- Selective – bind the control thread to the 61st physical core.
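The difference between these patterns can be illustrated with a small placement model. The following is a minimal sketch, not the code used in the experiments: it assumes 61 physical cores with 4 hardware threads each, numbers cores from 0 (so the 61st core is index 60), and the function names `compact`, `scatter`, and `selective` are hypothetical. It also assumes that, under the selective pattern, the worker threads are scattered over the remaining 60 cores, which the description above does not specify.

```python
# Simplified model of thread placement; each placement is (core, hw_thread).
CORES = 61             # physical cores on the coprocessor (from the text)
THREADS_PER_CORE = 4   # assumption: 4 hardware threads per physical core


def compact(n_threads, cores=CORES, tpc=THREADS_PER_CORE):
    """Fill all hardware threads of one core before moving to the next core."""
    return [((i // tpc) % cores, i % tpc) for i in range(n_threads)]


def scatter(n_threads, cores=CORES, tpc=THREADS_PER_CORE):
    """Place one thread on each core first, then wrap to the next hardware thread."""
    return [(i % cores, (i // cores) % tpc) for i in range(n_threads)]


def selective(n_workers, cores=CORES, tpc=THREADS_PER_CORE):
    """Pin the control thread to the last core; scatter workers over the rest.

    Scattering the workers is an assumption made for this illustration.
    """
    control = (cores - 1, 0)
    workers = scatter(n_workers, cores - 1, tpc)
    return control, workers


if __name__ == "__main__":
    print("compact  :", compact(6))
    print("scatter  :", scatter(6))
    control, workers = selective(3)
    print("selective: control on", control, "workers on", workers)
```

In this model, the first four threads under `compact` all share core 0, whereas under `scatter` the 62nd thread is the first to share a core with another thread. With the "none" pattern no such mapping exists, since the operating system is free to place and migrate threads.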
Results from large-scale simulations showed that performance was essentially the same across the different affinity patterns, with minimal variability. The thread placement strategy therefore had little effect on performance for this type of application. However, using the 61st core did improve performance slightly, suggesting that this is an area for further investigation. Note that the simulations were run natively on the Intel Xeon Phi coprocessor. When operating in offload mode, Intel guidance is not to use the 61st core for the actual simulations.
Source: SCI Institute, Utah, USA