Integrating an Intel Xeon Phi Cluster


While the pace of innovation in CPUs and coprocessors continues, the systems themselves must still be installed in a traditional rack, complete with cabling, networking, and storage. Once the hardware is in place, the real work begins: getting this rack full of silicon actually working and producing results for its users.

At the National Institute for Computational Sciences (NICS), a joint venture of the University of Tennessee and Oak Ridge National Laboratory, a team set out to learn how to integrate the Intel Xeon Phi coprocessor into cluster configurations.

There are a number of steps to ensure that a cluster is ready for production. A few of the more important processes, at a high level:

  • Preparation of the System – Install the OS and the OFED stack.
  • Installation of the Intel MPSS Stack – Download the Intel MPSS software and save it to an NFS share. As of MPSS version 3.2.3, a stock Red Hat or CentOS kernel is assumed.
  • Creation of Configuration Files – Use the Intel-provided micctrl command to generate these files and to manage the coprocessors.
  • Setup of Resource and Workload Managers – To ensure smooth and efficient operation of the cluster, it is important to use a distributed resource manager (DRM).
  • User Software Integration – Make sure that all applications have been compiled and tuned for the Intel Xeon Phi coprocessors.
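As a rough sketch of the MPSS installation and configuration steps above, the sequence on a Red Hat-style host might look like the following. The NFS path and MPSS version are illustrative assumptions, not taken from the source:

```shell
# Hedged sketch: install the MPSS host packages from an NFS share,
# then generate default coprocessor configuration files with micctrl.
sudo yum install -y /nfs/shared/mpss-3.2.3/*.rpm   # host-side MPSS stack
sudo micctrl --initdefaults    # create default config files for each coprocessor
sudo service mpss start        # boot the coprocessor cards
micctrl -s                     # each micN card should report "online"
```

From here, the generated files under /etc/mic can be adjusted (users, networking) before the resource manager is layered on top.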

While the advantages of using the Intel Xeon Phi coprocessors are clear, integration into a cluster environment adds some complexity. Additional burdens relate to system setup, system administration, and application management. By learning from other installations, a best-practices methodology can be adopted, leading to faster startup and lowering the barriers to entry.

Source: University of Tennessee, Knoxville, NICS, Intel Corporation, USA

  1. Yeah, I’m in the middle of the OFED+MPSS installation now; my cluster is
    already working as separate machines. Actually, I’ve built two for the price of 1/2.

    Cluster art is 16 nodes of: i7-5820K OC’d to 4GHz + Phi (31S1P) + Nvidia GTX 970. [Mobos are ASRock X99M WS; I needed a special BIOS (“above 4GB decoding”) for them to work with the Phis. Mem is only 8GB 2800 DDR4 – don’t need more; and I have DDR InfiniBand controllers.] Cost is low, slightly more than $2k per node.

    Cluster sci-phi is 6 nodes of: cpu + 4*Phi (31S1P).
    Mobo is P77Z WS; cpu is i7-2600 with built-in VGA graphics (so I can leave four x8 PCIe slots empty). Cost is extra low, not much more than $1k per node.
    My Phis, from the Intel sale of Tianhe-2 surplus parts, cost me $125 each,
    and are calculating CFD on 400x400x400 grids faster than the overclocked 6-core Haswells I have in the art cluster, so the project will definitely reach its design goal of doing quick 1000x1000x1000 astro fluid simulations. Nice!
    The step I’m in now is the most challenging for me as an astrophysicist, but hopefully doable. You need to read a lot to figure out what cluster configuration you want and what is feasible.