Megware to Install CooLMUC 3 Supercomputer at LRZ in Germany


Axel Auweter is Head of HPC Development at MEGWARE

The Leibniz Supercomputing Center (LRZ) has announced plans to acquire the third generation of the CooLMUC computer cluster for its research center in Garching. This is the second time that MEGWARE has supplied a CooLMUC system. The CooLMUC systems stand out for their ultra-high energy efficiency.

“The new CooLMUC 3 system outperforms its predecessors in a number of respects, including its key feature: all compute and login nodes, power supply units, and Omni-Path switches are cooled directly with hot water in thermally insulated racks, a combination the likes of which has never been seen before,” said Axel Auweter, Head of HPC Development at MEGWARE. “Even at a cooling water temperature of 40 degrees Celsius and a room temperature of 25 degrees Celsius, a maximum of just 3% of the waste heat is released into the ambient air. We’ve carried out a great deal of development work and are now more ready than ever to supply highly efficient, environmentally friendly HPC technologies globally. Moreover, with CooLMUC 3, it is not only compute nodes with standard processors that are cooled directly, but also the latest generation of Intel Xeon Phi, which stands out for an integrated fabric and is directly interconnected via the Intel Omni-Path high-performance network.”

The hot-water-cooled system was developed at MEGWARE’s technology development center in Chemnitz and is based on the company’s own hardware and software, substantiating MEGWARE’s claim to be a global leader in highly energy-efficient high-performance computers. With the first generation of CooLMUC, MEGWARE proved back in 2011 that it is possible to cool various processor technologies directly with hot water. In addition, an absorption chiller was used to turn residual heat into process cooling, which in turn cooled existing servers in additional racks.

In the first development stage of CooLMUC 3, all 148 nodes in the newly developed MEGWARE SlideSX-LC chassis are equipped with an Intel Knights Landing (KNL) Xeon Phi 7210-F CPU (64 cores, 16 GB MCDRAM), 96 GB of main memory (6x 16 GB DDR4-2133), and 2x 100 Gbit/s Omni-Path ports. The Intel KNL processors can boot themselves, which means that they do not require a separate host processor and have unhindered access to the main memory.
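As a rough illustration of what these per-node figures add up to, the following minimal sketch multiplies them out across the 148 nodes. The totals are simple arithmetic on the numbers quoted above, not figures published by MEGWARE or LRZ, and the variable names are chosen only for this example.

```python
# Per-node figures as quoted in the article.
NODES = 148
CORES_PER_NODE = 64        # Xeon Phi 7210-F
MCDRAM_GB_PER_NODE = 16    # on-package MCDRAM
DIMMS_PER_NODE = 6
DIMM_GB = 16               # DDR4-2133 modules

# Main memory per node: 6 x 16 GB = 96 GB, matching the article.
ddr4_gb_per_node = DIMMS_PER_NODE * DIMM_GB

# System-wide totals (derived here, not stated in the article).
total_cores = NODES * CORES_PER_NODE
total_ddr4_gb = NODES * ddr4_gb_per_node
total_mcdram_gb = NODES * MCDRAM_GB_PER_NODE

print(f"DDR4 per node:  {ddr4_gb_per_node} GB")
print(f"Total cores:    {total_cores}")
print(f"Total DDR4:     {total_ddr4_gb} GB")
print(f"Total MCDRAM:   {total_mcdram_gb} GB")
```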

In the new version of the SlideSX-LC chassis, 10 compute nodes in 6 rack units are 100% directly water cooled and are fed by a central power supply with up to 5 power supply units. The compute nodes are connected to the Omni-Path fabric via the HFI integrated in the CPU, which provides 2 x 100 Gbit/s ports, and the entire fabric is interconnected with a blocking factor of 1:2. Within each group of 32 nodes (one group per 48-port edge switch), the bandwidth is non-blocking at 200 Gbit/s; across all 148 nodes, it amounts to 100 Gbit/s per port.

In recent years, MEGWARE has continued to develop its ColdCon technology, which it has successfully installed at various computing centers in Europe. At the research center in Garching, a new rack cooling unit, the new hydraulic unit, will be integrated in each rack. It will provide more efficient separation of the water that carries heat from the IT equipment for heating the building, as well as more accurate measurement of temperature and heat quantity.
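The fabric figures above can be checked with a short calculation. The sketch below assumes one possible reading of the article: on each 48-port edge switch, 32 ports face compute nodes and the remaining 16 serve as uplinks, which would give the quoted 1:2 blocking factor. That port split is an assumption made for this example and is not confirmed by the article; the function `fabric_summary` and its parameter names are likewise hypothetical.

```python
import math


def fabric_summary(nodes, nodes_per_group, switch_ports, ports_per_node, port_gbit):
    """Estimate group count and bandwidth under an assumed edge-switch port split."""
    # Assumption: every switch port not facing a node is an uplink.
    uplinks = switch_ports - nodes_per_group
    # Downlinks per uplink, e.g. 32 / 16 = 2.0 for a 1:2 blocking factor.
    blocking = nodes_per_group / uplinks
    # Injection bandwidth per node from its integrated HFI ports.
    injection = ports_per_node * port_gbit
    return {
        "groups": math.ceil(nodes / nodes_per_group),
        "uplinks_per_switch": uplinks,
        "blocking_ratio": blocking,
        "in_group_gbit_per_node": injection,
        "cross_group_gbit_per_node": injection / blocking,
    }


# Figures quoted in the article: 148 nodes, groups of 32, 48-port switches,
# 2 x 100 Gbit/s ports per node.
summary = fabric_summary(148, 32, 48, 2, 100)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under this reading, the 148 nodes fall into five groups, and halving the 200 Gbit/s in-group bandwidth gives the 100 Gbit/s figure the article quotes for traffic across the whole system.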

The system will be protected by MEGWARE’s ClustSafe PDUs, each with 12 ports, 10 A of output, and a calibrated power measurement per port, and will be monitored using MEGWARE’s ClustWare management software.

The new computer will cost approximately €1 million, and is being financed by the German Research Foundation (DFG) and the Free State of Bavaria. The computing power will be available to academic users throughout Bavaria.
