Immersion Cooling Steps Up for HPC Clusters

Robert Roe

Over at Scientific Computing World, Robert Roe writes that immersion cooling is getting traction in the HPC market.

The race is hotting up for immersion cooling in HPC, with two major announcements concerning submerged HPC clusters. A consortium of Austrian research organizations has ordered a high-performance cluster from ClusterVision that will use Green Revolution Cooling’s mineral-oil submersion cooling solution. At the same time, 3M announced a “proof of concept” submerged HPC cluster, in collaboration with Intel and SGI, that uses two-phase immersion cooling technology.

The ClusterVision machine will be used by Austrian research organizations who are collaborating on the Vienna Scientific Cluster (VSC-3) project. ClusterVision joined forces with Green Revolution Cooling to design the first ever skinless supercomputer, by removing the chassis and unnecessary metal parts that would obstruct oil flow and add to the initial investment required for the server.

According to ClusterVision, the solution almost eliminates the power consumed by air cooling, reducing it to just 5 per cent of a conventional system’s consumption, which in turn cuts total energy consumption at the data centre roughly in half. This approach can also reduce capital expenditure by eliminating the expensive cooling equipment typically used in conventional data centres, such as chillers and HVAC units, and by removing the need for architectural modifications such as raised flooring and airflow aisles. A further saving comes from reduced current leakage at the processor level in the submerged solution, resulting in less wasted server power.
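As a back-of-envelope check, the two figures in the claim are consistent if cooling accounts for roughly as much power as the IT load itself in a conventional facility (a PUE of about 2.0 — an assumption for illustration, not a figure from the article):

```python
# Hedged sketch: sanity-check the claim that cutting air-cooling power
# to 5% of a conventional system's level roughly halves total energy use.
# Assumption (not from the article): conventional PUE ~2.0, i.e. cooling
# draws about as much power as the IT load itself.

it_load_kw = 100.0                 # arbitrary IT load for illustration
conventional_cooling_kw = 100.0    # assumed equal to IT load (PUE ~2.0)
immersion_cooling_kw = conventional_cooling_kw * 0.05  # 5% of conventional

conventional_total = it_load_kw + conventional_cooling_kw  # 200 kW
immersion_total = it_load_kw + immersion_cooling_kw        # 105 kW

savings = 1 - immersion_total / conventional_total
print(f"Total energy reduced by {savings:.0%}")  # ~48%, i.e. about half
```

Under a less cooling-heavy baseline (lower PUE), the total saving would be correspondingly smaller, so the "half" figure hinges on the conventional facility's overhead.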

“Power efficiency in high-performance computing is of growing concern, due to technological challenges in the ongoing race to exascale and, far more importantly, growing concerns on climate change. With this reference, we set the stage for a new paradigm in information technology,” stated Dr Alex Ninaber, technical director at ClusterVision.

The VSC-3 cluster is designed to balance compute power, memory bandwidth, and the ability to manage highly parallel workloads. As primary contractor for the design and build of the VSC-3 cluster, ClusterVision is bringing together technology from several HPC partners, including Supermicro, Intel, Bright Computing, and the Fraunhofer Institute’s parallel file system BeeGFS (formerly named FhGFS).

The VSC-3 configuration consists of 2,020 nodes based on Supermicro motherboards, each fitted with two eight-core Intel Xeon E5-2650 v2 processors running at 2.6GHz. The smaller compute nodes (built on Supermicro’s X9DRD-iF motherboard) have 64GB of main memory per node, whilst the larger nodes have 128GB or 256GB.
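Taken together, those node counts imply the following aggregate figures (a sketch derived purely from the numbers quoted above; the memory total is a lower bound, since it assumes every node has only the base 64GB):

```python
# Aggregate resources implied by the VSC-3 figures in the article:
# 2,020 dual-socket nodes, eight-core Xeon E5-2650 v2 per socket.
nodes = 2020
sockets_per_node = 2
cores_per_socket = 8

total_cores = nodes * sockets_per_node * cores_per_socket

# Lower bound on aggregate memory, assuming the base 64 GB everywhere;
# the 128 GB and 256 GB nodes push the real total higher.
min_memory_gb = nodes * 64

print(total_cores)    # 32320 cores
print(min_memory_gb)  # 129280 GB, roughly 126 TB
```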

The interconnect system is based on Intel’s True Scale QDR-80 design. “The True Scale QDR-80 design provides an architecture that will allow the users to benefit from high message rates, deterministic latency, resiliency and scalability across the whole of the cluster,” stated Ian Wardrope, Sales Director, HPC & Fabrics at Intel.

The software components of the cluster include BeeGFS, the parallel file system from the Fraunhofer Institute for Industrial Mathematics (ITWM), providing 0.64PB of storage and over 20GB/s of bandwidth. All the hardware and software in the VSC-3 cluster is provisioned and managed using Bright Cluster Manager from Bright Computing.

3M’s proof-of-concept system was put together in collaboration with SGI and Intel. It is also submerged in liquid to improve cooling and power efficiency, but uses many components that differ from the ClusterVision design.

SGI’s ICE X, the fifth generation of SGI’s distributed-memory supercomputer, fitted with Intel Xeon E5-2600 processors, was placed directly into 3M’s Novec Engineered Fluid. The fluid is an efficient dielectric that keeps the hardware cool with minimal additional energy.

According to 3M, the two-phase immersion cooling can reduce cooling energy costs by 95 per cent, and reduces water consumption by eliminating municipal water usage for evaporative cooling. Heat can also be harvested from the system and reused for heating and other process technologies such as desalination of sea water.

This technique has been shown to require as little as one-tenth of the space of conventional air cooling while eliminating expensive air-cooling infrastructure, making it cost effective for large-scale data centre hubs. Immersion cooling allows tighter component packaging – allowing for greater computing power in less space – and easy access to hardware. In fact, it is claimed that the system can enable up to 100 kilowatts of computing power per square metre.

“We are thrilled with the work that our collaboration with SGI and Intel has produced,” said Joe Koch, business director for 3M’s Electronics Markets Materials Division. “We applaud them for their leadership in helping us find better ways to address energy efficiency, space constraints and increased computing power in data centers. These advancements are a significant stepping stone in accelerating industry-wide collaboration to optimize computer hardware design.”

“Through this collaboration with Intel and 3M, we are able to demonstrate a proof-of-concept showcasing an extremely innovative capability to reduce energy use in data centers, while optimizing performance,” said Jorge Titinger, president and CEO of SGI. “Built entirely on industry-standard hardware and software components, the SGI ICE X solution enables significant decreases in energy requirements for customers, lowering total cost of ownership and impact on the environment.”

By investing in advanced cooling technologies, companies such as Intel and SGI can explore hardware designs free of the heat-transfer constraints of traditional cooling, while keeping systems more affordable and less complex to build and operate.

In-depth data acquisition and evaluation of the 3M installation will begin in April. Additionally, the companies are working with the Naval Research Laboratory, Lawrence Berkeley National Laboratory and APC by Schneider Electric to deploy and evaluate an identical system, with the intention of demonstrating the viability of the technology at any scale.

This story appears here as part of a cross-publishing agreement with Scientific Computing World.