
Argonne takes more steps to manage energy use

I talked to Pete Beckman of the Argonne Leadership Computing Facility back in December about a really simple, but interesting, idea that he had implemented to save energy in the ALCF:

Argonne National Labs is part of the Department of Energy, so it’s not exactly surprising to learn that they are actively looking for ways to reduce energy use. But using Chicago’s cold winters to save $25,000 a month on cooling costs for the supers in their Leadership Computing Facility is, well, cool.

As Pete explained at the time, this is an ongoing focus for them. This week the ALCF is back in the news with more energy-saving tips for supercomputing centers:

Varying the chilled water temperature to match the demand of the machine allows the chillers to use less energy. This effort involves mapping the optimum chilled water temperature to the machine load to determine the sweet spot for energy-efficient cooling.

“The trick,” said Jeff Sims, ALCF project manager, “is to find the warmest chilled water temperature you can live with at a given machine load, thus reducing the electric load on the chillers and maximizing the free cooling period. While the energy consumed to cool the machine has been reduced, there has been no impact on Intrepid’s performance.”
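To make the "sweet spot" idea concrete, here is a minimal sketch of how a load-to-setpoint mapping might be looked up once it has been measured. This is my own illustration, not ALCF's method: the calibration table, the `warmest_setpoint` function, and the safety margin are all hypothetical, and the numbers are invented.

```python
from bisect import bisect_left

# Hypothetical calibration points: (machine load as a fraction of peak,
# warmest chilled-water supply temperature in deg C that kept the machine in spec).
# These values are invented for illustration; they are not ALCF measurements.
CALIBRATION = [(0.25, 18.0), (0.50, 16.0), (0.75, 14.0), (1.00, 12.0)]


def warmest_setpoint(load, calibration=CALIBRATION, margin=1.0):
    """Return the chilled-water setpoint for `load`, backed off by `margin` deg C.

    The limit between calibration points is linearly interpolated; loads
    outside the measured range are clamped to the nearest measured point.
    """
    loads = [l for l, _ in calibration]
    temps = [t for _, t in calibration]
    if load <= loads[0]:
        limit = temps[0]
    elif load >= loads[-1]:
        limit = temps[-1]
    else:
        i = bisect_left(loads, load)  # first calibration load >= requested load
        l0, l1 = loads[i - 1], loads[i]
        t0, t1 = temps[i - 1], temps[i]
        limit = t0 + (t1 - t0) * (load - l0) / (l1 - l0)
    return limit - margin


if __name__ == "__main__":
    for load in (0.1, 0.5, 0.625, 1.2):
        print(f"load {load:.3f} -> setpoint {warmest_setpoint(load):.1f} C")
```

The margin is there because a real site would presumably want some headroom below the warmest temperature that worked during testing; how much is a judgment call for the facility.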

The next step will be to continue varying the temperature to see where performance (and presumably dependability) starts to be affected. I was at a meeting two Supercomputing conferences ago where someone (I think it was Donna Crawford) mentioned that they were doing the same thing in their datacenter. Other approaches being explored?

Other enhancements include using smart power management functions to turn off chips and storage systems when they are not in use, as well as scheduling intensive compute jobs to run at night when the power grid has more capacity and temperatures are lower.

Are you doing anything to save power at your facility? Drop a comment, and be sure to tell us whether you are doing it because you have to in order to keep the machine on (at the limits of your power distribution system), or whether you are doing it for the environment.

Comments

  1. I don’t know Pete personally, but I’ve heard him give several talks and he’s always really interesting. I’m not the least bit surprised he’s exploring ideas like that, and kudos to Argonne for running with it.

    At my institution, I don’t deal with the facility issues much but I did suggest a while back that it might be useful (in terms of dollars) to have the Provost’s Office kick in extra funds to any grant dollars aimed at cluster purchases to cover the difference from a ‘normal’ chip to a low-power version.

    The reason for this is that Joe Researcher doesn’t care whatsoever what the power costs are of his system because he doesn’t pay the power bill – he simply wants to get the most computational bang for his buck. However, funds for power and cooling DO come from the university itself, and machine room power is usually more limited than machine room space. So, in the end, if it costs the university $10K to buy the more expensive chips, but that saves $18K in power over three years (.. I’m tossing random numbers out..), it’s a net win, especially when you consider that you can pack more into a single machine room and don’t need a new one built.

    Of course, with so many other things to tackle this always seems to fall off the radar of the powers-that-be. Which, again, makes me even more impressed that Argonne ran with it. Nice job, guys.

    (PS. Yeah, I should probably look up power costs in our area to see how much of a win it might be.)
