Over at OLCF, Jeff Gary writes that when the Oak Ridge Leadership Computing Facility (OLCF) replaced its Jaguar supercomputer with Titan, not only did it expand its computing speed tenfold, it also saved on the electric bill.
In short, Titan uses roughly the same amount of electricity to run its computations as Jaguar did, but delivers up to ten times the performance. The new machine gets its extra bang for the buck, Rogers said, from the configuration of its compute nodes. Each of Jaguar’s Cray XT5 compute nodes featured two 12-core AMD Opteron processors, whereas each compute node in Titan pairs one 16-core AMD Opteron processor with one NVIDIA Kepler GPU. This hybrid architecture is not only computationally faster but also much more energy efficient.
“The NVIDIA Kepler GPU has very sophisticated power management features,” Rogers explained. “Each GPU can identify new work, schedule it, and change its power state accordingly. The GPU is ready to go in an instant, ramping both processor frequency and power budget. However, when it’s idle, it almost powers itself down, consuming less than 20 watts. The beautiful thing is it can switch between those states seamlessly based on workload. The net effect is that we actually use less energy to get a lot more work done.”
In fact, Rogers’s team is tracking statistics to show just how efficient the GPUs are at performing their work.
“GPU-enabled applications can typically reduce their run time for the same problem by more than 50 percent. We’ve seen some applications where the speedup is as much as seven times the nonaccelerated case. We’re operating more compute cores, with up to 10 times the capability, in the same or smaller power profile and the same footprint. The hybrid architecture in Titan, with its very efficient GPUs, is the dominant driver for energy consumption and management.”
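To make the arithmetic behind those figures concrete, here is a rough sketch in Python. It converts a run-time reduction into a speedup factor and estimates the relative energy needed to finish the same problem. The function names are hypothetical, and the default `power_ratio=1.0` is an assumption drawn from the "same or smaller power profile" remark, not a measured OLCF value.

```python
def speedup_from_runtime_reduction(reduction):
    """Convert a fractional run-time reduction (0.5 means 50 percent) into a speedup factor."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


def relative_energy(speedup, power_ratio=1.0):
    """Energy to solution relative to the baseline run.

    Energy is power multiplied by time, so the ratio is
    (new power / old power) * (new time / old time).
    power_ratio=1.0 assumes both runs draw the same power (an assumption).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return power_ratio / speedup


if __name__ == "__main__":
    # A 50 percent shorter run time corresponds to a 2x speedup.
    print(speedup_from_runtime_reduction(0.5))
    # A 7x speedup at equal power needs about one seventh of the energy.
    print(relative_energy(7.0))
```

Under these assumptions, an application that halves its run time at the same power draw uses about half the energy for the same problem, and the sevenfold case uses roughly 14 percent.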
This timelapse video from OLCF shows the Jaguar upgrade to Titan.