Today the National Supercomputing Center in Tianjin, China, announced that its new Tianhe-1A supercomputer has set a new performance record of 2.507 petaflops on the LINPACK benchmark, making it the fastest system in the world today. While these results have been submitted to TOP500.org, the organization’s semi-annual list of the world’s fastest systems will not be released until SC10, around November 15, 2010.
The second-generation Tianhe (pronounced “tee-awn-hoo-wa”) system is named after the Milky Way galaxy. A heterogeneous computer, Tianhe-1A uses a proprietary interconnect to couple massively parallel GPUs with multi-core CPUs, enabling significant advantages in performance, size and power.
The system uses 7,168 NVIDIA Tesla M2050 GPUs, 14,336 CPUs, and 262 Terabytes of distributed memory. With over 2 Petabytes of capacity, storage for the supercomputer is powered by the Lustre open source file system. To put the power of Tianhe in perspective, it would require more than 50,000 CPUs using three times as much power and twice as much floor space to deliver the same performance.
“The performance and efficiency of Tianhe-1A was simply not possible without GPUs,” said President Guangming Liu of the National Supercomputer Center in Tianjin. “The scientific research that is now possible with a system of this scale is almost without limits; we could not be more pleased with the results.”
While the notion of the world’s fastest machine being in China may be alarming to some, I think it is important to note that the system will be made available to the international scientific community for a wide range of HPC fields, including drug discovery, hurricane and tsunami modeling, cancer research, car design, and even studying the formation of galaxies. This big supercomputer reflects China’s mission not only to lead the world in manufacturing capabilities, as it does today, but also to lead in science, medicine and engineering.
China is very much a newcomer to the TOP500. They shocked the HPC community this past summer at ISC10 when their Nebulae system was ranked #2 in the world behind the #1 Jaguar Cray system at Oak Ridge National Laboratory. Now the Tianhe system has more performance than the #6 through #10 systems on the current TOP500 combined.
How was this dramatic jump possible? China aggressively adopted GPUs in Nebulae and Tianhe. Out of the 2.507 petaflops benchmarked in the heterogeneous Tianhe system, more than 70 percent of the performance comes from GPUs.
The architectural focus on NVIDIA GPUs also delivered tremendous power savings for the National Supercomputing Center, which is the Chinese equivalent of one of our U.S. National Labs. According to the press release, a 2.507-petaflop system built entirely with CPUs would consume more than 12 megawatts. Tianhe-1A consumes only 4.04 megawatts, making it roughly 3 times more power efficient. The difference in power consumption is enough to provide electricity to over 5,000 homes for a year.
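For readers who want to check the arithmetic behind those power claims, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above. The 12-megawatt CPU-only figure is treated as a lower bound (the press release says "more than"), and the sketch assumes both systems would draw full power around the clock for a whole year, which is a simplification. The function name `power_summary` is just an illustrative choice.

```python
# Figures quoted in the post (from the press release).
CPU_ONLY_MW = 12.0       # CPU-only equivalent; "more than 12 MW", so a lower bound
TIANHE_MW = 4.04         # Tianhe-1A power draw
LINPACK_PFLOPS = 2.507   # LINPACK result
HOMES = 5000             # homes cited in the comparison
HOURS_PER_YEAR = 24 * 365  # assumption: full power draw all year


def power_summary():
    """Derive the ratios implied by the quoted figures."""
    saved_mw = CPU_ONLY_MW - TIANHE_MW
    saved_mwh_per_year = saved_mw * HOURS_PER_YEAR
    return {
        # How many times more power the CPU-only design would need.
        "efficiency_ratio": CPU_ONLY_MW / TIANHE_MW,
        # Power saved, in megawatts.
        "saved_mw": saved_mw,
        # Energy saved over a year, in megawatt-hours.
        "saved_mwh_per_year": saved_mwh_per_year,
        # Annual energy per home implied by the 5,000-home claim.
        "mwh_per_home_per_year": saved_mwh_per_year / HOMES,
        # LINPACK megaflops per watt (1 petaflop = 1e9 megaflops).
        "mflops_per_watt": LINPACK_PFLOPS * 1e9 / (TIANHE_MW * 1e6),
    }


if __name__ == "__main__":
    for name, value in power_summary().items():
        print(f"{name}: {value:,.2f}")
```

Under those assumptions the ratio comes out just under 3, the saving is about 8 megawatts, and the 5,000-home comparison implies a little under 14 megawatt-hours per home per year. That per-home figure is only what the quoted numbers imply, not a measured statistic.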
Oddly enough, I think this announcement from a communist country heralds the democratization of parallel computing. The cost and power benefits of GPUs may be more dramatic at extreme scale, but the entry level of HPC is now within reach of thousands more small and medium businesses.
At the high end, this news will almost certainly mark an inflexion point in the history of HPC, just as the Japanese Earth Simulator system did eight or nine years ago. Sure, you can always keep adding x86 cores to get more peak performance, but when you get to extreme scale, the power to run 50K conventional CPUs for one year can cost you more than the system itself. Like it or not, planners drawing up the supercomputer centers of tomorrow have to reckon with the fact that the power budget is now riding shotgun.