Optimizing the HPCG Benchmark on GPUs

October 24, 2014 by Doug Black

Over at the Parallel for All Blog, Everett Phillips and Massimiliano Fatica write that GPUs offer good acceleration on the new HPCG benchmark, which was designed to augment Linpack as a measure of performance for the TOP500. Their GPU porting strategy focused on parallelizing the Symmetric Gauss-Seidel smoother (SYMGS), which accounts for approximately two-thirds of the benchmark flops.

GPU-accelerated supercomputers have proven to be very effective for accelerating compute-intensive applications like HPL, especially in terms of power efficiency. Obtaining good acceleration on the GPU for the HPCG benchmark is more challenging due to the limited parallelism and memory access patterns of the computational kernels involved. In this post we present the steps taken to obtain high performance of the HPCG benchmark on GPU-accelerated clusters, and demonstrate that our GPU-accelerated HPCG results are the fastest per-processor results reported to date.
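To make the SYMGS kernel mentioned above more concrete, here is a minimal sketch of one symmetric Gauss-Seidel sweep in plain Python. It is an illustration only, not the authors' GPU code: it assumes a small dense matrix stored as nested lists, whereas HPCG operates on a large sparse matrix, and the function name `symgs_sweep` is hypothetical. Note that each row update reads values of `x` written earlier in the same sweep, which is one reason the kernel exposes limited parallelism.

```python
def symgs_sweep(A, b, x):
    """Apply one symmetric Gauss-Seidel sweep to x in place and return it.

    A is a dense square matrix (list of lists) with nonzero diagonal,
    b is the right-hand side, x is the current approximate solution.
    Illustrative sketch only; the dense storage is an assumption.
    """
    n = len(b)
    # Forward sweep (rows 0..n-1) followed by backward sweep (rows n-1..0).
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            s = b[i]
            for j in range(n):
                if j != i:
                    # Uses the most recent x[j], including values
                    # already updated earlier in this same sweep.
                    s -= A[i][j] * x[j]
            x[i] = s / A[i][i]
    return x


if __name__ == "__main__":
    # Small diagonally dominant test system whose exact solution is [1, 1, 1].
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [3.0, 2.0, 3.0]
    x = [0.0, 0.0, 0.0]
    for _ in range(25):
        symgs_sweep(A, b, x)
    print(x)
```

The sequential dependency inside each sweep is what a GPU port has to work around, for example by reordering the rows so that independent rows can be updated together.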

The first HPCG list was published at ISC14 and included 15 supercomputers. Instead of looking at the peak flops of these machines, we evaluate the efficiency based on the ratio of the HPCG result to the memory bandwidth of the processors. The following table shows the results of the top 4 systems that submitted optimized results.

HPCG RANK	MACHINE NAME	HPCG GFLOP/S	#PROCS	PROCESSOR TYPE	HPCG PER PROC	BANDWIDTH PER PROC	EFFICIENCY (FLOPS/BYTE)
1	Tianhe-2	580,109	46,080	Xeon Phi-31S1P	12.59 GF	320 GB/s	0.039
2	K	426,972	82,944	Sparc64-viiifx	5.15 GF	64 GB/s	0.080
3	Titan	322,321	18,648	Tesla-K20X+ECC	17.28 GF	250 GB/s	0.069
5	Piz Daint	98,979	5,208	Tesla-K20X+ECC	19.01 GF	250 GB/s	0.076
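The efficiency column can be reproduced from the other columns: divide the total HPCG result by the processor count, then divide by the per-processor memory bandwidth. The short sketch below does this with the figures from the table; the names `SYSTEMS` and `hpcg_efficiency` are chosen here for illustration.

```python
# Figures copied from the table above:
# (total HPCG GFLOP/s, processor count, bandwidth per processor in GB/s).
SYSTEMS = {
    "Tianhe-2": (580_109, 46_080, 320),
    "K": (426_972, 82_944, 64),
    "Titan": (322_321, 18_648, 250),
    "Piz Daint": (98_979, 5_208, 250),
}


def hpcg_efficiency(total_gflops, num_procs, bandwidth_gb_s):
    """Return HPCG flops per byte of memory bandwidth for one processor."""
    gflops_per_proc = total_gflops / num_procs
    # GFLOP/s divided by GB/s leaves flops per byte.
    return gflops_per_proc / bandwidth_gb_s


if __name__ == "__main__":
    for name, row in SYSTEMS.items():
        print(f"{name}: {hpcg_efficiency(*row):.3f} flops/byte")
```

Normalizing by bandwidth rather than peak flops puts processors with very different memory systems on a comparable footing.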

If you’d like to learn more about this work on HPCG, be sure to attend Everett Phillips’ talk at NVIDIA Booth #1727 at Supercomputing 2014 on Tuesday, November 18 at 10:30am.

