Volkov and Demmel Paper on GPUs Wins SC19 Test of Time Award

Today SC19 announced the winners of the Test of Time Award (ToTA). The annual award recognizes an outstanding paper that has deeply influenced the HPC discipline. It is a mark of historical impact and recognition that the paper has changed HPC trends.

We are pleased to announce the selection of the SC08 paper, Benchmarking GPUs to Tune Dense Linear Algebra, by Vasily Volkov (NVIDIA) and James Demmel (UC Berkeley) as the SC19 ToTA winner.

The paper was deemed deserving of the SC19 ToTA due to its first-of-its-kind vision of the GPU architecture as a vector machine. By building on this vision, Volkov and Demmel defined techniques to achieve greater efficiency and performance. Since 2008, this vision has resulted in a way of thinking about and modeling algorithms on GPUs that still guides others in pursuing and obtaining high performance across an ever-growing GPU community. Written while both authors were at UC Berkeley, the paper has had a tremendous impact with nearly 1,000 citations, and it is still being cited more than a decade after its publication.

Abstract:

We present performance results for dense linear algebra using recent NVIDIA GPUs. Our matrix-matrix multiply routine (GEMM) runs up to 60% faster than the vendor’s implementation and approaches the peak of hardware capabilities. Our LU, QR and Cholesky factorizations achieve up to 80–90% of the peak GEMM rate. Our parallel LU running on two GPUs achieves up to ~540 Gflop/s. These results are accomplished by challenging the accepted view of the GPU architecture and programming guidelines. We argue that modern GPUs should be viewed as multithreaded multicore vector units. We exploit blocking similarly to vector computers and heterogeneity of the system by computing both on GPU and CPU. This study includes detailed benchmarking of the GPU memory system that reveals sizes and latencies of caches and TLB. We present a couple of algorithmic optimizations aimed at increasing parallelism and regularity in the problem that provide us with slightly higher performance.
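
The blocking mentioned in the abstract can be illustrated with a toy tiled matrix multiply. This is only a minimal CPU-side sketch of the general technique, not the authors' GPU kernel; the function names and the block size are assumptions made for illustration.

```python
def matmul_naive(a, b):
    """Reference triple-loop product of a (n x k) and b (k x m)."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def matmul_blocked(a, b, block=2):
    """Same product, computed tile by tile (hypothetical block size).

    Each (i0, j0, p0) step touches only small sub-blocks of a, b and c,
    which is the data-reuse idea behind blocked GEMM.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):              # tile rows of c
        for j0 in range(0, m, block):          # tile columns of c
            for p0 in range(0, k, block):      # tiles along the shared dimension
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + block, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

Both functions return the same product; the blocked version only changes the order in which the work is done, so that each tile is reused while it is still close at hand.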

Meet the Authors at SC19

When informed of being the recipients of the SC19 ToTA, Volkov and Demmel were delighted and honored to have their paper recognized with one of the most prestigious SC19 awards.

“This was our attempt to understand how GPUs work and how to efficiently program them. Our resulting algorithms ran significantly faster than the vendor’s code, and near machine peak,” said Demmel. He added, “We look forward to speaking about this paper to describe how this effort on GPU performance optimization has evolved.”

Volkov and Demmel will reveal their thoughts, share their challenges, and discuss the impact of their work in an invited talk at SC19 on Tuesday, November 19, 3:30 pm–4:15 pm, in the Mile High Ballroom.

Download the paper (PDF)

Registration is now open for SC19, which takes place Nov. 17-22 in Denver.
