A Closer Look at Memory Bandwidth and GPU Performance

David Kanter over at the Real World Technologies blog takes a closer look at memory bandwidth and GPU performance:

“In some cases, the GPU with the lower GFLOP/s actually delivers the best performance – which is totally counter-intuitive. One pair of points that perfectly illustrates counter-intuitive behavior is the first two AMD GPUs. The shader arrays provide 432 and 422 GFLOP/s respectively, but the first card only scores 2552 on 3DMark, while the latter scores a significantly higher 3463. One card has ~2% less shader compute, but 36% higher performance. This behavior is hardly isolated to AMD cards either. Three Nvidia GPUs have 192 GFLOP/s throughput in their shader arrays. Two of these cards score 3700 and 3374, while the third is a disappointing 2527. Despite having the same theoretical throughput, one of the cards is 46% faster than another.”
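The percentages in the quote can be checked with a few lines of arithmetic. The sketch below is only an illustration: the `percent_change` helper and the variable names are made up here and are not from Kanter's article; the GFLOP/s figures and 3DMark scores are the ones quoted above.

```python
def percent_change(baseline, value):
    """Return the percent change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


# AMD pair from the quote: shader throughput (GFLOP/s) and 3DMark score.
amd_compute_delta = percent_change(432, 422)   # second card vs. first, compute
amd_score_delta = percent_change(2552, 3463)   # second card vs. first, score

# Nvidia cards from the quote: same 192 GFLOP/s, fastest vs. slowest score.
nvidia_score_delta = percent_change(2527, 3700)

print(f"AMD compute change:  {amd_compute_delta:+.1f}%")
print(f"AMD score change:    {amd_score_delta:+.1f}%")
print(f"Nvidia score spread: {nvidia_score_delta:+.1f}%")
```

Rounded to whole numbers, these come out to roughly 2% less compute and 36% higher score for the AMD pair, and a 46% gap between the fastest and slowest of the three Nvidia cards, matching the figures in the quote.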

