Doug Eadline posted a quick primer on HPC benchmarking over at HPCCommunity.org a while ago that I thought was helpful:
Benchmarking any computer is always useful, but it is not quite as simple as running a few programs and reporting some numbers. Indeed, benchmarking is an art that requires some diligence and attention to detail. To be effective, one must have clear goals and objectives in mind. In addition, interpretation of results must be done within the context of the benchmark. In this article we will take a look at this process and how benchmarks can be used to aid system administration without trying to win contests.
Something I found useful in the article is how it sorts the reasons people run benchmarks into three categories: performance optimization (including press release benchmarks), baseline performance, and burn-in.
Baseline Performance – Baseline benchmarking is perhaps the most important type of benchmark a cluster administrator can perform. The concept is very simple: run a set of benchmarks on the current cluster configuration and tuck them away. After an upgrade (hardware or software) is performed, re-run the benchmarks and see if things are better, worse, or the same. Upgrades do not automatically mean better performance. For instance, if you perform an OS upgrade, re-running your benchmark suite is important. And, like the previous runs, keep them for future reference. Of course, if you just upgrade an MPI library, re-run only the tests that use MPI.
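The baseline workflow described above (run, tuck away, re-run after an upgrade, compare) can be sketched in a few lines of Python. This is only an illustration and not something from Eadline's article: the benchmark names, the JSON file layout, the 5% noise tolerance, and the assumption that higher scores are better are all hypothetical choices made for the example.

```python
import json
from pathlib import Path


def save_baseline(results, path):
    """Tuck away benchmark results (name -> score) for future reference."""
    Path(path).write_text(json.dumps(results, indent=2, sort_keys=True))


def load_baseline(path):
    """Read back a previously saved set of results."""
    return json.loads(Path(path).read_text())


def compare(baseline, current, tolerance=0.05):
    """Label each baseline benchmark as better, worse, or same.

    Assumptions (hypothetical, for illustration only):
      - scores are positive and higher is better
      - relative changes within +/- tolerance count as run-to-run noise
    Benchmarks missing from `current` are reported as not re-run, which
    covers the case where only a subset (e.g. the MPI tests) was repeated.
    """
    report = {}
    for name, old in baseline.items():
        if name not in current:
            report[name] = "not re-run"
            continue
        change = (current[name] - old) / old
        if change > tolerance:
            report[name] = "better"
        elif change < -tolerance:
            report[name] = "worse"
        else:
            report[name] = "same"
    return report


if __name__ == "__main__":
    # Made-up numbers standing in for real benchmark output.
    before = {"memory_bandwidth": 100.0, "linpack": 500.0, "mpi_latency_score": 10.0}
    save_baseline(before, "baseline.json")

    # After an upgrade, re-run and compare against the stored baseline.
    after = {"memory_bandwidth": 110.0, "linpack": 490.0}
    for name, verdict in compare(load_baseline("baseline.json"), after).items():
        print(f"{name}: {verdict}")
```

Keeping each run in its own file (for example, one per upgrade) preserves the history the article recommends, so later upgrades can be compared against any earlier configuration. For metrics where lower is better, such as raw latency, the comparison would need to be inverted.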