I’ve had this paper on my desk to read for nearly a year now. It’s from Oct 2008, and it’s called “Benchmarking Amazon EC2 for High Performance Scientific Computing” by Edward Walker. It’s an interesting, quick read that compares performance results from the NAS benchmarks on one of NCSA’s clusters and Amazon EC2; both clusters used dual-socket, quad-core 2.33-GHz Intel Xeon processors.
From the paper:

This article describes my results in using macro and micro benchmarks to examine the “delta” between clusters composed of currently available state-of-the-art CPUs from Amazon EC2 versus clusters available to the HPC scientific community circa 2008. My results were obtained by using the NAS Parallel Benchmarks to measure the performance of these clusters for frequently occurring scientific calculations. Also, since the Message-Passing Interface (MPI) library is an important programming tool used widely in scientific computing, my results demonstrate the MPI performance in these clusters by using the mpptest micro benchmark.
The results? EC2 always underperforms Abe, NCSA’s cluster, even when the work stays within a single node of both clusters. The two are closest for the OpenMP versions of the NPB on a single node of each cluster, with EC2’s results lower by 7 to 21%. On MPI versions of the NPB run on 32 CPUs, EC2’s results are between 40% and 1000% worse, and this is true even when the nodes don’t communicate.
Figure 2 shows the run times of the benchmark programs. From the results, we see approximately 40%–1000% performance degradation in the EC2 runs compared to the NCSA runs. Greater than 200% performance degradation is seen in the programs CG, FT, IS, LU, and MG. Surprisingly, even EP (embarrassingly parallel), where no message-passing communication is performed during the computation and only a global reduction is performed at the end, exhibits approximately 50% performance degradation in the EC2 run.
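Walker’s EP observation is worth dwelling on: the benchmark’s structure — fully independent computation followed by a single reduction at the end — leaves almost nothing for a slow interconnect to hurt, so its slowdown must come from the compute nodes themselves. A minimal sketch of that pattern (using Python’s multiprocessing rather than MPI, with a made-up worker function, purely for illustration):

```python
from multiprocessing import Pool

def partial_sum(bounds):
    # Each worker computes on its own range; no communication happens here.
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    n, workers = 1_000_000, 4
    step = n // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]
    with Pool(workers) as pool:
        partials = pool.map(partial_sum, chunks)  # embarrassingly parallel phase
    total = sum(partials)                         # the lone "global reduction"
    print(total)
```

In an MPI version the final `sum(partials)` would be a single `MPI_Reduce`; everything before it is communication-free, which is why EP’s 50% degradation points at virtualization or node-level overhead rather than the network.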
The mpptest bisection results show that EC2’s interconnect performs about an order of magnitude worse than Abe’s InfiniBand network in both latency and bandwidth. From the paper’s conclusions:
The opportunity of using commercial cloud computing services for HPC is compelling. It unburdens the large majority of computational scientists from maintaining permanent cluster fixtures, and it encourages free open-market competition, allowing researchers to pick the best service based on the price they are willing to pay. However, the delivery of HPC performance with commercial cloud computing services such as Amazon EC2 is not yet mature. This article has shown that a performance gap exists between performing HPC computations on a traditional scientific cluster and on an EC2 provisioned scientific cluster. This performance gap is seen not only in the MPI performance of distributed-memory parallel programs but also in the single compute node OpenMP performance for shared-memory parallel programs.
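The mpptest gap has a simple first-order reading. Point-to-point message time is commonly modeled as T(m) = α + m/β, where α is latency and β is bandwidth; if both are an order of magnitude worse, every message — small or large — costs roughly ten times more. A quick sketch (the α and β values below are round, assumed figures for the two classes of network, not measurements from either paper):

```python
def msg_time(m_bytes, alpha_s, beta_bytes_per_s):
    # First-order point-to-point cost model: latency plus serialization time.
    return alpha_s + m_bytes / beta_bytes_per_s

# Illustrative, assumed parameters -- NOT numbers from Walker's paper:
# an InfiniBand-class link vs. a commodity-Ethernet-class cloud link.
ib = dict(alpha_s=5e-6, beta_bytes_per_s=1e9)    # ~5 us latency, ~1 GB/s
eth = dict(alpha_s=50e-6, beta_bytes_per_s=1e8)  # ~50 us latency, ~100 MB/s

for m in (0, 1024, 1 << 20):  # empty, 1 KiB, and 1 MiB messages
    slowdown = msg_time(m, **eth) / msg_time(m, **ib)
    print(f"{m:>8} bytes: {slowdown:.1f}x slower")
```

With both parameters degraded by the same factor, the slowdown is uniform across message sizes — which is consistent with communication-heavy kernels like CG and MG suffering far more than EP.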
These are very much the same conclusions reached by another set of researchers in a paper published in February of this year using Linpack, “Can Cloud Computing Reach the Top500?”
In this paper we investigate the use of cloud computing for high-performance numerical applications. In particular, we assume unlimited monetary resources to answer the question, “How high can a cloud computing service get in the TOP500 list?” We show results for the Linpack benchmark on different allocations on Amazon EC2.
…While cloud computing provides an extensible and powerful computing environment for web services, our experiments indicate that the cloud (or Amazon’s EC2, at least) is not yet mature enough for HPC computations. We observe that the GFLOP/sec obtained per dollar spent decreases exponentially with increasing computing cores and correspondingly, the cost for solving a linear system increases exponentially with the problem size—very much in contrast to existing scalable HPC systems.
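That cost observation can be restated as a toy model: if parallel efficiency on EC2 decays exponentially with core count while the hourly bill grows linearly with cores, then delivered GFLOP/s per dollar decays exponentially too. A sketch with entirely assumed parameters (the peak rate, price, and decay constant are invented for illustration, not taken from the paper):

```python
import math

def gflops_per_dollar(cores, peak_per_core=8.0, price_per_core_hr=0.10,
                      decay=0.05):
    # Toy model: parallel efficiency decays exponentially with core count
    # (the shape the Top500 paper reports for EC2), while cost per hour
    # grows linearly with the allocation.
    efficiency = math.exp(-decay * cores)
    delivered_gflops = peak_per_core * cores * efficiency
    cost_per_hour = price_per_core_hr * cores
    return delivered_gflops / cost_per_hour

for cores in (1, 16, 64):
    print(f"{cores:>3} cores: {gflops_per_dollar(cores):.3f} GFLOP/s per $/hr")
```

The linear core terms cancel, so GFLOP/s per dollar tracks the efficiency curve alone — whatever you pay for more cores, you only get back what the interconnect lets you keep.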
Suggestions for moving to higher levels of performance for our community?
If cloud computing vendors are serious about targeting the HPC market, different cost models must be explored. An obvious first step would be to offer better interconnects or nodes provisioned with more physical memory to overcome the slower network.
Of course, something to keep in mind with all of this is that comparing the performance of a cloud solution to a dedicated HPC solution is only useful when a user actually has that choice. If there is no alternative, EC2 is still infinitely better than nothing.