Amazon adds support for traditional HPC workloads with Cluster Compute instance

July 13, 2010 by Doug Black

Today Amazon CTO Werner Vogels announced on his blog that Amazon EC2 has added what it is calling Cluster Compute instances, specifically to support the kinds of closely coupled workloads that traditional HPC users often run. This is an important step in growing the relevance of EC2 resources to high performance computing, given the (unsurprising) benchmark results indicating that Amazon’s traditional, highly virtualized servers underperform on these types of applications (lots of writing on this, but see here and here for examples). Vogels acknowledges this in his post:

As much as Amazon EC2 and Elastic Map Reduce have been successful in freeing some HPC customers with highly parallelized workloads from the typical challenges of HPC infrastructure in capital investment and the associated heavy operation lifting, there were several classes of HPC workloads for which the existing instance types of Amazon EC2 have not been the right solution. In particular this has been true for applications based on algorithms – often MPI-based – that depend on frequent low-latency communication and/or require significant cross sectional bandwidth. Additionally, many high-end HPC applications take advantage of knowing their in-house hardware platforms to achieve major speedup by exploiting the specific processor architecture. There has been no easy way for developers to do this in Amazon EC2… until today.

The new offering gives users access to higher-performance networking and lets them specify exactly the hardware they need to run on (though as far as I can tell your networking options don’t include InfiniBand):

Cluster Computer Instances are similar to other Amazon EC2 instances but have been specifically engineered to provide high performance compute and networking. Cluster Compute Instances can be grouped as cluster using a “cluster placement group” to indicate that these are instances that require low-latency, high bandwidth communication. When instances are placed in a cluster they have access to low latency, non-blocking 10 Gbps networking when communicating the other instances in the cluster.

Next, Cluster Compute Instances are specified down to the processor type so developers can squeeze optimal performance out of them using compiler architecture-specific optimizations. At launch Cluster Computer Instances for Amazon EC2 will have 2 Intel Xeon X5570 (also known as quad core i7 or Nehalem) processors.

Amazon has also issued an official press release about the new offering. NERSC has been among those exploring the use of EC2 resources for scientific computing, as we reported earlier this summer, and they’ve seen positive results:

“Many of our scientific research areas require high-throughput, low-latency, interconnected systems where applications can quickly communicate with each other, so we were happy to collaborate with Amazon Web Services to test drive our HPC applications on Cluster Compute Instances for Amazon EC2,” said Keith Jackson, a computer scientist at the Lawrence Berkeley National Lab. “In our series of comprehensive benchmark tests, we found our HPC applications ran 8.5 times faster on Cluster Compute Instances for Amazon EC2 than the previous EC2 instance types.”

Since NERSC was reporting slowdowns of “over a factor of 10” (quote from Kathy Yelick in that NERSC story linked above), an 8.5x speedup puts Amazon notionally within striking distance of what you could do with your own cluster. When you factor in things like not having to have your own admins, floor space, and power and cooling, you get to an equation that starts to look like it’s worth seriously investigating.

There is only a single offering in the Cluster Compute product line right now; here are the specs according to Amazon’s product page:

23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc1.4xlarge

Oddly, there is a default usage limit of 8 instances (64 cores), but the web page says if you need more you can send them an email.

The press release includes a Linpack performance measurement:

“For perspective, in one of our pre-production tests, an 880 server sub-cluster achieved 41.82 TFlops on a LINPACK test run – we’re very excited that Amazon EC2 customers now have access to this type of HPC performance with the low per-hour pricing, elasticity, and functionality they have come to expect from Amazon EC2.” (Peter De Santis, General Manager of Amazon EC2)

Assuming 2.93 GHz processors, that’s an Rmax of 41.82 TFLOPS on an Rpeak of 82.51 TFLOPS, or about 51% efficiency. For comparison, system number 162 on the Top500 is a 6,400-core, GigE-connected Xeon 5570 (2.93 GHz) system that achieves 39.77 TFLOPS (Rpeak 75.01 TFLOPS), an efficiency of 53%.
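For anyone who wants to check the arithmetic, here is a minimal Python sketch of how those Rpeak and efficiency figures can be reproduced. It assumes 8 cores per server (2 quad-core sockets, per the specs above) and 4 double-precision FLOPs per cycle per core, which is the usual peak figure for Nehalem; the function names are made up for illustration and are not from Amazon or the Top500 list.

```python
def rpeak_tflops(cores, clock_ghz, flops_per_cycle=4):
    """Theoretical peak in TFLOPS: cores x clock (GHz) x FLOPs per cycle per core.

    flops_per_cycle=4 is an assumption (double-precision peak on Nehalem).
    """
    return cores * clock_ghz * flops_per_cycle / 1000.0


def efficiency(rmax_tflops, rpeak):
    """Fraction of theoretical peak actually achieved on LINPACK."""
    return rmax_tflops / rpeak


# Amazon's pre-production run: 880 servers x 2 sockets x 4 cores each
amazon_cores = 880 * 2 * 4
amazon_rpeak = rpeak_tflops(amazon_cores, 2.93)
amazon_eff = efficiency(41.82, amazon_rpeak)

# Top500 comparison system: 6,400 cores at the same clock speed
top500_rpeak = rpeak_tflops(6400, 2.93)
top500_eff = efficiency(39.77, top500_rpeak)

print(f"Amazon: Rpeak {amazon_rpeak:.2f} TFLOPS, efficiency {amazon_eff:.0%}")
print(f"Top500: Rpeak {top500_rpeak:.2f} TFLOPS, efficiency {top500_eff:.0%}")
```

Under those assumptions the Amazon run works out to 7,040 cores, and both efficiency figures land within a couple of points of each other, which is what you would expect from two Ethernet-connected Nehalem systems.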

Comments

FJW says

July 13, 2010 at 9:36 pm

Unless you are an academic, a Government drone or possibly an organization that only has a limited number of in house apps this will go the same route as the other “grids/clouds.” The folks with the real $$$ to spend need commercial apps – EDA, CAE, Financial.

Until someone breaks the stranglehold of the commercial ISVs (which do NOT like grids – dilutes the license maintenance pool) and/or standardize the license/revenue model this is a non-starter. Now if this was bundled with software as a service included in the per CPU-hour… then maybe…

Love to see a panel session one of these days with ANSYS, LSTC, Simulia, MetaComp, CD-Adapco etc, and pepper these guys with questions on the cloud and the grid for enterprise customers ala Boeing, Big 3 (er 2 – Fiat doesn’t count), P&G, etc.

Chris Samuel says

July 14, 2010 at 4:51 am

Interconnect is just 10GigE, I’d want to see some MPI latency numbers before describing it as low latency – HPL efficiency doesn’t seem that hot.

Still, could be useful to some people!

John West says

July 14, 2010 at 10:02 am

FJW – YES! Great point…the commercial software licensing model has really not caught up yet, although some progress is being made in point solutions here and there.

John West says

July 14, 2010 at 10:03 am

Chris – how about lowER latency? At least they are making some progress.

FJW says

July 14, 2010 at 11:17 am

John – Agreed that there are starting to be pockets of resistance to fight and bring the ISVs in line. What we need is something akin to Openfoam (www.openfoam.com) to scare the CSM folks.

I’m still smarting that the Gov Drones didn’t force MSC to divest a copy of the NASTRAN source BACK to the government when they were declared a monopoly. Alas… to think where we would be if all ISVs had their own “Linux” to counterpoint their “Microsoft”.
