Clouds in current form not a solution for HPC


Doug Eadline’s article last week at Linux Magazine covered the difference between grids and clouds, and the ways in which clouds are not (currently) suited to (most) HPC workloads:

Grid works at the “library level” where the users are provided a known “environment” in which their applications should be able to run. Because grids are “open”, the variations in end points can be significant and small details matter; for example, ensuring consistent software libraries across hardware domains can present problems. Other aspects such as administrative domains, security, and data transfer are part of the grid environment as well.

Enter the cloud. In a sense cloud computing offers what grid cannot, a predictable execution environment. Thanks to virtualization, the exact execution environment can be created and cloned in the cloud. Grid attempts to link geographically distributed hardware with unique execution environments. Obviously attempts are made to create uniform execution environments, but a large part of grid software is devoted to publishing these environments so that domains may interact. A cloud, on the other hand, is an expandable hardware platform on which a virtual environment is created on-demand.

But, Eadline argues, in using virtualization technologies to solve the execution environment problem, clouds lost the close coupling between applications and hardware, to the detriment of performance, or at least of predictable performance:

Where grids paid attention to certain HPC performance guarantees, clouds, in order to be easy to use, have declined such guarantees. In particular, HPC requires a predictable and guaranteed level of I/O — both for storage and compute traffic. Unless a cloud has been specifically designed for HPC, the user cannot expect consistent and/or high performance. There are two papers which discuss this very idea. The first paper looks at Benchmarking Amazon EC2 for High-performance Scientific Computing and the second paper asks, Can Cloud Computing Reach The TOP500?

This is an interesting and important problem with using clouds in their current form for every HPC job. As he points out, however, some applications run on HPC platforms today are more flexible in their I/O requirements and are good candidates for cloud computing. I’m not sure that the Amazon model of cloud computing will replace traditional HPC infrastructure, but I do still believe that some form of hosted HPC will become an increasingly attractive option for organizations that want the capabilities of HPC without the messiness of running their own machines.

And I know something about that messiness: we literally had to get an act of Congress to grow the power infrastructure and raised floor space for our DoD computing center. That was a multi-year process. We simply cannot afford that kind of inflexibility in the long run.