Some of you may be interested in this article I wrote for HPCwire last week about benchmarking cloud performance, particularly for scientific computing.
It was inevitable, with all the hype and marketing dollars directed at cloud computing these days, that someone would eventually start trying to use clouds for real work. Of course, this puts a nasty wrinkle into marketing plans, because once people start using them for real work there are actual performance results. The results themselves aren’t too troubling because they are usually point cases, and negative messages are easily explained away by calling on the vagaries of a particular software stack and the giving away of snazzy memory sticks. But then the results lead the engineering-minded to wonder whether all of the available cloud computing alternatives behave in the same way, and if not, which of them might be best suited for a particular task. This leads to standardized testing and then, before you know it, we have full-fledged benchmarking on our hands.
…So, while traditional approaches to benchmarking (and traditional benchmarks, for that matter) will provide some useful information about the performance of clouds, the testing philosophy behind most benchmarks today doesn’t lend itself to a test of merit that enables comparing two clouds in a way that takes into account the very features that make them interesting technology solutions for certain classes of problems in the first place.
In the article I talk about some of the recent benchmarking efforts designed to take into account the unique features of clouds as a service, and I highlight some interesting results from the Open Cloud Consortium on a new benchmark called MalStone, aimed specifically at clouds used for large-scale data manipulation (where by “large” I mean up to 100TB of data).
Something that I don’t talk about in the article is the work that Grossman’s team has done on MalGen, the software that generates the test data used in benchmark runs. This is what project lead Bob Grossman had to say about MalGen:
The MalStone Project (on Google Code) includes an open source data generator called MalGen. MalGen is designed to generate 10 billion to 1 trillion synthetic events that represent realistic distributions, are distributed across a cloud, and present consistent information regarding the stylized compromises. Several of the distributions are power law distributions: there are a few sites with many visitors and many sites with very few visitors, and the variation follows a power law. In contrast, TeraSort (and its successors Gray Sort, Minute Sort, etc.) simply generates random data for sorting, which is quite simple to do in a cluster. One of the challenges for data intensive computing is generating interesting very large datasets containing trillions or more events. A good fraction of the work on the MalStone project went into MalGen.
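To make the power-law idea Grossman describes concrete, here is a minimal sketch (my own illustration, not the actual MalGen code, and at toy scale rather than billions of events): synthetic site-visit events where a site’s popularity falls off as a power of its rank, so a handful of sites collect most of the visits while the long tail gets very few. The function name and parameters are hypothetical.

```python
import random
from collections import Counter

# Illustrative sketch only -- NOT the actual MalGen generator.
# Generates synthetic (event_id, site_id) "visit" events where site
# popularity follows a Zipf-like power law: site k is chosen with
# probability proportional to (k + 1) ** -alpha.

def generate_events(num_events, num_sites=1000, alpha=2.0, seed=42):
    """Return a list of (event_id, site_id) pairs with power-law
    site popularity. Smaller site ids are the popular sites."""
    rng = random.Random(seed)
    weights = [(k + 1) ** -alpha for k in range(num_sites)]
    sites = rng.choices(range(num_sites), weights=weights, k=num_events)
    return list(enumerate(sites))

events = generate_events(100_000)
counts = Counter(site for _, site in events)
# A few head sites dominate; tail sites see almost no traffic.
print("top 3 sites:", counts.most_common(3))
```

Sorting random data (as TeraSort does) exercises none of this skew; a skewed distribution like the one above is what makes the workload resemble real log data, where a naive partitioning of events across cloud nodes leaves a few nodes doing most of the work.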