Paul Wallis, writing over at the SEO/SEM Journal, looks at what came before the Cloud and why the difficulties those technologies encountered matter now.
Looking at the Cloud’s forerunners, and the problems they encountered, gives us reference points for the challenges the Cloud must overcome before it is widely adopted.
Wallis starts with the rise of the datacenter:
In the past computers were clustered together to form a single larger computer. This was a technique common to the industry, and used by many IT departments. The technique allowed you to configure computers to talk with each other using specially designed protocols to balance the computational load across the machines. As a user, you didn’t care about which CPU ran your program, and the cluster management software ensured that the “best” CPU at that time was used to run the code.
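The scheduling idea in that passage can be sketched in a few lines of Python. The node names, load figures, and the load-based selection policy are invented for illustration; real cluster managers weigh many more factors:

```python
# Hypothetical cluster state: current load (0.0 = idle, 1.0 = saturated)
# per node. These names and numbers are illustrative only.
cluster = {"node-a": 0.72, "node-b": 0.15, "node-c": 0.40}

def pick_best_node(nodes):
    """Return the node with the lowest current load -- the 'best' CPU
    that the cluster management software would run your code on."""
    return min(nodes, key=nodes.get)

print(pick_best_node(cluster))  # -> node-b
```

The user never sees this choice happen, which is exactly the point of the quote: the cluster presents itself as a single larger computer.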
Then he moves on to Grids and identifies some of the organizational barriers that prevented widespread commercial adoption (which is what is driving all the Sturm und Drang over the Cloud these days; otherwise it’s basically HPC):
The analogy used was of the electricity grid where users could plug into the grid and use a metered utility service. If companies don’t have their own power stations, but rather access a third party electricity supply, why can’t the same apply to computing resources?
…But, more important than these technical limitations, was the lack of business buy-in. The nature of Grid/Cloud computing means a business has to migrate its applications and data to a third party solution. This creates huge barriers to uptake.
…The other bridge that had to be crossed was that of data security and confidentiality. For many businesses their data is the most sensitive, business critical thing they possess. To hand this over to a third party was simply not going to happen.
Drawing on Jim Gray’s 2003 paper on the economics of distributed computing…
The recurrent theme of this analysis is that “On Demand” computing is only economical for very CPU-intensive (100,000 instructions per byte or a CPU-day per gigabyte of network traffic) applications. Pre-provisioned computing is likely to be more economical for most applications – especially data-intensive ones.
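The break-even figure in that quote can be roughly reconstructed from the unit prices Gray used. The dollar figures below are assumptions drawn from memory of the 2003 paper (on the order of $1 per gigabyte of WAN traffic and $1 per CPU-day on a ~2 GHz processor); treat them as illustrative:

```python
# Assumed 2003-era prices: ~$1 buys a gigabyte of WAN traffic,
# and ~$1 buys a day of time on a ~2 GHz CPU (roughly one
# instruction per cycle). Illustrative figures, not exact quotes.
cpu_hz = 2e9
seconds_per_day = 86_400
instructions_per_dollar = cpu_hz * seconds_per_day  # one CPU-day
bytes_per_dollar = 1e9                              # one gigabyte shipped

# Break-even compute intensity: how many instructions each byte you
# ship must save you before remote computation beats local.
break_even = instructions_per_dollar / bytes_per_dollar
print(f"{break_even:,.0f} instructions per byte")
```

Under these assumptions the result lands in the low hundreds of thousands of instructions per byte, the same order of magnitude as the 100,000 figure in the quote: unless an application does that much work per byte moved, it is cheaper to compute where the data already lives.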
…Wallis paints a gloomy picture
When Jim published this paper the fastest Supercomputers were operating at a speed of 36 TFLOPS. A new Blue Gene/Q is planned for 2010-2012 which will operate at 10,000 TFLOPS, outstripping Moore’s law by a factor of 10. Telecom prices have fallen and bandwidth has increased, but more slowly than processing power, leaving the economics worse than in 2003.
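The “factor of 10” claim is easy to sanity-check. Assuming Moore’s-law doubling every 18 months (a common reading; a 24-month doubling period would change the result) and taking 2011 as the midpoint of the quoted window:

```python
# Sanity check on the quote's 'factor of 10' claim; the doubling
# period and target year are assumptions, not from the article.
start_tflops = 36        # fastest supercomputer in 2003 (Earth Simulator era)
target_tflops = 10_000   # projected Blue Gene/Q
years = 2011 - 2003      # midpoint of the quoted 2010-2012 window

moore_projection = start_tflops * 2 ** (years * 12 / 18)
factor_ahead = target_tflops / moore_projection
print(f"Moore's-law projection: {moore_projection:,.0f} TFLOPS")
print(f"Blue Gene/Q ahead by a factor of {factor_ahead:.1f}")
```

With these assumptions the projection comes out around 1,500 TFLOPS and the Blue Gene/Q target exceeds it by a factor of roughly seven, the same order as the quoted factor of 10. Either way, the point stands: compute is outrunning bandwidth, which worsens the economics of shipping data to remote CPUs.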
Even if the telecom issues can be overcome, we’ll still need a multi-tiered internet (where people with big needs and deep pockets pay for faster service than you and I get for loading Slashdot), and businesses will have to be convinced that this is all really a Good Idea against a backdrop of recent failures at Amazon and the cutting of undersea cables in the Gulf.
The article is a good counterweight to the overly positive portrayal of the potential of cloud computing in the IT press, and is worth a read for an understanding of the challenges that big businesses will face in selling cloud computing to other big businesses.
There are two points the article doesn’t address with respect to HPC. At the low end, HPC customers aren’t migrating an existing infrastructure; they are gaining access to resources they did not previously have. At the high end, there is still the potential that large government and research-oriented facilities will be forced to outsource their HPC hosting based solely on the economic and institutional problems (read: government regulations) associated with building out $20M-$50M of HPC support infrastructure every couple of years.