"…and in the darkness bind them."


From an IBM TJ Watson research paper on what else you could do with a Blue Gene:

Project Kittyhawk’s goal is to explore the construction and implications of a global-scale shared computer capable of hosting the entire Internet as an application.

Jiminy.

Ashlee Vance at The Register ran across an IBM research paper on Project Kittyhawk, an initiative to hammer the Blue Gene/P into a system that can host Web applications at a scale large enough to host the entire internet. Nicholas Carr also picked up the thread and commented on the paper.

The project addresses what IBM sees as fundamental flaws in the “Google” model of hosting internet-scale applications on commodity PCs:

At present, almost all of the companies operating at web-scale are using clusters of commodity computers, an approach that we postulate is akin to building a power plant from a collection of portable generators. That is, commodity computers were never designed to be efficient at scale, so while each server seems like a low-price part in isolation, the cluster in aggregate is expensive to purchase, power and cool in addition to being failure-prone.

…We postulate that efficient, balanced machines with high-performance internal networks such as Blue Gene are not only significantly better choices for web-scale companies but can form the building blocks of one global-scale shared computer. Such a computer would be capable of hosting not only individual web-scale workloads but the entire Internet.

According to the paper, early results are “promising.”

Why do you care? Well, it’s the whole internet for crying out loud.

But if that isn’t enough, then consider this. Let’s assume it isn’t the whole internet, but it is really, really big. Lots of companies would use it (or it wouldn’t be really, really big). As companies move their hosted web-facing applications to this environment, they will have a built-in solution provider for their SMB HPC needs. And since this machine started life as a scientific supercomputer, a system like this could also serve as a nucleation site for the condensation of small- and medium-scale technical HPC requirements.

A problem with the Google cluster is that using it for technical HPC typically means re-imagining the scientific applications that need to run on it. Not that this is a bad thing — the community needs to re-imagine its applications for million-core computers anyway — but we haven’t had the impetus to do it yet.
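To make that concrete, here is a toy sketch (mine, not the paper’s) of what the re-imagining tends to look like: instead of a tightly coupled solver that leans on a fast interconnect, the work gets recast as a pile of independent, stateless tasks that are mapped across nodes and reduced back, Monte Carlo style. The multiprocessing Pool here is just a local stand-in for whatever map/reduce layer a commodity cluster would actually provide.

```python
# Toy sketch: a Monte Carlo estimate of pi expressed as independent tasks.
# Each batch carries its own seed, shares no state, and could be retried on
# any node that fails -- the shape of application the commodity-cluster model
# favors, as opposed to a tightly coupled, message-passing solver.
import random
from multiprocessing import Pool  # stand-in for a cluster's map/reduce layer

def sample_batch(args):
    seed, n = args
    rng = random.Random(seed)
    # Count samples that land inside the unit quarter-circle.
    return sum(1 for _ in range(n)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)

if __name__ == "__main__":
    tasks = [(seed, 100_000) for seed in range(32)]   # "map" phase inputs
    with Pool() as pool:
        hits = pool.map(sample_batch, tasks)          # run batches independently
    total = sum(n for _, n in tasks)
    print("pi ~", 4 * sum(hits) / total)              # "reduce" phase
```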

Trackbacks

  1. […] and the IBM researchers are looking to apply HPC scale resources to the problems of hosting the web here.  This might be an interesting re-incarnation of the Big Friggin’ Web Tone Switch that […]


Comments

  1. What is interesting to me in the original Register article is how the IBM people noticed that programming for the web with stateless tools such as HTTP is in fact (implicitly) parallel programming, because each little request gets served independently.
    However, clusters _are_ designed to run many smaller jobs efficiently, whereas large SMP machines are best with large applications. Of course SMPs can often outperform a cluster for small jobs as well, but their high price usually convinces people to use clusters for their smaller applications and buy SMPs for the big stuff.

    -marek


    clusteradmin.net :: blog about building and administering clusters