A possible "in" for virtualization in HPC

Server virtualization is not an idea whose time has come to HPC. One of the big economic motivations that have driven the adoption of virtualization in the enterprise, server consolidation, simply doesn’t apply in HPC, where utilization is typically in the high 80s or above. Timothy Prickett Morgan agrees wholeheartedly with this in an article at The Register, in which he also points out that there is at least the potential for some adoption on the basis of configuration flexibility and manageability:

It is possible that HPC shops wanting to run distinct applications on different flavors of Linux or a mix of Linux and Windows will use hypervisors to allow for this configuring and reconfiguring more easily. But plenty of people are skeptical of the idea.

“The biggest use of virtualization is to allow multiple applications to run protected,” explains David Scott, petascale product line architect in Intel’s HPC platform unit. “This is potentially an area. Customers are thinking about it, but no one has done it yet.”

I agree that there is a potential for this, but we aren’t seeing any demand at all in the day job (very high end scientific HPC center), and I suspect that if I proposed it I would have to sit in the penalty box for a week.


  1. Jason Riedy says

    Of course one of the standard HPC centers would dump on this idea. The major centers all are based on government-money-scale machines. Those are from a single vendor, and that vendor must extract as much money from support on these one-off machines as possible. Thus they must reduce their own support costs, and that implies making the machine as uniform as possible.

    A *computation* center serving mid-scale users (32-128 nodes) that are refugees from the major HPC centers might be a better target. Users could state the OS, etc. their code already runs on, and the center can try to provide it. They can have minimal-cost migration from the HPC centers. The centers demand larger and larger jobs for access, but many users have fixed-size jobs that they want to run faster.

    Plus, virtualization may allow non-forklift upgrades. Swapping in new machines may not require *any* user-visible changes for existing code and platforms. New code/platforms can take advantage of new features.

    The deadly part about virtualization is the added (and *variable*) latency. Bandwidth is manageable (RDMA passes through cleanly), but the extra layer of system calls will hurt latency-sensitive codes. Latency-tolerant codes may work well, but performance tuning will be horribly difficult. So users more interested in short time-to-solution (a combination of short development cycles and faster hardware cycles) could benefit. Again, these are not the target of the govt HPC centers, where long development cycles and Herculean tuning efforts reign.