Shai Fultheim: The parallels between network, storage and server virtualization


Some people see virtualization as a hindrance to performance with no place in HPC. Other people see it as the future of accessibility in HPC, opening up new modes of parallel computing for a non-traditional audience. Either way, it always gets a lot of attention from insideHPC’s readers. So when Shai Fultheim, the CEO of HPC virtualization company ScaleMP, offered to write a post providing some background on the major virtualization approaches and their potential benefits for HPC, we took him up on it.

Think he’s right? Think virtualization is a waste of time in HPC? Sound off by leaving a comment at the end of the article! (Disclosure: we’ve run articles with Shai before, and always appreciate his perspective. This is not a paid piece, but you should know that ScaleMP is advertising at insideHPC at the time of this writing.)

Wikipedia defines virtualization as “a broad term that refers to the abstraction of computer resources.” Another definition, also from Wikipedia, is “a technique for hiding the physical characteristics of computing resources from the way in which other systems, applications or end users interact with these resources.” There are many types of virtualization, each focused on a different computing resource: server, network, and storage. Virtualization is becoming a key technology for the evolving IT landscape and, as cloud computing emerges, promises to be a critical component in making it a reality. In this article, I will focus on virtualization techniques associated with partitioning and aggregation of computing resources. There are remarkable parallels between the network, storage, and server virtualization paradigms. I will conclude by discussing the rising trajectory of the emerging server “aggregation” virtualization technology.

Storage Virtualization

Storage virtualization refers to the process of abstracting logical storage from physical storage. The two key types of storage virtualization are partitioning and aggregation.

Partitioning is the logical division of a disk subsystem so that data can be isolated in separate segments (partitions). These partitions are logical storage units distinct from the physical storage units within the array. There are a number of benefits to storage partitioning. First, it delivers significant business value and ROI since managing a consolidated, single large physical resource enables improved efficiencies, which lowers storage investment. Second, it delivers more efficient storage utilization, as there is no need for extra capacity on a separated, dedicated storage device for each server; available capacity is allocated to servers as needed. It also provides more efficient storage management, as administrators spend less time and money managing storage with fewer disk systems needed to support many servers. Finally, storage partitioning offers improved storage flexibility by de-coupling servers and storage: administrators can easily add storage, new servers, or retire older servers without storage unload/reload or downtime.
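To make the partitioning idea concrete, here is a minimal, purely illustrative sketch. All class and method names are invented, and no real storage stack works this simply: one physical store is carved into isolated logical slices, each of which a server sees as its own device starting at address 0.

```python
# Toy illustration of storage partitioning (invented names, not a real
# storage stack): one physical store, several isolated logical partitions.
class PhysicalStore:
    def __init__(self, size):
        self.blocks = bytearray(size)

class Partition:
    """A logical slice of the physical store; addresses are partition-relative."""
    def __init__(self, store, offset, size):
        self.store, self.offset, self.size = store, offset, size

    def write(self, addr, data):
        if addr + len(data) > self.size:
            raise ValueError("write past partition boundary")  # isolation
        start = self.offset + addr
        self.store.blocks[start:start + len(data)] = data

    def read(self, addr, n):
        start = self.offset + addr
        return bytes(self.store.blocks[start:start + n])

store = PhysicalStore(1024)
p1 = Partition(store, 0, 512)      # "server A" sees only this slice
p2 = Partition(store, 512, 512)    # "server B" sees only this slice
p1.write(0, b"server-A")
p2.write(0, b"server-B")
print(p1.read(0, 8))  # b'server-A' -- the partitions do not see each other
```

The boundary check in `write` is the whole point of the sketch: each consumer addresses its own logical unit, while the administrator manages one consolidated physical resource underneath.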

Storage aggregation, on the other hand, pools physical storage from multiple network storage devices or arrays into what appears to be a single “virtual” storage device (or volume) managed from a central console. This approach delivers improved storage flexibility by giving an application access to storage resources beyond the physical boundary of a single array. Underutilized resources in a pool can be reallocated: physical storage associated with one application or host group can be shifted to others based on under- or over-utilization, and available capacity can be easily allocated to servers as needed. Multiple independent storage devices that may be scattered over the network appear as a single monolithic device, which can be managed centrally. This approach also offers better availability through non-disruptive data migration: the ability to migrate data while maintaining concurrent I/O access. Typical uses include moving data off an over-utilized storage device, moving data onto a faster storage device, and migrating data off an older storage device.
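The inverse operation can be sketched the same way. In this hypothetical example (again with invented names, not a real volume manager), several independent devices are pooled behind one linear virtual address space:

```python
# Toy illustration of storage aggregation: many devices, one virtual volume.
class VirtualVolume:
    def __init__(self, devices):
        self.devices = devices  # list of bytearrays, possibly different sizes

    def _locate(self, addr):
        # Map a virtual address onto (backing device, local offset).
        for dev in self.devices:
            if addr < len(dev):
                return dev, addr
            addr -= len(dev)
        raise IndexError("address beyond pooled capacity")

    def write_byte(self, addr, value):
        dev, off = self._locate(addr)
        dev[off] = value

    def read_byte(self, addr):
        dev, off = self._locate(addr)
        return dev[off]

    @property
    def capacity(self):
        return sum(len(d) for d in self.devices)

pool = VirtualVolume([bytearray(100), bytearray(300), bytearray(50)])
pool.write_byte(120, 0xFF)   # transparently lands on the second device
print(pool.capacity)         # 450 -- one volume spanning three devices
```

The `_locate` translation is where the abstraction lives: the consumer addresses one large volume, and devices can sit anywhere on the network behind it.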

Network Virtualization

Similar virtualization techniques are used in networking. Network aggregation is used to consolidate multiple physical network connections into a single virtual link with higher bandwidth or higher reliability. It is also quite common to partition (segment) large network switches with VLANs. The use of VLANs increases flexibility of the underlying infrastructure compared to use of dedicated switches.
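As a rough illustration of the aggregation half, the sketch below (toy names, no real networking) spreads traffic round-robin across a bonded set of member links, which is the basic idea behind link aggregation:

```python
import itertools

# Toy illustration of link aggregation: the caller sees a single link,
# while traffic is distributed round-robin across the physical members.
class BondedLink:
    def __init__(self, links):
        self.links = links
        self._next = itertools.cycle(range(len(links)))

    def send(self, packet):
        i = next(self._next)        # pick the next member link in turn
        self.links[i].append(packet)
        return i

links = [[], [], []]                # three "physical" links
bond = BondedLink(links)
for pkt in range(6):
    bond.send(pkt)
print([len(l) for l in links])      # [2, 2, 2] -- load spread evenly
```

Real bonding drivers add failure detection and other scheduling policies, but the consolidation of many physical links into one logical link is the same.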

Server Virtualization

Server virtualization separates the physical characteristics of the servers from the operating system, application, and users. Similar to storage and networking, the two key types of server virtualization are partitioning and aggregation.

The use of server partitioning has become quite common in the past few years, and is the darling of enterprise IT right now. With this approach, a single physical server appears to function as multiple logical (virtual) servers. This approach delivers significant business value, since consolidation enables improved efficiencies, lowering the number of servers required and also hardware maintenance costs. It also offers more efficient space and power utilization as a result of server consolidation, and can result in improved flexibility: one can deploy multiple operating systems (Windows, Linux, etc.) on a single hardware platform. Finally, it provides lower management costs by giving each application its own “virtual server,” preventing one application from impacting another when upgrades or changes are made. A standard virtual server build can be easily duplicated, speeding up server deployment.

Using virtualization for aggregating multiple servers — abstracting them as a single logical system — was on the virtualization wish list for many users for a long time. Server aggregation delivers significant business value and ROI by allowing scaling of applications beyond the physical boundaries of a machine, and enables running of larger jobs that require more computer resources (i.e., cores and memory). Systems aggregated from standard servers can be significantly more cost effective than traditional larger servers, delivering lower acquisition costs and lower maintenance costs.

Aggregating systems reduces fragmentation of resources, and also allows the use of the latest generation of “cooler” and more efficient processors, resulting in improved space and power utilization. It also lowers the management costs of clusters, grids, and clouds by allowing administrators to install and maintain a single operating system image. The complexity of high-speed networks (such as InfiniBand) is masked, and the need for a cluster file system and complex storage management is eliminated. This approach delivers improved flexibility and versatility: it can run both parallel and multi-threaded codes (OpenMP) with excellent performance, and it enables large core count, large memory jobs. Finally, it delivers faster and more cost-effective options for “scratch” storage, which can be aggregated from multiple systems; fast “scratch” storage is a critical element of performance in HPC applications.
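The programming-model point can be illustrated with a toy shared-memory example. Because every thread sees one address space, no explicit data movement appears in the code; this is illustrative only, and Python threads are of course not how one would write a performance-critical HPC kernel:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy shared-memory parallelism: all workers read the same array directly,
# with no send/receive calls -- the model that server aggregation preserves.
data = list(range(1_000_000))

def partial_sum(lo, hi):
    return sum(data[lo:hi])    # reads shared memory, no explicit messaging

with ThreadPoolExecutor(max_workers=4) as pool:
    chunk = len(data) // 4
    futures = [pool.submit(partial_sum, i * chunk, (i + 1) * chunk)
               for i in range(4)]
    total = sum(f.result() for f in futures)

print(total == sum(range(1_000_000)))  # True
```

On a distributed-memory cluster the same computation would require explicit partitioning and message passing; under aggregation the single system image lets the threaded version run unchanged.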

As seen from the above descriptions, there are significant similarities in the value propositions when you compare the storage and network virtualization paradigms to server virtualization in both partitioning and aggregation. The potential of aggregation in server virtualization promises no less a transformation than what it delivers in storage and networking. Aggregation virtualization is a revolutionary change in paradigm and promises to have a significant impact on the industry in both storage and server segments.




  1. Server partitioning to improve efficiency makes little sense for HPC when there are no physical resources to consolidate. Actually there are (due to memory and network latencies), but not at the granularity of current virtualization implementations. Socially, throughput-optimized systems are a tough sell in HPC: users don’t want to increase their wall time by an order of magnitude, even if everybody wins in the end.

    Virtualization could be useful for application packaging; it would allow ISVs to solve dependency issues, licensing, testing, etc.

    Server aggregation (ScaleMP’s core product) is like automatic parallelization: it’s a long-running joke that gets periodically rewritten. It used to be called DSM (Distributed Shared Memory) a decade ago. Nothing much has changed since, so what we learned from DSM still applies: if you care about memory locality, you need to express it with a message-passing paradigm; otherwise your performance sucks. HPC applications usually care a lot about locality, otherwise everybody would use Grids. Sorry, Clouds.

  2. @Patrick

    [disclaimer: we have no current business relationship with ScaleMP]

    DSM actually worked quite well, as long as you had threads that paid attention to their own processor and memory affinity. Message passing was then handled implicitly by the DSM, not explicitly. In those cases performance did not “suck”; on the contrary, it was quite good for some codes. Yes, you had to go through some effort to do this, but far less effort than explicitly recoding in a message-passing paradigm, IMO.

    Part of the reason for the effective passing of DSM from the scientific computing community was the cost of DSM versus its benefits as compared to the alternatives. Once you could reduce the cost of the distributed memory version, the shared memory version required a far higher-priced machine to run on; at that point the writing was on the wall, as it were, about the future of DSM machines. Indeed, you now find large DSM machines in a dwindling number of locations.

    Distributed memory itself, as a paradigm expressed with message-passing models, is not a panacea, and it has a number of fairly important problems that haven’t been adequately addressed or solved to date: fault tolerance and the ability to withstand a rank failure, checkpointing, etc. No solution is perfect, and you pay a price for every solution, in terms of the real cost of effort to use it.

    This said, DSM and ScaleMP can solve problems that are not so easy to express in terms of distributed memory. The major ones are very large memory problems that are not economical to attack with distributed memory, due to the time/effort/economic cost of rewriting the application.

    What is the cost to purchase a single machine with 1024 GB of RAM versus the cost to purchase 8 machines with 128 GB each? The next question in this line of reasoning is: what is the cost of rewriting and revalidating the application in a distributed memory, message-passing paradigm versus using vSMP to aggregate the machines together into a large single machine with many processors and a single large memory space, requiring no recoding to make use of it?

    Yes, you can run MPI codes without vSMP. That’s not where we think it is targeted. We believe vSMP is aimed at providing a large aggregated machine where you don’t have to pre-pay the cost of that aggregation in hardware terms. This keeps the cost of aggregating down, at the expense of losing an optimized hardware interconnect.

    It lets you create large memory systems for running your threaded applications. Not necessarily in parallel.

    Shai and his team might wish to respond as well, but as I noted, my thoughts are that the distributed memory versus shared memory debates are basically over, and have been for a while. With the smaller machines rapidly evolving into multi-core monsters with small memory (100 GB or less) and huge numbers of processors, vSMP provides a way to get to 1000 GB or more of RAM in an economical manner.

  3. Joe,

    Right to the point. vSMP Foundation (or compute aggregation) is used by customers to get the following 3 main benefits:

    – An alternative to traditional SMP systems required for VERY LARGE memory or VERY HIGH core counts. vSMP Foundation is the most economical way to run genomics apps requiring TBs of RAM. It is also the most economical way to run threaded apps requiring 64/128/256 cores.

    – Cluster management. No doubt distributed apps are more common than shared-memory ones. Still, there is no reason that should force the end user to run a cluster. There are a few who enjoy playing with the technology (including myself), but in general, virtualizing a cluster as a shared-memory system removes many of the cluster complexities (scheduling, interconnect management, clustered storage, etc.).

    – Clouds / grids. More organizations are interested in infrastructure where x00 to x000 nodes can be managed dynamically, including creating resources with memory or compute pools larger than a single physical box “on the fly” for a limited time.

    All of the above might be possible by other methods… still, I’m convinced virtualized, aggregated compute is the right way to go.

    Patrick, feel free to drop me a note… I’ll be happy to have a more serious discussion. Try me: Shai at ScaleMP dot com.

  4. Commoditization of server architectures has clearly impacted SMP approaches, due to SMP’s high hardware and setup cost basis and the fact that SMP clusters are dedicated systems. As pointed out, some applications are well suited to a shared memory (SMP) model, but a much broader set of problems is addressed by alternatives such as memory virtualization, which makes memory a shared network resource available to all nodes in the data center. In the context of virtualization for HPC, what is interesting is the ability to dynamically construct or provision a shared memory environment within a cluster, based on application requirements. One true benefit of virtualization in the context of an HPC cluster is the ability to support either a shared or distributed memory model, dynamically, and without sacrificing performance and scale.