Q&A with HPC virtualization software maker ScaleMP

Businesses, especially the HPC business, are in a constant cycle of destruction and creation. A market stabilizes (sometimes only briefly) and then is abruptly tilted into turmoil by some new dynamic in the customer base, or by the introduction of some new technology. HPC is certainly in a time of change right now on both the technology and business fronts, as the ailing economy pushes marginal businesses over the edge into bankruptcy. At insideHPC we’ve been on the lookout for the companies that might ascend as a result of these most recent market changes.

This week we had a chance to talk with Shai Fultheim, the CEO of HPC virtualization software maker ScaleMP, about his company, its technologies, and why this particular moment of change is turning into such a boon for them.

insideHPC: Give us a little background on ScaleMP. What do you do?

Shai Fultheim: ScaleMP provides virtualization solutions for high-end computing. Its virtualization solution combines multiple x86 systems into a single virtual machine (VM), aggregating the CPUs, memory and I/O of all the physical machines, resulting in a large-memory, high-core-count virtual SMP system (think “reverse VMware”). Using software to replace custom hardware and components, ScaleMP offers a new, revolutionary computing paradigm. vSMP Foundation is a software-only solution that eliminates the need for extensive R&D or proprietary hardware components in developing high-end systems, thus reducing the overall solution cost. vSMP Foundation can also be used in conjunction with clusters to reduce cluster operational expenditures.

vSMP Foundation aggregates up to 16 x86 systems to create a single system with 4 to 32 processors (128 cores) and up to 4 TB of shared memory. It is available for server offerings of Appro, Cray, Dell, HP, IBM, Intel, Sun and Supermicro.
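The maximum configuration quoted above implies a particular per-node layout. The following is a rough back-of-the-envelope check, assuming the processors and memory are spread evenly across the 16 systems (the interview does not state this); the variable names are illustrative only.

```python
# Maximum configuration as quoted in the interview
nodes = 16          # x86 systems aggregated into one virtual SMP
processors = 32     # total processors (sockets)
cores = 128         # total cores
memory_tb = 4       # total shared memory in TB

# Assumption: resources are divided evenly across the nodes
sockets_per_node = processors // nodes            # 2
cores_per_socket = cores // processors            # 4
memory_gb_per_node = memory_tb * 1024 // nodes    # 256

print(sockets_per_node, cores_per_socket, memory_gb_per_node)
```

Under that assumption, the top-end system corresponds to sixteen two-socket, quad-core servers with 256 GB of RAM each.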

insideHPC: With market uncertainty around solutions from the new SGI, and especially the availability of the planned next generation Xeon-based shared memory system (UV), many HPC customers are in a bind with respect to addressing their HPC requirements, particularly for large core count x86 systems with tons of memory. Is ScaleMP able to take advantage of the opportunity presented by this?

Fultheim: Yes, and in fact, ScaleMP already has customers that have been running 128-core Xeon systems for over a year. Our virtualization solution, vSMP Foundation, provides the largest and the fastest Xeon system today with 128 cores and 4 TB of RAM. It will be expanded in the very near future to support 1024 cores and 64 TB of memory. This virtual SMP solution is an excellent choice for any HPC user, and has outperformed existing SMP systems, while keeping price points well below these systems. I expect that price gap will remain compared to the future SGI UV or similar systems.

When you think about it, when it comes to scalable x86 systems that really address HPC requirements, our solution is the only solution in the market today. Customers should be careful when selecting solutions that are not yet available and have uncertain delivery schedules.

We continue to get inquiries from customers who had committed to SGI, but are now worried that they will not be able to meet the planned timelines, and are looking for ways out of their predicament.

insideHPC: Is ScaleMP’s ability to deliver shared memory systems for customers only a good option if SGI (or someone else) isn’t building hardware-enabled shared memory, or are there advantages such that it makes sense even if the new SGI continues with its Altix/UV roadmap? Is there a sense that the support that chip makers are building in for virtualization technology will deliver performance benefits for the “reverse virtualization” approach too? In other words, will future chip features to support virtualization make your solution even more viable?

Fultheim: In simple terms, “The IT world is virtualizing itself, and this direction will continue!”

Virtualization allows customers lots of freedom: our customers can choose to purchase 128-core shared memory systems from Dell, IBM, HP, Sun, Supermicro, Cray, Appro and others. Virtualization allows increased flexibility: customers can start small by just connecting two systems to get a four-socket Nehalem solution and grow over time. Customers can decide to scale only the memory of the system without making a significant investment in processors. Lastly, I would say that virtualization allows customers to always be on current technologies in that supporting the latest generation of processors requires only a software change rather than an entire system change.

There are 3 components to our solution, where the performance of one of them is driven by ScaleMP and the other two are driven by global IT trends:

  1. Intel is reacting fast to the (enterprise) virtualization market. We are seeing a significant increase in the performance and feature set of Intel processors.
  2. The performance of high-bandwidth, low-latency interconnects, such as InfiniBand, will continue to improve. InfiniBand provides better performance characteristics than the majority of proprietary interconnect fabrics. This trend will continue in the future.
  3. Lastly, vSMP Foundation’s advanced caching technology is always progressing, and has enabled us to win performance benchmarks against machines such as SGI Altix for the past 3 years.

These 3 trends will remain, promising that our virtualization solution will continue to deliver superior performance compared to traditional SMP systems.

Keep in mind that a significant number of our end-users are actually using the technology as a way to upgrade small to medium-size clusters to virtual SMPs. These end-users benefit from a significant simplification of their cluster infrastructure, with fewer and larger compute nodes, as well as a reduction in cost resulting from the use of internal drives rather than clustered storage. Today’s SMP solutions (as well as future products, like UV) are not addressing this market segment, hence these benefits are available only through virtualization solutions.

insideHPC: What is your perspective regarding Nehalem and how do you see its impact for HPC customers and the industry?

Fultheim: Nehalem is part of a complete system architecture that has a couple of interesting promises for HPC customers.

First, it brings NUMA solutions to the mainstream x86 architecture! It means that ISVs need to plan for parallel scalability that is NUMA-aware. We have long been saying that this is the most cost-effective way to scale systems, even in the x86 space. vSMP Foundation further expands Nehalem’s NUMA system architecture with aggressive caching that improves the overall performance of the solution.

Secondly, we are expecting to see larger x86 systems in the 6-to-12-month time frame, which can be excellent building blocks for even larger solutions leveraging our technology. When you aggregate 16 systems, each with 4 sockets and future capabilities of 8 cores and 16 threads per socket, you will get lots of processing power.

Lastly, the Nehalem story is also about improved I/O and PCI-express performance. This is paramount to efficient interconnect performance, and is the main reason for the improved overall performance we are seeing with vSMP Foundation on Nehalem deployments. We have evidence that, for applications on a single-system level (2 sockets), Nehalem shows only a modest performance improvement of 5 to 10 percent, but when leveraging vSMP Foundation to scale the solution from 1 to 8 systems (total of 16 sockets), we have seen about 30% performance improvement compared to deployments with previous generation systems. This is huge.
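To put a number on the aggregation example in this answer (16 systems, each with 4 sockets, and 8 cores and 16 threads per socket), here is a minimal arithmetic sketch. It assumes the current 16-system limit still applies to those future servers; the variable names are illustrative.

```python
# Figures taken from the answer above; future-system values are as quoted
systems = 16
sockets_per_system = 4
cores_per_socket = 8
threads_per_socket = 16

total_sockets = systems * sockets_per_system        # 64
total_cores = total_sockets * cores_per_socket      # 512
total_threads = total_sockets * threads_per_socket  # 1024

print(total_sockets, total_cores, total_threads)
```

That works out to 512 cores and 1024 hardware threads in a single virtual system, four times the 128-core maximum cited earlier in the interview.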

insideHPC: What is your perspective regarding the economic situation we are in now, and how it is impacting the buying behavior in the HPC market segment? Is the new administration’s support for science and engineering, and the stimulus package, having an impact on your business?

Fultheim: The macroeconomic situation has impacted the HPC segment negatively since the 4th quarter of 2008, and 2009 continues to be weak. What we are noticing is that the commercial segments have been hit harder and we are seeing many organizations trying to reduce the CAPEX and OPEX of HPC projects, which leads them to seek more cost-efficient solutions. On the other hand, we are seeing that the public sector (higher-ed, government) is more resilient, primarily due to the stimulus package, which will have a positive impact on the HPC business in the second half of 2009. Many projects are in the pipeline with grants and applications for this stimulus money, which is expected to start flowing in the second half of the year.

In addition to demand growth from shared-memory customers, we are seeing significant momentum with our offering for cluster management. Many organizations, specifically in the public sector, are seeing increased benefits in running fat-node clusters rather than traditional clusters. With the increased requirements for faster deployment, large memory jobs, and ease of management and use, more IT organizations approach us interested in deploying vSMP Foundation on their HPC clusters for management and flexibility.

insideHPC: Your product is available from Dell, HP, Sun, IBM and recently Cray. What’s next?

Fultheim: From a vendor perspective, we had a good partnership with SGI in the past, and I am hoping that with new management and a new business focus we will be able to see SGI offering our solutions to its customers again. I believe that the strong expertise of SGI in shared-memory systems, coupled with Rackable’s x86 product-line excellence and our software solution, will provide customers with more choices in scalable x86 solutions.

We are partnering with several of the Tier-1 vendors to offer more customized and integrated solutions into their product portfolio. This will be announced in the future, so stay tuned.

On the product side, we continue to focus on enhancing our product to help our customers meet their ever-increasing HPC requirements. A few upcoming enhancements worth noting are:

  1. In the short term, our engineers are working on enhancing vSMP Foundation Direct Connect capabilities. Direct Connect 2 (DC2) will allow connecting up to four Nehalem systems without the need for an InfiniBand switch. This would allow support for an 8-socket Nehalem system with 192 GB of RAM for under $40K. This would be a very attractive entry-level HPC solution.
  2. We are also working with Intel towards supporting Intel Nehalem-EX systems, which in conjunction with DC2 will allow larger shared memory systems with even more cores than are available today.
  3. In addition, we will also be expanding vSMP Foundation to support more than 16 nodes, allowing creation of shared memory systems with up to 64 TB of RAM and 1024 cores (or more). This will address customer requirements for even larger systems than today.

insideHPC: There is a lot of buzz around cloud computing. How do you see the cloud picture emerging? Is ScaleMP involved in this space?

Fultheim: We are finally seeing cloud emerging as a serious computing alternative in the enterprise segment. Amazon’s success in this segment is proof of that. Today’s clouds are optimized for enterprise apps, and not so tuned for HPC. HPC clouds require flexibility in memory and compute capabilities, where enterprise clouds are falling short.

Here is where the virtualization “aggregation” paradigm pioneered by ScaleMP comes in. With vSMP Foundation, cloud vendors can build compute infrastructure required for HPC on the fly using standard server building blocks, and this in our humble opinion will be one of the important components of making the HPC cloud a reality. We are partnering with cloud computing providers to allow them dynamic provisioning of large memory / high core-count virtual systems.

Comments

  1. So Nehalem “brings NUMA solutions to the mainstream x86 architecture” does it ? Either been living under a rock these past 6 years or he’s been drinking too much of the Intel koolaid..

    Time to wheel on Linus’s quote from 2004 when Intel announced their AMD compatible 64-bit architecture..

    Actually, I’m a bit disgusted at Intel for not even _mentioning_ AMD in their documentation or their releases, so I’d almost be inclined to rename the thing as “AMD64” just to give credit where credit is due. However, it’s just not worth the pain and confusion.

    Intel’s HyperTransport, er, QPI, is nothing new and I’m really puzzled why he would think it was different..

  2. Robin Clash says

    Chris,

    Your comment about QPI has no relevance to the article, which is about building large shared memory systems from COTS servers.

  3. My comment about QPI was about the interviewee saying that *Intel* were bringing NUMA to mainline x86 for the first time through Nehalem, completely missing the fact that NUMA-based Opteron systems have been there for 6 years.

    Re-reading I suppose it might mean that it’s bringing NUMA to *Intel’s* mainstream products, and if that’s the case I apologise, but it didn’t come across that way.

    I’m well aware of what ScaleMP make. 🙂

  4. Robin Clash says

    I think the latter, but your point about Opteron is of course correct.

  5. This all sounds great, but ask Shai how one can benchmark a system. I’d be curious to see what ScaleMP tells others about requests for customer benchmarking (e.g., NOT done by ScaleMP but by actual customers).

  6. Robin Clash says

    FJW you sound skeptical

  7. John West says

    FJW – I’ll ask Shai and we’ll hopefully get him to post a comment here.

  8. Chris,

    You are right – AMD was there first. Unfortunately, in order to be “mainstream” in x86, due to its market share, Intel needs to play as well.

    Therefore *Intel* “brings NUMA solutions to the mainstream x86 architecture”. AMD has for years been offering us all excellent NUMA solutions, and was the first to embrace it. Intel’s recent adoption of NUMA architecture brings it to the mainstream, and we can expect broader ISV acceptance (NUMA awareness, optimizations, etc.).

  9. FJW,

    *Customers* (i.e. people/organizations that pay money to use the product) have successfully benchmarked our product and technology many times. It has passed many acceptance tests and is deployed all over the world (3 digits for the number of different sites).

    I will be happy to provide you with some customer benchmark data (send email to Shai at ScaleMP dot com).

    So – the answer is that you need to buy it to “benchmark” it. A small-scale deployment costs much less than what a serious prospect would invest in benchmarking (in man-days).

    P.S.
    ScaleMP does not engage with “folks who are interested in testing-only”, as we have a (successful) business to run, and focus our resources on revenue-generating activities. I am aware of some other companies that are more open with “testing-only” scenarios; unlike me, I guess they enjoy the chapter 11 rollercoaster 😉

  10. Shai,

    That’s pretty much what I expected you to say.

    Good Luck with that.

  11. In defense of Shai’s position, we have found, many … many times, that in “testing” or “try and buy” scenarios, our solutions were being used specifically to ratchet up pressure on the customer’s preferred solution. So while we were told that we were getting a fair evaluation, the reality is that we were being used as a 2×4 to beat up a competitor on price (and performance). In one case, the evaluator even wrote up a white paper on the project, all about testing the competitor’s machine, mentioning ours in passing, and ignoring all of our advice on configuration. Unfortunately, it was represented to me before the evaluation that it would be an open and fair evaluation relative to various vendors. It wasn’t.

    So we have taken a different tack with regard to evaluations. We have engineering/demo units in-house that we use, and in some cases, will loan out to customers. We will not pre-configure our largest possible machine, and send it out with a try and buy. And yes, we were asked to do this in the last two weeks. If we have a unit in the lab being built, we can give access to it for a window for tests. And we do. Most customers who are serious about purchasing units from you have no problem with this. Customers who want to beat up their preferred vendors? They don’t like this so much. Removes the shock-factor when the preferred vendor reps walk through the data center and see your logo sitting where they think theirs should go.

    Shai rightly points out that vendors that have been doing this in the past, are being bought for pennies on the dollar at Chapter 11 auctions, or are being forced to sell themselves to others to continue operations. There is a reason why … bad business practices do not make for long term surviving companies.

    We (Scalable Informatics) have on our plans to purchase a license soon, and like with our other units, may open this up to specific customer testing. Don’t beat Shai up for running with sound business practices. Do beat up the companies that lend gear out with no hope of recouping the cost of that gear. Those are companies that won’t be around in short order.

  12. Joe,

    With all due respect, I’m not “beating” up anyone. I’m simply stating that I believe the customer (and btw, my pockets are not empty) should be able to independently verify results. Whether that is through a loaner, a try and buy, access to existing boxes or consultation with existing trusted sources, I care not.

    What I do take exception to is the thought that anyone that questions results, no matter how extraordinary they are, should be accosted for suggesting that results need to be independently verified. I’d be perfectly happy to let you (hypothetically) propose benchmarks results, then structure a contract that compensates for missing proposed results (or rewards for that matter, if you surpass).

    But what do I know, I just buy, install and run the stuff… I don’t sell anything. I’ll leave that to you smart guys.

  13. @FJW

    Sorry about coming off somewhat strong, this is a bit of a hot button for us. We find that the folks who don’t take us seriously try to use us as a blunt instrument against their favorite vendors, wasting our time and resources, in order to get a better deal from their favorite vendor.

    I don’t think Shai is saying don’t benchmark. I think he is saying that you have to pay for his software if you want to use it for any purpose. That’s sound business IMO. I would like to see a cluster they maintain with their software accessible for testing, but we may have to bear our own hardware costs for this. He is running a lean business, and hardware isn’t cheap.

    We let our customers into our firewall to get access to machines we are building, for their testing. Also we let them in to experiment on machines we are working on, and try to give them a feel for what the performance difference is for what they want.

    By letting the customer into the firewall to beat on a machine, we let them show us what tests are important to them. They help us understand what data they are looking for, and then we try to generate it. It also doesn’t cost us anything apart from a little time, and we gain as much as we spend.

    Shipping a machine is generally possible. However, as noted, some like to use us as a blunt instrument. Which means we waste money/time/effort shipping the machine, we lose on time for alternative uses of the machine. And if we have invested in building the machine rather than having it part of a committed purchase contract, we have to ask ourselves what the risk of potential return is. For more common and lower cost gear, this isn’t nearly the issue it is with some of the very specialized gear, that would be hard to resell, and is of less use for us for development/testing purposes.

    I am all for real benchmarks. Of course, every now and then we get people running ‘benchmarks’ which don’t actually measure what they purport to measure (being say, purely cache bound rather than IO bound) … part of this is working with the people to understand what their real use case is, and testing as close to this as we can, and the other part is to ‘detox’ the FUD/misinformation out there.

    More to the point, we are very open about what we test, how we test, etc. More often than not, customers tell us they get the same numbers to some precision that we have measured. If this is the case, then we have done our job well. Because if they get different numbers, then this is a diagnostic event detection, that means something changed, and usually not the way it should.

    As for structured contracts, allow me a grimace. I have never … ever … seen a reward for exceeding expectations, and have only seen (very one sided) penalties for missing them. I have never seen a government or research/edu institution accept late fees for paying late. We incur *significant* risk when loaning money like this, and there is no reward we can collect to pay for that excess risk. This has been abused multiple times. So we now review every request for credit (requiring applications from everyone who requests credit) very carefully, and offer alternatives with incentives for taking them. All of this is spelled out in our quotes these days. It’s risks like these that eventually drove LNXI under.

    If you’d like to talk about this offline, my email is landman _at_ scalableinformatics _dot_ com.

  14. Joe,

    Thanks for the clarification. I agree with all of what you have written/stated. I also have not seen rewards, but I personally think that if I have a big stick for a vendor, I should at least have an equal sized carrot… but I agree that the sticks in the HPC space greatly outnumber the carrots.

    Cheers