IBM’s HPC Dreams are Tattered


In this special guest feature, Dan Olds from OrionX.net writes that IBM's ambitions in the HPC market have fallen short.


Does IBM have a bright and shiny future in the HPC market? A recent article published in eWeek argues that IBM losing the last three DOE Coral acquisitions is just a ‘minor setback’ and that the company’s future in supercomputing is bright. Here’s the reality that I see today.

I was optimistic about IBM's chances when the company introduced its POWER8 processor in 2014. It had a high clock rate with speeds up to 4.15 GHz, plenty of cores, and high memory bandwidth, handily outperforming competing Intel processors on a variety of benchmarks – by as much as 43% in some cases.

The most important advance in POWER8 was that IBM converted it to Little Endian. This change, while not affecting performance, gave the new POWER chip the same byte ordering as the x86 architecture. Now software written for x86 can run on POWER systems with a simple recompile in most cases and minor code changes in corner cases. Couple this with native Linux on Power systems and you have a game changer – IBM now had a real shot at being competitive with Intel.

The current POWER9 processor, announced in 2016, added more cores (up to 24 per socket) and pumped up performance by an estimated 1.25-1.5x over the preceding POWER8 CPU. Moreover, IBM stuck to its POWER8 pricing philosophy, retaining system price parity with comparable x86-based machines.

IBM’s strategy of opening up the processor for use by other system and component vendors through the OpenPOWER Foundation was also a very good move. This would help propel POWER-based systems to customers that IBM couldn’t easily address and build up an ecosystem much more quickly than IBM could on its own. Back when all of this was starting, I was impressed enough to dub them The Rebel Alliance and was optimistic about their chances to succeed.

Armed with highly advanced HPC systems, plus an open platform, plus the ability to run any open source Linux application, plus price parity with x86 and a wide range of HPC expertise, you’d think that IBM would be able to carve out a significant position for itself in supercomputing, right? Well, not so much.
Momentum? Ummm…..

With a renewed slate of hardware and several powerful partners (NVIDIA, Mellanox, Inspur, etc.), IBM looked poised to roil things up in the HPC server market. However, looking at the latest Top500 list, which is a good proxy to gauge the success/failure of any supercomputing vendor, you’ll see that IBM hasn’t been roiling much of anything.

The author of the piece I referenced in my first paragraph seems to argue that the Top500 isn’t a good measure of HPC success. I disagree. Sure, the top slot changes from time to time and it’s very difficult and financially taxing to remain in the top five for a long period of time, but the rest of the 500 systems are very representative of how a particular vendor is doing in the market as a whole.

Here are some stats from the most recent list (June 2019), broken down by number of systems on the list.


On the most recent list, IBM has a grand total of 15 systems still on the chart. Nine of these were installed in 2014 or earlier, which makes them elderly by HPC standards. Nine of the 15 are also either obsolete BlueGene or POWER7 systems or x86-based iDataPlex boxes that went away when IBM spun off its x86 business to Lenovo (which is doing quite well with it, as you can see from the list). IBM is now the ninth largest Top500 vendor as measured by systems on the list… and it needs to watch out for Penguin Computing.

When you take out IBM’s obsolete Top500 systems, you’re left with: two computers which are the fastest supercomputers in the world (Summit and Sierra); the 10th and 11th placed Lassen and Pangea systems; plus two other POWER9 boxes at positions 247 and 411. If you’re arguing about total capacity, IBM takes the cake with Summit and Sierra, no question about that. But the success of these two titanic systems obviously hasn’t sparked a groundswell of demand for Power-based HPC servers. Why not?

Differentiation, Determination & Drive

Introducing a new architecture into the tech markets is difficult. I would argue that IBM has pretty good technology with POWER9 – definitely good enough to be competitive with x86 alternatives. But good technology isn’t enough.

I believe that IBM is failing when it comes to the most important factors in successfully launching a new technology:

Differentiation: IBM’s products are different from the competition, sure, but IBM hasn’t successfully told that story and proven it to the market. They haven’t done a very good job of trumpeting the advantages of their new processors/systems. In my travels to pretty much every high-performance computing event, I’ve rarely run into anyone who is enthusiastic about IBM’s POWER-based systems or optimistic about their chances to make significant inroads in the HPC market – this is despite their big wins with Summit and Sierra.

Determination: when launching a new technology, it has to be all hands on deck to push it to success. I certainly haven’t seen that from IBM when it comes to POWER. Sometimes it has been hard to tell from IBM’s website that they even still sell hardware. Power systems don’t seem to be all that important to IBM as a whole. Sure, you hear about system revenue during earnings season, but aside from then, Power systems disappear into the background of IBM’s push to cloud computing, quantum computing, AI software, and other initiatives. Why? Maybe because cloud lock-in and blue software stacks seem more profitable than system sales when looked at in isolation. But what IBM fails to see time and again is that system sales enable sales of their other, more profitable, offerings.

Drive: As Scott McNealy said, you have to put “all the wood behind the arrow.” IBM hasn’t done this with its Power-based systems. They should have been deploying their domain experts directly to customers to talk about how Power systems can solve their HPC problems more quickly and with higher accuracy. From personal anecdotes, I know that IBM systems marketing has been starved of funds. It’s as if IBM executives have the attitude “we build a faster processor and the market will come to us.” That simply isn’t true these days. It takes more than a faster processor to win deals – it takes domain expertise, knowledge transfer from vendor to customer, and proof that your solution is superior to the competition. Other vendors, like Intel, NVIDIA, HPE, Cray, and Dell, are doing this. IBM isn’t.

Learn from History or Become History

The company should have taken a page (well, pages) from NVIDIA’s book on how to introduce a new architecture into the market. In 2008 or so, NVIDIA first started publicly talking about using their GPUs as processing accelerators. At the same time, IBM was installing the first petaflop supercomputer (Roadrunner) at Los Alamos and telling the HPC community that its approach of coupling Opteron CPUs with the incredibly hard to program Cell Broadband Engine processors was going to be the ultimate HPC platform for years to come. It wasn’t.

While IBM was pursuing that pipe dream, NVIDIA was busy building an ecosystem for their GPU accelerators. They worked with ISVs and steadily added more and more CUDA-enabled programs, making it pretty easy for customers to use off-the-shelf software with GPU-enabled systems. They also spent a lot of effort proving their performance claims, tirelessly running benchmarks and publicizing the results.

NVIDIA has built a massive business in HPC with their GPU accelerators – products that were once a niche add-on for gaming enthusiasts. They did it by building an ecosystem, proving their value, and then marketing the hell out of the combination. IBM hasn’t done that with Power. As a result, NVIDIA has a highly successful accelerator business in HPC while IBM’s Power systems are barely selling into the Top500 in terms of systems shipped.

In fact, I believe the Arm processor has more momentum than the POWER processor today. There are already a couple of Arm-based systems on the Top500 list and more are sure to come in the near future since Cray now offers Arm-based supercomputers and has announced notable wins, NVIDIA has promised CUDA support for the chip by the end of the year, and Fujitsu continues to impress with its Post-K systems progress.

Without more commitment and support, IBM’s POWER processor runs the risk of becoming this generation’s version of the DEC Alpha CPU. The Alpha was a fantastic processor and ran rings around its competitors, but it ultimately failed because of a lack of marketing and operating environment support from DEC.

This article isn’t making the case that IBM can’t win in the HPC market and build up a big presence on the Top500. However, they have to strongly execute on the Three D’s I discuss above and provide resources for the systems group in order to build up positive momentum.

The danger of puff articles like the one in eWeek isn’t that they fool the world. These types of articles are read by both insiders and outsiders. Knowledgeable HPC customers and observers know that losing the latest DOE deals was a significant setback for IBM. This can’t be glossed over. They also know that IBM’s Power-based systems aren’t setting the HPC world on fire. But company insiders reading an article like this might take it as a signal that all is well and allow complacency to set in.
Losing these deals, plus the state of their HPC market share, should be a wake-up alarm for IBM, not a snooze button.

Dan Olds is an Industry Analyst at OrionX.net. An authority on technology trends and customer sentiment, Dan Olds is a frequently quoted expert in industry and business publications such as The Wall Street Journal, Bloomberg News, Computerworld, eWeek, CIO, and PCWorld. In addition to server, storage, and network technologies, Dan closely follows the Big Data, Cloud, and HPC markets. He writes the HPC Blog on The Register, co-hosts the popular Radio Free HPC podcast, and is the go-to person for the coverage and analysis of the supercomputing industry’s Student Cluster Challenge.


Comments

  1. Barry Graham says

    This is the challenge we face today. News sources that were once seen as highly reliable and credible are now just places where anyone can express an opinion, with actual facts taking a back seat. As for why IBM has slipped in the HPC rankings, the answer is clear when you look at the table above. It’s because they sold most of their HPC capability to Lenovo which, as a result, now occupies nearly 200 of the slots. Power 8 wasn’t up to the job. Power 9 is. I predict IBM will make a comeback.

    • Grumpy Lemur says

      Correct. When IBM divested the x86 business to Lenovo it divested all of its HPC expertise. The marketplace assumed they were now out of the game and for a while the company did nothing to persuade it otherwise. The problem both POWER8 and POWER9 have is that they’re not available in form factors that give them the core density to make them price-competitive against x86 alternatives. What POWER9 is very good at is Machine Learning, through the tight coupling of NVIDIA GPUs with the CPUs and memory, so it is being positioned for the convergence of AI and HPC. Unfortunately the acceleration of HPC codes is still too patchy to make P9 cost-effective for mainstream HPC.

  2. Andrew Melbourne says

    UGH! Really, Dan?? You think the Top500 is a true measure of HPC market success? You lost me the moment you backed a list which is shown to be fraudulent, filled with Chinese Telecommunications customers with no name listed, running ethernet (not for HPC workloads either, so don’t push that angle).

    The Top10 might be an accurate depiction, or even the Top25, but then again it has zero relationship to actual whole-market data – go look at Hyperion or Intersect360 data, as both of them contradict the Top500, yet are based on factual research.

    • Hey Andrew, you make some great points above and I’d certainly argue on your side on several of them. But even with the Hyperion and Intersect360 data (both of which are very good) that I’ve seen (along with my own OrionX survey data), I still don’t see IBM’s Power-based boxes making much of an impact on the market – aside from having their BIG systems, of course.

  3. I was also excited initially by the performance potential of the Power9 systems. The existence of BioBuilds for premade binaries of software packages also seemed like a good bet. But when I got on them and tried to port over my existing genomics workflows I immediately started hitting snags. Basic software packages, even ones offered in BioBuilds, kept having mysterious glitches and crashes. The software developers were unable to help since they weren’t familiar with Power architecture. IBM personnel tried to help but ultimately couldn’t figure out either. It seemed like IBM’s tagline for this was to highlight massive speedups on common tools like GATK and BWA when using the 160 threads offered by Power9 Simultaneous Multithreading. But they seemed to gloss over the fact that standard genomics workflows require far, far more software packages than just those two, some of which tend to be archaic and esoteric, and it’s just not possible to get them to work on Power. Also the fact that in production settings, running one algorithm on one sample with 160 threads is wasteful and not nearly as appealing as e.g. running numerous algorithms on numerous samples with 4-8 threads each; something that Power does not seem to actually be optimized for. Ultimately we had to abandon the notion of using IBM because things we needed didn’t work. And then AMD Epyc came out and I just don’t see the point of Power9 anymore when I could run the software I already have on x86_64 nodes with similar specs on paper.

  4. As the guy who is the “anyone” who supposedly expressed an opinion where “actual facts taking a back seat”, I guess I’ll respond. I’ve been in high tech on the server side since 1994 and been an independent industry analyst since 2001. Having worked for Sequent, Cray/Sun, and, yeah, IBM, I’ve been around the block a few times and have the t-shirts to prove it.
    You should check out the historical Top500 lists and see how many x86 systems IBM had on the list before they sold that unit to Lenovo. You’ll find that Lenovo has added the vast majority of their current Top500 systems after being set free by IBM.
    Power 8 was a highly competitive processor when it came out as was Power 9 when it was introduced in late 2017. But IBM still isn’t making much headway on the list with only six machines that are running Power 8/9 – which was sort of my point.

  5. IBM could have leveraged the AC922s into the HPC visualisation field yet once again made a fatal decision.

    For some unknown reason, IBM decided not to support OpenGL or Vulkan.

    The worst part is that they never informed the clients that had ordered and purchased the AC922s.

    Even their own tech support was unaware!!!

    So instead of large sales of the units to clients that needed that visualisation capability – once again IBM stands for “It’s Been a Mistake”.

  6. Lorenzo Moldar says

    Power should be sold to NVIDIA by IBM. Power can remain a legacy ISA, but an ARM fork of a future Power microarchitecture should be made and become the basis for NVIDIA’s ARM-based HPC. If this sounds strange, consider that Fujitsu’s Post-K ARM HPC was a recycling of a SPARC-based architecture.

    Power has no future. This is elementary. IBM should sell the Power assets to NVIDIA.