Steven J. Vaughan-Nichols at ComputerWorld has an opinion, expressed vigorously in his High-performance Nonsense piece this week:
Microsoft, after spending decades paying no real attention to high-performance computing, wants to be an HPC player with the release of HPC Server 2008. Can you believe it? Yes, there was Windows Compute Cluster Server 2003. After a long search, I found one user. He told me, “Updates that require reboots are far too frequent for production-use systems,” “Jobs randomly crash,” and “Few HPC applications actually support Windows compute nodes.”
Will HPC Server 2008 be any better? I don’t see how it can be, really.
It’s nice to have comments from an actual user. Attributed comments and some details about the environment and kind of machine would have made them a lot more valuable, but still, they are a useful point of reference. Vaughan-Nichols goes on to point to “Windows’ historical baggage of bugs and bloat,” as well as driver signing, as potential issues for HPC users. Driver signing is indeed a potential issue.
But, and I say this as someone who has spent his career running some of the largest supercomputers on the planet, including the current #28 and the largest supercomputer in the DoD, “bugs and bloat” is not an issue for the targeted population of users. The sweet spot for HPC Server is not those already running large HPC installations, or even those already running Linux clusters. It’s for those who, because of Windows-only corporate policies, inexperience, or ISV lock-in, have no cluster or HPC capability at all right now. Think of HPC Server as the handheld video camera of computing: it doesn’t replace a professional piece of cinematography gear, but it does open up video capture to millions of non-professionals who want to record movies on their own.
He ends with this observation:
Windows often requires you to reboot for major updates. Linux doesn’t. Let’s say you need to reboot, as a matter of course, six times a year with Windows HPC. With Linux, you don’t.
If you think that doesn’t sound like much, think again. This is HPC, not your PC, and not your ordinary server. Six hours of downtime in a year, all by itself, is a major failure in HPC.
I suppose if you segment the market between high end and low end, then you could say that at the low end six reboots a year is too much, particularly if you were running a mission-critical app. I suppose you could say that…I wouldn’t. But you could. At the high end, of course, you cannot make any such statement. Big machines go down. A lot. They are complex amalgamations of complex gear, and are usually extremely brittle in the face of any failure in any piece.
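To put the “six reboots a year” figure in perspective, here is a back-of-the-envelope sketch. The one-hour-per-reboot cost is my assumption, not a number from either article; it simply matches Vaughan-Nichols’ “six hours of downtime” framing:

```python
# Back-of-the-envelope availability math for the "six reboots a year" claim.
# ASSUMPTION (mine, not from either article): each reboot cycle costs about
# one hour of wall-clock downtime for the cluster.

HOURS_PER_YEAR = 365 * 24      # 8760 hours in a (non-leap) year
reboots_per_year = 6
hours_per_reboot = 1.0         # assumed

downtime_hours = reboots_per_year * hours_per_reboot
availability = 1 - downtime_hours / HOURS_PER_YEAR

print(f"Downtime: {downtime_hours:.0f} h/yr, availability: {availability:.4%}")
# Prints: Downtime: 6 h/yr, availability: 99.9315%
```

Even under that assumption the machine is still better than 99.9% available from reboots alone, which is why whether six reboots is a disaster or a non-event depends entirely on which market segment you are talking about.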
To be fair, Vaughan-Nichols’ piece is an opinion piece, and is clearly identified as such. And I know that the point of publishing is to attract readers, and being flamboyant certainly does that (I think the term is “link baiting” — I’ve been accused of doing it myself). It is my opinion that this opinion piece does little more than encourage and reflect the knee-jerk emotional reaction that many in IT and HPC have against anything Windows.
We do not yet know whether Windows HPC Server will be successful in opening the new market it is shooting for, or indeed even whether it will end up being an acceptable operating system at all (Windows ME, anyone?). But this particular opinion piece doesn’t take us further toward a definitive answer to either of those questions. Any and all comers to this market must be encouraged. HPC is too important to the future of our culture not to give all the solutions proposed in the marketplace a serious evaluation.