Note: If you aren’t already familiar with what HPC is or how some businesses are using it today, you might find the articles in Basic Training helpful.
So you understand the basics of what HPC is, where it might fit in your business, and what goes into a cluster. Now, what next?
Benchmarking your application
We’ve tried to emphasize that the kind of solution you buy needs to be driven by the problem you are trying to solve. The first step is to identify the application that will solve that problem.
Once you’ve found the application that does what you need, you’ll need to benchmark it to understand how it performs. (Finding it may be easy if you’re already using an application for this kind of job and the vendor supports running it on more than one processor at once.) Benchmarking is the process of running your application on the kind of problem you are interested in solving, or one that closely approximates it, on the kind of hardware you intend to buy. The performance of a high performance computer, even a small one, is determined by all of its components working together. It’s virtually impossible to predict how a cluster will perform on an application from datasheets, vendor brochures, and sales presentations without testing it first. Sure, an InfiniBand network is theoretically faster than gigabit Ethernet, but if your application isn’t communication bound, then the extra money that InfiniBand costs might be better spent on memory.
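To make the interconnect point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simple model in which only the communication portion of the run time shrinks when the network gets faster; it ignores latency effects and any overlap between communication and computation. The function name, the one-hour baseline, and the 10x network speedup are hypothetical values chosen for illustration, not measurements of InfiniBand or gigabit Ethernet.

```python
def estimated_runtime(baseline_seconds, comm_fraction, network_speedup):
    """Estimate run time when only the communication portion gets faster.

    baseline_seconds: measured run time on the slower network
    comm_fraction:    share of that time spent communicating (0.0 to 1.0)
    network_speedup:  assumed factor by which communication gets faster
    """
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    if network_speedup <= 0:
        raise ValueError("network_speedup must be positive")
    compute_time = baseline_seconds * (1.0 - comm_fraction)  # unchanged
    comm_time = baseline_seconds * comm_fraction / network_speedup
    return compute_time + comm_time


# Hypothetical one-hour job with an assumed 10x faster network.
baseline = 3600.0
for fraction in (0.05, 0.50):
    new_time = estimated_runtime(baseline, fraction, network_speedup=10.0)
    print(f"{fraction:.0%} communication: {baseline:.0f} s -> {new_time:.0f} s")
```

Under these assumptions, a job that spends 5% of its time communicating barely improves, while one that spends half its time communicating gets much faster. Only a benchmark of your own application can tell you which case you are in.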
Your HPC vendor or partner will be able to guide you through this process; just be sure you are satisfied that you have some real idea of how your application is going to perform before you buy.
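One way to get that idea is to run the same problem at several processor counts and look at how the run time scales. The Python sketch below summarizes such a set of timings as speedup and parallel efficiency relative to the smallest run. The function name and the timings are hypothetical; in practice the numbers would come from your own benchmark runs on the candidate hardware.

```python
def scaling_summary(timings):
    """Summarize benchmark timings as speedup and parallel efficiency.

    timings: dict mapping processor count -> wall-clock seconds.
    Returns a list of (procs, seconds, speedup, efficiency) tuples,
    measured relative to the run with the fewest processors.
    """
    if not timings:
        raise ValueError("timings must not be empty")
    base_procs = min(timings)
    base_time = timings[base_procs]
    rows = []
    for procs in sorted(timings):
        speedup = base_time / timings[procs]
        # Efficiency of 1.0 means run time fell in proportion to processors added.
        efficiency = speedup / (procs / base_procs)
        rows.append((procs, timings[procs], speedup, efficiency))
    return rows


# Hypothetical timings for illustration only.
measured = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 170.0}
for procs, seconds, speedup, efficiency in scaling_summary(measured):
    print(f"{procs:>3} procs  {seconds:8.1f} s  "
          f"speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
```

If efficiency falls off quickly as processors are added, buying more nodes may not buy much more performance for this application, and that is worth knowing before you sign a purchase order.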
Of course, this assumes that the application you’ll need is one you already have, or one you can buy off the shelf. If not, then you’ll be developing (or paying someone to develop) an application that suits your needs. Developing for a parallel machine can be a complex process that requires specialized skills; if you aren’t already experienced in doing this, be prepared for a significant challenge. In any case, your development team can help you get a handle on the hardware that you’ll need to run your application.
You can find more about how the various elements of a cluster (disk, processor, network, etc.) influence the performance of a cluster in HPC 101.
There is more to making it work
You are considering buying a high performance computer, so the performance of your application is clearly a key consideration, which is why we started with an emphasis on application benchmarking. But there are other factors you’ll want to include in your thinking.
For example, you may have a choice of operating systems that support your application: both Windows HPC Server and Linux are viable cluster operating systems, and there are others as well. The one you pick may affect the performance of your application, but the choice may also be driven by the IT environment in which you are going to put your cluster. Are you a Windows-only shop with no desire to (re)train an administrator for Linux? Then a cluster that supports Windows will be a factor in your decision making. Are cost and familiarity with the cluster networking technology more important to you than absolute performance on your application? Then a standard TCP/IP-based cluster interconnect running over gigabit Ethernet is something you’ll want to strongly consider.
The point is that you are buying a cluster as a tool to support your business. In some way, it has to make money for you, or you shouldn’t be buying it. Considering how your cluster will fit in with your current computer infrastructure and staff expertise (and the costs of doing something different, including training) must be a part of your process.