Working together: communications networks

Your cluster is made up of many individual computers that, by themselves, cannot work together at all. They have to be connected in order to work as a team on your problem. This connection is managed by one or more networks in your cluster, and these networks are sometimes called the cluster interconnect.

There are several different kinds of networks that computer makers offer for their clusters, ranging from the familiar Ethernet to the very exotic. The most important factors in determining which kind of cluster interconnect is most appropriate for you are cost and the characteristics of your application.

Cluster interconnects are characterized by latency and bandwidth. The term latency refers to the amount of time it takes a very small message to go from one computer to another in your cluster across the network. Latency is measured in really small units of time, like microseconds or nanoseconds. If your application sends lots of very small messages, your performance will be better if you have a low-latency interconnect such as InfiniBand (a popular option today).

Bandwidth refers to how much data a network can move in a unit of time, and is measured in big numbers like megabytes or gigabytes per second. If your application sends relatively few, large messages, then a high-bandwidth network may help the performance of your cluster for the jobs that you will typically run.
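To see why message size decides whether latency or bandwidth matters more, here is a minimal sketch in Python. It assumes the common simplification that the time to send one message is the latency plus the message size divided by the bandwidth. The function names and the 50 microsecond latency figure are made up for illustration and are not measurements of any particular network; 125 MB/s is simply the raw rate of a 1 Gbps link.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated seconds to deliver one message: fixed latency plus size / bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def latency_share(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of the total transfer time that is spent on latency alone."""
    return latency_s / transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s)


# Hypothetical network (illustrative numbers only):
# 50 microseconds of latency, 125 MB/s of bandwidth (the raw rate of a 1 Gbps link).
LATENCY_S = 50e-6
BANDWIDTH_BYTES_PER_S = 125e6

# Walk from a tiny message to a very large one and report where the time goes.
for size in (8, 1_000, 1_000_000, 100_000_000):
    t = transfer_time(size, LATENCY_S, BANDWIDTH_BYTES_PER_S)
    share = latency_share(size, LATENCY_S, BANDWIDTH_BYTES_PER_S)
    print(f"{size:>11} bytes: {t:.6f} s total, {share:.1%} of it latency")
```

Under this simple model, tiny messages spend almost all of their time on latency, while very large messages spend almost all of it moving data, which is why the two kinds of application favor different networks.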

There are other options, but InfiniBand and Ethernet (both the 1 Gigabit per second and 10 Gigabit per second varieties) are the most common in small and mid-sized clusters today. InfiniBand can look like the costlier choice because its hardware is more expensive than 1 Gbps Ethernet, but you need to study your application’s needs carefully before you make a network decision. It may be that the performance of your application is so much better on an InfiniBand-connected cluster that you’ll be able to buy fewer processors and save money overall by buying the more expensive network.
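The arithmetic behind that trade-off can be sketched in a few lines of Python. Every number below is a placeholder chosen only to show the shape of the comparison: the node counts, node price, and per-node network prices are assumptions, not quotes, and how many nodes your application really needs on each network is something only benchmarking can tell you.

```python
def cluster_cost(nodes, node_price, network_price_per_node):
    """Total hardware cost: each node pays for itself plus its share of the network."""
    return nodes * (node_price + network_price_per_node)


# Hypothetical scenario: the job needs 40 nodes on the cheaper network,
# but only 28 nodes on the faster one. All prices are illustrative placeholders.
ethernet_total = cluster_cost(nodes=40, node_price=3000, network_price_per_node=50)
infiniband_total = cluster_cost(nodes=28, node_price=3000, network_price_per_node=700)

print(f"Cheaper network, more nodes:   {ethernet_total}")
print(f"Pricier network, fewer nodes:  {infiniband_total}")
print(f"Difference:                    {ethernet_total - infiniband_total}")
```

With these made-up inputs the cluster with the more expensive network comes out cheaper overall, but the result flips easily if the faster network saves you only a few nodes, so plug in your own quotes and benchmark results.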

This is all just the hardware it takes to get your cluster put together. But without software to make it work, a cluster is just an expensive paperweight.

Read on for more about cluster software » The glue: cluster software