Don’t Play MacGyver With Your Data Center Cluster

Dr. Matthijs van Leeuwen, Founder and CEO of Bright Computing

Whenever I meet someone in the IT business, I ask them about their data center cluster operations. Often the story I hear is one that would make MacGyver proud. For those who aren’t up on late-’80s TV pop culture, MacGyver is a character famous for getting out of tough situations by coming up with a makeshift solution from whatever bits and pieces he found lying around. That’s no way to manage a critical part of your organization’s infrastructure. Not when there are purpose-built, professional-grade solutions available to everyone.

You may think you don’t need a cluster manager to get your job done. Fair enough. But let me give you a few things to consider before you settle on an ad-hoc set of tools that aren’t all that well suited to the job.

Before you can even install your application software, you’ve got to build a cluster for it to run on. This typically consists of a set of rack servers running Linux and interconnected with a high-speed network. Getting that working takes a lot of time spent installing software, creating configuration files, and testing the results. Only then are you ready to install and configure your workload software and other elements of your cluster solution. With a modern cluster manager, all you have to do is boot the installation disc, answer a few basic questions about the stack you want to run, and sit back while the cluster manager installs and configures everything for you.

Creating configuration files by hand can be tricky. There are lots of ways to make a critical configuration error, either when first creating the file, or when modifying it later on. Cluster managers automate that process, building the configuration files you need automatically. Automatic config file generation = no pilot errors.
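
To make the idea concrete, here is a minimal sketch of config generation from a single description of the cluster. It is not how any particular cluster manager works; the function name `render_hosts`, the node naming scheme, the IP range, and the `cluster.local` domain are all assumptions chosen for illustration.

```python
from ipaddress import IPv4Address


def render_hosts(prefix: str, first_ip: str, count: int,
                 domain: str = "cluster.local") -> str:
    """Render /etc/hosts-style lines for nodes named prefix001, prefix002, ...

    All names and addresses here are hypothetical; a real cluster manager
    would draw them from its own node database.
    """
    base = IPv4Address(first_ip)
    lines = []
    for i in range(count):
        name = f"{prefix}{i + 1:03d}"
        # One line per node: IP address, fully qualified name, short name.
        lines.append(f"{base + i}\t{name}.{domain}\t{name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Generate entries for three hypothetical compute nodes.
    print(render_hosts("node", "10.141.0.1", 3), end="")
```

Because every entry is derived from the same three inputs, adding a node means changing one number rather than editing each file by hand, which is where hand-typed mistakes tend to creep in.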

Once your cluster is up and running, you’ll need to monitor things to keep it that way. Rather than use a collection of different tools, you’ll be much happier with the consolidated view a cluster manager provides. It gives you control over which metrics are sampled and how often, and it displays critical information in the way you choose.

Since most clusters evolve over time, you will inevitably need to scale up (or down) to meet changes in demand. This can involve a lot of manual work. That is, unless you use a cluster manager. Then the task becomes a simple matter of provisioning new nodes (or decommissioning unnecessary ones).

When you experience spikes in demand, you will want to add capacity quickly. One way to do that is to spin up some servers in the cloud. Without a cluster manager, you can expect to spend a lot of time navigating the dashboard of your favorite cloud service provider to instantiate new servers and get them configured as part of your cluster. Or you can have your cluster manager do that for you — automatically — on demand.

Some workloads benefit greatly from the application of GPUs and other hardware performance accelerators. Cluster managers can drastically reduce the time and effort required to manage GPUs and integrate them into the rest of your cluster management workflow.

So, while it may seem easier to get your project started using whatever tools you have on hand, consider using a cluster manager from the get-go. You won’t be sorry. Because it takes away the grunt work involved in setting up the cluster, and makes it easy to keep tabs on everything without switching from one tool to another, you’ll be free to focus on things that really matter, like seeing how the cluster handles your workloads. Don’t MacGyver your cluster; manage it.

Dr. Matthijs van Leeuwen is the Founder and CEO of Bright Computing.
