A new integrated web solution from Penguin Computing’s Scyld group provides a flexible framework that greatly simplifies the monitoring and management of clusters and lets you integrate your own tools so you can see it all in one interface.
To date, the Scyld offering at Penguin has focused primarily on cluster operating system and provisioning management through the Scyld ClusterWare solution, but today the company announced a new product for your cluster: the Integrated Management Framework (IMF).
Think of the IMF as a single dashboard for all of your management tools. IMF is a web-based framework with support for IPMI (the Intelligent Platform Management Interface) and the ability to report back on stats that ClusterWare already collects about the systems it’s managing. That’s the out-of-the-box functionality, which by itself is useful but not all that special. What makes IMF special is that it is an extensible framework that allows end users or tool makers to integrate existing management applications for everything from job schedulers to communications fabrics into a single, tabbed interface. As Arend Dittmer, director of HPC products at Penguin, points out, “Today administrators have many point solutions for monitoring all the individual components of their clusters. IMF brings all of these together in one, extensible status console.”
Let’s look at an example of the interface to make things a little clearer. In the image of IMF running on Penguin’s 6-node test cluster you’ll notice several tabs across the top of the interface. The first provides a summary of everything that’s going on, and you can tweak it to emphasize the things that you in particular are interested in. On the “nodes” tab, which I have selected, you can see node status and run various scripts and actions on the nodes, including all of the IPMI management features. The Torque, Ganglia, and TaskMaster tabs show you how your management applications will appear in the interface (these tools are pre-packaged with IMF, although there is an API that allows you to glue in your own or other third-party tools).
Who is IMF for? Initially Penguin is aiming it at customers of its own clusters, extending the ease of use of its cluster offering. Those who follow Penguin’s business will know that you can get ClusterWare without Penguin hardware. The same is true of this new product, though that model is not an active part of the company’s business at this time. Tom Coull, the general manager of Penguin, wouldn’t say much more, but he did emphasize the “at this time” part of that comment.
In terms of future capabilities, Penguin has a lot planned. Coull and Dittmer outlined expanded management capabilities and the ability to store the data IMF collects in its own historical database of system truth. The company is also thinking about what I’ll call “federated monitoring”: having IMF watch for a combination of conditions across several of the management tools and take some action when a criterion is met. This would add a lot of value over having a human check several different monitoring tools by hand to diagnose a trouble state, even if all of those checks are in IMF’s single interface.
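To make the “federated monitoring” idea concrete, here is a minimal sketch in Python of a rule that fires only when conditions drawn from several tools hold at the same time. This is not IMF’s API, which Penguin did not describe in detail; the `Condition` and `Rule` classes, the tool names used as dictionary keys, and the metric names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical data shape: one dict of numeric readings per monitoring tool.
Snapshot = Dict[str, Dict[str, float]]


@dataclass
class Condition:
    source: str                        # tool the reading comes from (illustrative name)
    metric: str                        # metric name within that tool (illustrative name)
    predicate: Callable[[float], bool]

    def holds(self, snapshot: Snapshot) -> bool:
        readings = snapshot.get(self.source, {})
        if self.metric not in readings:
            return False               # a missing reading never satisfies a condition
        return self.predicate(readings[self.metric])


@dataclass
class Rule:
    name: str
    conditions: List[Condition]
    action: Callable[[str], None]      # called with the rule name when the rule fires

    def evaluate(self, snapshot: Snapshot) -> bool:
        # The rule fires only if every condition holds in the same snapshot.
        fired = all(c.holds(snapshot) for c in self.conditions)
        if fired:
            self.action(self.name)
        return fired


# Example: flag a node that is both hot (per IPMI) and heavily loaded (per Ganglia).
alerts: List[str] = []
hot_and_busy = Rule(
    name="hot-and-busy",
    conditions=[
        Condition("ipmi", "cpu_temp_c", lambda v: v > 85.0),
        Condition("ganglia", "load_one", lambda v: v > 8.0),
    ],
    action=alerts.append,
)
hot_and_busy.evaluate({"ipmi": {"cpu_temp_c": 91.0}, "ganglia": {"load_one": 12.5}})
```

The point of the sketch is the design choice the article describes: the correlation across tools lives in one rule, rather than in an administrator’s head as they flip between tabs.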
The Scyld Integrated Management Framework is shipping right now, and you can find more information at Penguin’s website.