Adaptive Computing, the company that until earlier this year was known as Cluster Resources, was born in HPC. Founder, CEO, and CTO David Jackson worked in real HPC centers (places like SDSC, LLNL, NCSA, and the Maui High Performance Computing Center) where he developed the scheduling software that turned into the Maui Scheduler and then forked into a business. In 2009 Cluster Resources changed its name to reflect a shift in revenue strategy beyond HPC into the enterprise, but that change didn’t take Adaptive Computing out of HPC. I spent some time talking with Dave Jackson a couple weeks ago about the trends he’s seeing in the enterprise, and what they could mean for HPC.
From Moab Cluster Suite to the Moab Adaptive Computing Suite, Jackson views all of Adaptive Computing’s products as enabling better service to the users. “LoadLeveler, PBS, NQS, and the others were always focused just on scheduling,” he says. “But our focus has been on intelligent decision making. Our job is not to get throughput, it is to make the site successful with its users.” Jackson’s vision for large-scale resource management is all about service, and integrating knowledge of — and actions on — the power, cooling, network, storage, and compute infrastructure to create a totally responsive computational environment tailored to the needs of a specific job.
Scheduling has come a long way from a couple hundred lines of code used to keep machines busy at night after the admins went home. Reservation — in both time and space — is still a key concept, but today’s resource management needs also include adapting the environment with the workload to deliver on Service Level Agreements for the centers hosting those resources, and management software must be prepared to work in a peered environment with one or more other entities that are also scheduling, directing, and reconfiguring resources. It’s a complex job.
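The core of that reservation-in-time-and-space idea can be shown in a few lines. This is a minimal sketch, not Moab's actual algorithm; the function name, the tuple layout, and the node counts are all illustrative:

```python
def can_reserve(reservations, start, end, nodes_needed, total_nodes):
    """Check whether a new reservation fits: at every instant in [start, end),
    already-committed nodes plus the new request must not exceed the machine size."""
    # Node usage only changes at reservation boundaries, so those are the
    # only time points that need checking.
    points = sorted({start} | {s for s, e, n in reservations if start <= s < end})
    for t in points:
        in_use = sum(n for s, e, n in reservations if s <= t < e)
        if in_use + nodes_needed > total_nodes:
            return False
    return True

# Existing reservations as (start_hour, end_hour, nodes) on a 100-node machine.
res = [(0, 10, 60), (5, 15, 30)]
print(can_reserve(res, 8, 12, 20, total_nodes=100))   # 60 + 30 + 20 > 100 -> False
print(can_reserve(res, 12, 20, 20, total_nodes=100))  # only 30 in use -> True
```

Real schedulers layer priorities, backfill, and preemption on top of this basic feasibility check, but the two-dimensional bookkeeping is the same.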
Scheduling has also come a long way from simply managing the “processor” resource. Adaptive Computing’s suite of tools recognizes that resources can range from the typical static to the dynamic, and from the traditional (compute) to the abstract. Resources need not be physical entities that consume power and floorspace: they may be hierarchical groupings that in turn direct mixtures of the physical resources beneath them.
Adaptive Computing’s enterprise customers are driven to increase service — often against fixed service level targets for job turnaround — while minimizing costs. One area in which the company is helping enterprise customers minimize costs is the provisioning of storage and networking resources. “It is simply too expensive to over-provision resources to prevent contention,” Jackson says. Although they are seeing growth in their enterprise business, their software is still very much a part of high end HPC. Happily for our community, Adaptive Computing is in a position to cross-pollinate ideas for managing large scale resources between the enterprise and HPC communities. Moab Adaptive HPC Suite and Moab Cluster Suite are used on 12 of the top 20 systems in the Top500, including both Roadrunner and Jaguar, the current number one and two systems on that list.
As an example of how the company integrates the operational requirements of a specific workload with knowledge of the computational environment, consider that the Moab Adaptive Computing Suite can monitor availability of network and storage resources, and migrate work to areas of the datacenter with excess capacity. Doing this means being able to anticipate before run time what network and storage resources a specific job will require. User estimates might be available, but they are notoriously inaccurate. Adaptive Computing’s software monitors jobs over time as they execute, developing a catalogue of resource requirements that it uses to make better decisions the next time the same job is submitted. It might then use this information to stage a job on a pool of resources connected to particularly fast storage, for example, or provision a dedicated VLAN for a certain job to improve performance during execution.
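The learn-then-place loop described here can be sketched in a few dozen lines. Everything below — the profile structure, the pool attributes, the use of a simple mean over past runs — is an illustrative assumption, not Moab's actual data model:

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class JobProfile:
    """Observed resource usage across repeated runs of the same job."""
    io_gbps: list = field(default_factory=list)   # storage bandwidth seen per run
    net_gbps: list = field(default_factory=list)  # network bandwidth seen per run

    def record_run(self, io, net):
        self.io_gbps.append(io)
        self.net_gbps.append(net)

    def estimate(self):
        # No history yet: return None so the caller can fall back to a default
        # (or to the user's own, notoriously inaccurate, estimate).
        if not self.io_gbps:
            return None
        return mean(self.io_gbps), mean(self.net_gbps)

def place(job_name, profiles, pools):
    """Pick the cheapest pool whose spare capacity covers the job's learned needs."""
    est = profiles.get(job_name, JobProfile()).estimate()
    if est is None:
        return pools[0]["name"]  # unknown job: default pool
    io, net = est
    for pool in sorted(pools, key=lambda p: p["cost"]):
        if pool["spare_io"] >= io and pool["spare_net"] >= net:
            return pool["name"]
    return None  # nothing can satisfy the job right now

profiles = {"render": JobProfile()}
profiles["render"].record_run(io=2.0, net=1.0)
profiles["render"].record_run(io=3.0, net=1.0)

pools = [
    {"name": "general", "cost": 1, "spare_io": 1.5, "spare_net": 4.0},
    {"name": "fast-storage", "cost": 2, "spare_io": 8.0, "spare_net": 4.0},
]
print(place("render", profiles, pools))  # learned mean I/O is 2.5 > 1.5, so "fast-storage"
```

Each completed run feeds `record_run`, so the catalogue sharpens with every submission of the same job — exactly the passive-observation approach the article describes.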
Power and cooling are other quantities that the resource management suite should be aware of, scheduling new jobs onto, or (with the support of virtualization) actively migrating portions of the workload to, parts of the datacenter that are cooler or have more power available. Adaptive Computing’s software today will dynamically provision the operating system with the application, and can also set up virtual machines (and Linux containers, and so on) to make sure the compute environment is suited to the task. And with the Turbo Boost feature on Intel’s Nehalem processors, one can envision putting a job into an area of the datacenter with excess power while dynamically changing how the processor satisfies the workload. Power draw, cooling load, and other environmental quantities can also be tracked as part of the resource set that Adaptive Computing’s software watches over time.
Today this information is gathered passively: when a user submits a job, Moab catalogues all of the information about it that it has available, and will use that data to make scheduling and placement decisions in the future. A more active learning approach is planned for the future in which idle cycles could be used to proactively fill out the parameter space to generate a more complete understanding of the operational environment of a workload.
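The planned active-learning step amounts to a sampling policy: during idle cycles, probe the configuration you know least about. A toy version, with made-up node types and storage tiers standing in for the real parameter space:

```python
import itertools

def next_probe(observed, node_types, io_tiers):
    """During idle cycles, pick the least-sampled (node type, storage tier)
    combination to benchmark next, gradually filling out the parameter space."""
    combos = list(itertools.product(node_types, io_tiers))
    # observed maps a combination to how many times we've measured it;
    # unmeasured combinations count as zero and win the min().
    return min(combos, key=lambda c: observed.get(c, 0))

observed = {("nehalem", "flash"): 4, ("nehalem", "disk"): 2, ("opteron", "flash"): 1}
print(next_probe(observed, ["nehalem", "opteron"], ["flash", "disk"]))
# ("opteron", "disk") has never been measured, so it is probed next
```

A production system would weight this by how much each measurement is expected to improve placement decisions, but least-sampled-first captures the basic idea of spending idle cycles to buy better future scheduling.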
All of this is an integral part of managing enterprise workloads in working datacenters, and Jackson says that it is coming to HPC. In fact, as we talk to HPC’s biggest thinkers about the shift in computing management inherent in the move to exascale, we hear a lot of the same topics mentioned. The key to good performance at this scale is going to be enabling users to match their compute environment — whether it’s a particular kind of accelerator or processor, or the use of a flash array — to the needs of the job in a way that delivers acceptable performance within a specified power and cooling envelope. Datacenter managers might even choose to delay certain kinds of work until off-peak hours to minimize costs, or dynamically adjust the number of processors turned on in a cluster based on the priority of work sitting in the queues. All of this requires deep integration among the formerly very separate systems that make up a datacenter.
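The off-peak and priority-driven policies mentioned above reduce to a simple decision function. This is an illustrative policy, not Moab's actual algorithm; the queue fields and the peak-hours window are assumptions:

```python
def nodes_to_power_on(queue, hour, peak_hours=range(8, 20), max_nodes=100):
    """Decide how many nodes to keep powered given queued work and time of day.
    High-priority jobs run regardless; low-priority jobs wait for off-peak
    hours, when power is cheaper."""
    runnable = [j for j in queue
                if j["priority"] == "high" or hour not in peak_hours]
    needed = sum(j["nodes"] for j in runnable)
    return min(needed, max_nodes)  # never exceed the machine size

queue = [
    {"name": "urgent-sim", "priority": "high", "nodes": 16},
    {"name": "batch-analytics", "priority": "low", "nodes": 64},
]
print(nodes_to_power_on(queue, hour=14))  # peak: only the high-priority job -> 16
print(nodes_to_power_on(queue, hour=23))  # off-peak: both jobs -> 80
```

The interesting part is what this implies for integration: the scheduler now needs live inputs (time-of-day tariffs, queue priorities, machine power state) from systems that historically never talked to each other.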
Jackson says he envisions pushing the limits of his integrated management approach even further, adding active control over datacenter power distribution units and chillers to the current capability to passively monitor those items, and in so doing bringing total control of the compute environment — the “bits to buildings” that Horst Simon talks about — to a single management interface. This will be a long time coming to broad use: datacenters are still retrofitting for sensors, and it will take quite a bit of time and expense to move all the way to centralized, active control networks for the power and cooling distribution systems. But in the enterprise, where the motivation is the bottom line, and in apex computing centers, these capabilities will show up just as soon as they provide benefits people are willing to pay for. “We are totally focused on manipulating the workload and the total computational environment to enable our customers (the datacenters) to deliver better service levels to users,” Jackson says.
This kind of integrated approach to workload management isn’t widely practiced in HPC today because we have typically only had to manage a single constraint — turnaround time. As our community starts to look for prior art on which to base its exascale dreams, however, it seems that much of the toolset that Adaptive Computing has already built out will be of interest.