In this special guest post, Peter ffoulkes writes that automation software from companies like XTREME-D is making HPC datacenters more productive every day.
It is tempting to think that high performance computing (HPC) is the exclusive province of massive installations at national laboratories, government agencies, leading academic institutions, and the world’s major corporations. While this is still largely accurate for the type of systems found on the TOP500 list, such systems take years to plan, procure, install, and prepare for real workloads. In addition, they are massively expensive to acquire and administer, putting them well beyond the reach of organizations with smaller budgets or more modest resource requirements.
In addition to classic HPC workloads, a host of new application areas are coming into play, including data analytics, machine learning, deep learning, and artificial intelligence. Some of these can require a diverse range of resources, including fast storage such as non-volatile memory express (NVMe) devices, low-latency networks such as InfiniBand or Intel Omni-Path, and accelerators such as graphics processing units (GPUs) and field-programmable gate arrays (FPGAs).
These new techniques and technologies offer great opportunities to scientists and researchers in the expanding universe of HPC workloads, but each comes with a price. That price takes the form of increasing complexity and cost, both of which are significant barriers to successful science or research. Whether the measure is time to publication, time to product or result, or the amount of resource access available, the old adage ‘Time is Money’ holds true: limited access to required resources lengthens time to results, and the complexity of configuring and administering the environment to run applicable workloads has many negative implications.
These circumstances, together with advances in the technologies, have led to increasing interest in and adoption of cloud services for HPC workloads among many users. At the ISC HPC conference in June this year, Intersect360 Research offered a very positive assessment, noting that cloud spending by HPC customers grew by 44 percent from 2016 to 2017, and predicting HPC cloud-based spending at around $1.1 billion for 2017, reaching nearly $3 billion by 2022. It is clear that the market opportunity and demand are there, and that cloud services will make HPC resources available to many more users. That said, many significant issues remain to be addressed.
It has long been the case that HPC system users frequently needed to be computer scientists in addition to their primary disciplines in order to be able to match the characteristics of their workloads to suitable resources, to administer their systems, and to schedule the placement of workloads in a cluster. The advent of HPC-capable cloud services adds more choice and complexity to the situation, putting even more pressure on users and increasing the need for sophisticated automation tools to manage the configuration and administration of cloud resources, whether private, public, or hybrid.
When combined with workload templates, such automation software can greatly simplify the selection and configuration of suitable ‘pay as you go’ HPC cloud resources, dramatically reducing the time required to spin up and spin down a virtual cluster and eliminating the majority of cluster administration costs. Additional benefits include enhanced security and significantly improved user productivity.
One example of these capabilities comes from startup XTREME-D, which provides a simple point-and-click user interface based on HPC workload templates, together with the XTREME-Stargate cluster portal, which acts as a “super head node” to provide simple access to virtual cluster resources. The company claims that users can go from nothing to running workloads in just ten minutes.
The ability to easily offer on-demand HPC as a Service promises to revolutionize how a new generation of users can rapidly access the resources they need at much lower costs, which will in turn increase the pace of scientific research.
Peter ffoulkes has over 25 years of diversified international experience in enterprise and high performance computing, including positions in management, business and product strategy development, product marketing, sales, and training with leading international information technology companies.