Choosing the Right Type of Cloud for HPC Lets Scientists Focus on Science

This sponsored post from Naoki Shibata, founder and CEO of XTREME-D, covers how to choose the right type of cloud computing option for your business to increase efficiency.

Naoki Shibata, founder and CEO of XTREME-D

Whenever HPC is mentioned, images of server racks, networking, system administration, package installation, and text terminals tend to come to mind. These “features” have long been the hallmark of high-end parallel computing. But the growth of scalable Data Analytics (e.g., Apache Hadoop/Spark) and Deep Learning (e.g., TensorFlow) has widened the playing field for typical uses of cluster computing.

One challenge that HPC, DA, and DL end users face is staying focused on their science and engineering rather than getting bogged down in system administration and platform details while ensuring they have the clusters they need for their work. It has often been said that if scalable cluster computing can become more turnkey and user-friendly (and less costly), then the market will expand to many new areas.

Keeping Users Out of the Cluster Management Business

Keeping the end user closer to the application, with less wasted time, should be a priority in the HPC, DA, and DL spaces. But how do individual researchers and scientists do that? Just because it’s small-scale HPC doesn’t mean it isn’t time-consuming to configure the cluster that powers the research.

Cloud computing is one obvious route to clustered computing, though it brings its own challenges. The on-demand nature of the cloud offers an attractive “pay for only what you use” model. But aside from quick instance spin-up, building clusters by hand in the cloud does not necessarily make cloud-based clusters any cheaper to use and maintain than on-premises systems.

A typical researcher or scientist needs access to HPC, DA, or DL applications, but doesn’t always have the higher-level knowledge (or the bandwidth required) to configure a cluster to run them. Cloud computing offers a good point of entry as it eliminates the need to set up actual hardware, but cloud vendors typically only go so far in terms of resource capabilities (i.e., getting your applications to run is still your job). Users and administrators are still tasked with the problems of final configuration and handling support issues.

So what to do? Is there a cloud solution that lets scientists focus on science? Let’s look at the three main types of cloud services: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). Each one has its own advantages and is suitable for certain purposes, so it’s important to make the right choice for your specific circumstances.

SaaS Can Be Very Useful If It Meets Your Needs

Software-as-a-Service (SaaS) is perhaps the most well-known and commonly utilized type of cloud for business. SaaS vendors provide a wide variety of third-party applications, which are delivered to users directly and do not require downloads and installation.

The greatest advantage of SaaS is that SaaS vendors provide one-stop services to their users, which greatly reduces the time and cost required to configure and use more traditional system environments. On the other hand, SaaS offers little flexibility and thus sometimes requires users to change their business process to fit the application(s). Troubleshooting is also difficult because the application is running on a server that is managed by the SaaS vendor.

PaaS Requires Cloud Computing Skill

Platform-as-a-Service is similar to SaaS except that it does not provide the application software (the application code, related libraries, etc.), only a cloud platform (hardware and system software) on which users build their own customized application environment. This setup lets you concentrate on your applications without worrying about infrastructure, but it requires cloud computing skills, since your users will run their applications within the cloud and you will need to optimize those applications to meet the PaaS infrastructure's requirements.

IaaS Requires Cloud Infrastructure Skill

Infrastructure-as-a-Service provides primitive cloud resources. You can build your HPC clusters however you like, with the advantage of tailoring the environment optimally to your needs. But you would need to learn about and choose the specific resources required for your job, as well as how to deploy, install, and configure them. You are given “bare metal” and need to add all of the software layers: the system software and applications required to run your job. You would also need to know how to tune any device accessed via the public cloud.

This is not an easy job, and there are frequent pitfalls. Building and operating an HPC cluster therefore takes many working hours, during which you would still pay for your cloud subscription even though you cannot run jobs until the cluster is properly configured.
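The cost of that configuration time can be made concrete with a back-of-the-envelope estimate. The sketch below uses entirely assumed numbers (hourly rate, node count, and hours), not actual cloud pricing; the point is only that under pay-as-you-go billing, hours spent hand-configuring a cluster are billed the same as hours spent computing.

```python
# Illustrative estimate only; all figures below are assumptions,
# not real cloud pricing.
HOURLY_RATE = 0.90    # assumed $/hour per node
NODES = 16            # assumed cluster size
SETUP_HOURS = 6       # assumed hand-configuration time before the first job
COMPUTE_HOURS = 24    # assumed productive compute time

# Under pay-as-you-go billing, setup hours cost the same as compute hours.
setup_cost = HOURLY_RATE * NODES * SETUP_HOURS
compute_cost = HOURLY_RATE * NODES * COMPUTE_HOURS
overhead = setup_cost / (setup_cost + compute_cost)

print(f"Setup cost:   ${setup_cost:.2f}")
print(f"Compute cost: ${compute_cost:.2f}")
print(f"Share of spend lost to configuration: {overhead:.0%}")
```

With these assumed figures, a fifth of the cloud spend goes to configuration rather than computation; faster spin-up directly shrinks that fraction.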

More Control Than SaaS and Easier Than PaaS or IaaS

After reviewing the specific features and functionalities of each type of cloud service, we came up with an offering that incorporates the best features of SaaS and IaaS — the application readiness of SaaS combined with the flexibility and control over the entire infrastructure offered by IaaS. We developed XTREME-DNA specifically to keep the scientist or the researcher out of the cluster management business.

Using a web-based interface, you can build HPC/DA/DL clusters in both private and public clouds (e.g., Azure and AWS) in as little as 10 minutes, all configured and ready to use, while retaining complete control over them. You can also access HPC clusters directly, which offers the same usability as an on-premises environment. XTREME-DNA eliminates the need for expensive administrators and HPC architects and instead lets end users manage the entire spin-up, configuration, execution, and spin-down cycle with just a few mouse clicks.

Looking at PaaS for Future Features

Recently, one of our customers came to us with a challenge that we are working to solve. Their workloads are not tightly parallel, but are instead stateless, time-flexible, and fault-tolerant. We thought these workloads might be better suited to PaaS. PaaS requires the workloads to be optimized for it, but in return provides benefits such as workflow automation, cost savings, and flexibility. PaaS is an ideal option when a cloud provider offers low-cost virtual machines, because our products can supply the missing infrastructure, saving the customer time and money.

Based on this input, we have incorporated PaaS features into our new IaaS platform, XTREME-Stargate. Through both XTREME-DNA and XTREME-Stargate, our products offer HPC/DA/DL point-and-click capability. Users can design, spin up, compute on, and spin down large leading-edge clusters — local, shared, cloud, or bare metal — with just a few mouse clicks.

We hope this article provides some insight into which type of cloud might be best for you. Regardless of your choice, the XTREME-D approach provides an ultra-simple, intuitive, template-based web GUI for cloud-based cluster computing that eliminates virtually all infrastructure/application design, configuration, and support issues.

When the user is elevated above the minutiae of cluster computing, the cost/time to solution is reduced through a pay-as-you-go cloud model, price-to-performance budget options, reduced administration, and a dramatic increase in user efficiency.

Learn more in our new white paper, “Point and Click HPC: The XTREME-Stargate IaaS Platform.”

Naoki Shibata is founder and CEO of XTREME-D, a company with a mission to make HPC cloud computing access easy, fast, efficient and economical for every customer. Naoki has over 15 years of experience as an HPC Cloud Architect at HP, Microsoft, IBM and Cray, where he focused on massive scale supercomputing for academia and government.