CIQ Demos Preview of ‘Fuzzball Federate’ to Streamline HPC Workflows

ATLANTA — November 18, 2024 — SC24 — As high-performance computing (HPC) teams move more workloads to cloud-native environments to take advantage of scale and agility, orchestrating jobs across on-premises and cloud environments slows progress. CIQ Fuzzball Federate, unveiled today in early access at SC24, addresses this challenge and enables researchers to use a comprehensive management platform to define and execute important HPC, artificial intelligence (AI) and machine learning (ML) workloads across disparate and disconnected resources.

For example, with Fuzzball Federate, researchers can now define and deliver workloads to unified systems and then connect them with AWS clusters to help scale them in the cloud without modification. Conversely, workloads can be prototyped in the cloud before the organization commits to a capital expenditure for local, production resources. Either way, Fuzzball ensures workflows execute repeatably, reliably and performantly, regardless of the underlying infrastructure.

“With Fuzzball, researchers no longer need a Ph.D. in infrastructure to manage their complex HPC and AI/ML workloads in hybrid environments,” said Gregory Kurtzer, founder and CEO of CIQ. “Fuzzball Federate is the third leg of the stool within the Fuzzball ecosystem, furthering our goal of delivering the most comprehensive and complete performance computing platform for research institutions and enterprises alike. We’re excited to provide the first glimpses of Fuzzball Federate at SC24 in Atlanta, and we invite all attendees to come take a look and give us your feedback.”

CIQ’s Fuzzball, first released in August 2023, is a modern, performance-intense compute platform that simplifies the creation and deployment of complex HPC and AI/ML workloads. Running on top of Kubernetes, it is API-based and provides an easy-to-use graphical interface that automates the provisioning and management of the infrastructure needed to run these jobs.

The infrastructure management layer of individual Fuzzball clusters has two main components: Fuzzball Substrate, which delivers a custom container runtime and resource manager, and Fuzzball Orchestrate, which manages and schedules complex, multi-step workloads and data ingress and egress.

Today at SC24, CIQ is unveiling the third component: Fuzzball Federate. It works with Substrate and Orchestrate to unify compute resources across on-prem clusters and cloud computing regions and to provide seamless access to, and management of, those resources.

In a federated Fuzzball environment, users define and submit workflows with the same web user interface and command-line interface they would use in a single Orchestrate deployment. However, where workflows submitted directly to an Orchestrate cluster may run only on the resources available to that single cluster, workflows submitted to a Federate cluster may run on any of the Orchestrate clusters joined to the federation. These Orchestrate clusters may be dynamically provisioned cloud resources (e.g., running compute jobs on AWS EC2) or local, on-prem compute clusters.

Federate evaluates the CPU, memory, accelerator and storage requirements of the workflow against the resources available in each attached Orchestrate cluster and dispatches the workflow to an appropriate cluster for execution. The Orchestrate cluster then provisions the necessary resources (in cloud environments) and dispatches individual compute jobs via Substrate.
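The matching step described above can be illustrated with a minimal Python sketch. It is not CIQ's implementation or the Fuzzball API: the `Resources` and `Cluster` types, the `select_cluster` function, the sample capacities and the first-fit selection policy are all hypothetical, chosen only to show how a workflow's CPU, memory, accelerator and storage requirements might be compared against what each attached cluster has available.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource description; field names are assumptions."""
    cpus: int
    memory_gib: int
    accelerators: int
    storage_gib: int

    def fits_within(self, available: "Resources") -> bool:
        # A request fits only if every dimension is covered.
        return (
            self.cpus <= available.cpus
            and self.memory_gib <= available.memory_gib
            and self.accelerators <= available.accelerators
            and self.storage_gib <= available.storage_gib
        )


@dataclass(frozen=True)
class Cluster:
    """Hypothetical stand-in for an Orchestrate cluster joined to a federation."""
    name: str
    available: Resources


def select_cluster(required: Resources, clusters: list[Cluster]) -> Optional[Cluster]:
    """Return the first cluster that can satisfy the request, or None.

    First-fit is an assumed policy used for illustration only.
    """
    for cluster in clusters:
        if required.fits_within(cluster.available):
            return cluster
    return None


# Illustrative capacities, not real deployment figures.
clusters = [
    Cluster("on-prem", Resources(cpus=64, memory_gib=256, accelerators=0, storage_gib=2000)),
    Cluster("aws-ec2", Resources(cpus=128, memory_gib=512, accelerators=8, storage_gib=4000)),
]
workflow = Resources(cpus=32, memory_gib=128, accelerators=4, storage_gib=500)
chosen = select_cluster(workflow, clusters)
```

In this sketch the workflow needs four accelerators, so the on-prem cluster is skipped and the cloud cluster is chosen. A real scheduler would likely also weigh factors such as cost, data locality and queue depth, none of which are modeled here.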

Single-cluster deployments of Orchestrate are still supported, and an existing Orchestrate deployment can be joined with additional deployments in a federation at any time.

More about Fuzzball Federate: https://ciq.com/blog/fuzzball-federate-unify-complex-hpc-and-ai-ml-jobs-across-cloud-and-on-prem-resources/