Microsoft Releases Batch Shipyard on GitHub for Docker on Azure

Microsoft continues to augment its Azure cloud for HPC workloads with the release of the Batch Shipyard toolkit, version 1.0.

Available on GitHub as open source, the Batch Shipyard toolkit enables easy deployment of batch-style Dockerized workloads to Azure Batch compute pools. Azure Batch lets you run parallel jobs in the cloud without having to manage the infrastructure. It is ideal for parametric sweeps, deep learning training with NVIDIA GPUs, and simulations using MPI and InfiniBand.

Whether you need to run your containerized jobs on a single machine or on hundreds or even thousands of machines, Batch Shipyard blends the features of Azure Batch (handling the complexities of large-scale VM deployment and management, high-throughput and highly available job scheduling, and auto-scaling so that you pay only for what you use) with the power of Docker containers for application packaging. Batch Shipyard lets you harness the deployment consistency and isolation of containers for your batch-style and HPC workloads, and run them at any scale without having to develop directly against the Azure Batch SDK.

The initial release of Batch Shipyard has the following major features:

  • Automated Docker Host Engine installation tuned for Azure Batch compute nodes
  • Automated deployment of required Docker images to compute nodes
  • Accelerated Docker image deployment at scale to compute pools consisting of a large number of VMs via private peer-to-peer distribution of Docker images among the compute nodes
  • Automated Docker Private Registry instance creation on compute nodes with Docker images backed to Azure Storage if specified
  • Automatic shared data volume support for:
    • Azure File Docker Volume Driver installation and share setup for SMB/CIFS backed to Azure Storage if specified
    • GlusterFS distributed network file system installation and setup if specified
  • Seamless integration with Azure Batch job, task and file concepts along with full pass-through of the Azure Batch API to containers executed on compute nodes
  • Support for Azure Batch task dependencies allowing complex processing pipelines and graphs with Docker containers
  • Transparent support for GPU accelerated Docker applications on Azure N-Series VM instances (Preview)
  • Support for multi-instance tasks to accommodate Dockerized MPI and multi-node cluster applications on compute pools with automatic job cleanup
  • Transparent assist for running Docker containers utilizing InfiniBand/RDMA for MPI on HPC low-latency Azure VM instances (i.e., STANDARD_A8 and STANDARD_A9)
  • Automatic setup of SSH tunneling to Docker Hosts on compute nodes if specified
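
To make the task-dependency feature in the list above more concrete, here is a minimal Python sketch of how a dependency graph determines which tasks may run in parallel. It is purely illustrative and does not use the Batch Shipyard or Azure Batch APIs: the task names and the `execution_waves` helper are hypothetical, and the ordering relies only on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "preprocess": set(),
    "train_a": {"preprocess"},
    "train_b": {"preprocess"},
    "merge": {"train_a", "train_b"},
}


def execution_waves(deps):
    """Group tasks into waves; tasks within one wave have no unmet dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        # Every task whose dependencies have all been marked done is ready now.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

In this sketch the two training tasks land in the same wave, so a scheduler could run them as containers on different compute nodes at the same time, while the merge task waits for both to finish.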
