In this guest post, Katie Rivera of One Stop Systems explores how rack-scale composable infrastructure can be used in mixed-workload data centers.
Since 2012, hyperconverged servers packed with CPU, GPU, storage and networking resources have formed the foundation of GPU-accelerated data center architecture for the most demanding applications. These highly integrated, and often expensive, servers operate efficiently when clustered for the specific applications they were purchased to run. But what happens when an application can now scale to a higher number of GPUs, or the data center runs many different HPC applications? Over time, as applications change and need more data storage, GPU compute power, networking or CPU cores, the data center manager modifies the servers to meet the new needs. Because of the power density of modern GPUs, NVMe drives and the latest-generation CPUs, adding a few more GPUs and drives inside a server can no longer deliver the additional compute power that applications such as AI, deep learning, RTM, Monte Carlo and image processing demand. A more flexible, application-centric data center architecture is required, one that can keep pace with rapidly changing applications and hardware.
The newest solution involves disaggregating server resources into a “composable HPC infrastructure.” With composable infrastructure, the data center manager combines existing servers with expansion systems, such as an NVMe flash storage array (JBOF) or GPU accelerator systems, to add far more resources than the servers could support in a hyperconverged architecture. In addition, the job scheduler works with the composable infrastructure API to create the ideal node or cluster for a particular application set. Composable infrastructure allows any number of CPU nodes to dynamically map the optimum number of GPU and NVMe storage resources each node requires to complete a specific task.
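To make that mapping step concrete, here is a minimal Python sketch of a scheduler drawing GPUs and NVMe drives from a rack-level pool. The ComposableFabric class and its methods are invented for illustration only; they are not any vendor's actual API, and the pool sizes simply mirror the per-rack figures quoted later in this article.

# Hypothetical sketch of scheduler-driven composition; the class and
# method names are illustrative, not a specific vendor API.
from dataclasses import dataclass, field

@dataclass
class ComposableFabric:
    """Tracks free GPU and NVMe resources in the rack-level pool."""
    free_gpus: int = 48
    free_nvme: int = 32
    composed: dict = field(default_factory=dict)  # node name -> resources held

    def compose_node(self, name: str, gpus: int, nvme: int) -> bool:
        """Map GPUs and NVMe drives from the pool onto a CPU node."""
        if gpus > self.free_gpus or nvme > self.free_nvme:
            return False  # not enough free resources for this job
        self.free_gpus -= gpus
        self.free_nvme -= nvme
        self.composed[name] = (gpus, nvme)
        return True

    def release_node(self, name: str) -> None:
        """Return a node's resources to the pool when its task completes."""
        gpus, nvme = self.composed.pop(name)
        self.free_gpus += gpus
        self.free_nvme += nvme

fabric = ComposableFabric()
# A deep-learning training job asks for an 8-GPU node with 4 NVMe drives.
assert fabric.compose_node("dl-train-01", gpus=8, nvme=4)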
When the task completes, the resources return to the cluster pool so they can be mapped to the next set of nodes for the next task. Expansion also brings additional advantages to the data center, such as aggregating the bandwidth of many PCIe 3.0-based GPUs or NVMe drives into the latest-generation PCIe 4.0-based servers. For data centers with many nodes, the expansion option adds enormous flexibility to the HPC architecture by decoupling the upgrade cycles of CPU capabilities, GPU performance and NVMe storage.
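Continuing the hypothetical sketch above, the release-and-remap cycle might look like this; again, the method names are illustrative only.

# When the task finishes, its resources return to the pool and can be
# mapped to the next job immediately.
fabric.release_node("dl-train-01")

# The freed GPUs can now back a different workload shape, for example
# an RTM job that wants 16 GPUs and a single NVMe scratch drive.
assert fabric.compose_node("rtm-02", gpus=16, nvme=1)
print(fabric.free_gpus, fabric.free_nvme)  # resources remaining in the pool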
Composable infrastructure using expansion accelerators provides many benefits. HPC data scientists can use servers they already own and add GPUs and NVMe storage via expansion, with no additional server investment. If they do plan to purchase new servers, they can choose the best server for the application, regardless of how many GPUs fit inside it. Since servers, GPUs and storage are upgraded on different schedules by their various vendors, composable infrastructure can be upgraded piecemeal, spreading capital expenditures over many fiscal periods. Better yet, data scientists can rent the latest composable infrastructure systems and software from cloud service providers, using operational expenditure budgets rather than capital equipment budgets. Composable infrastructure built on expansion systems also puts large numbers of GPUs on the same RDMA network fabric, which especially benefits AI, deep learning, RTM, Monte Carlo and image processing applications that rely on peer-to-peer communication with moderate CPU interaction.
Last week, OSS unveiled the newest version of its rack-scale GPU accelerator products, the GPUltima-CI (Composable Infrastructure), at the NVIDIA GPU Technology Conference (GTC 2018). GPUltima-CI allows mixed-workload data centers to greatly increase GPU, networking and storage resource utilization compared to similar hyperconverged server solutions. Unlike a traditional architecture, where applications must make do with the available data center hardware, the OSS GPUltima-CI allows a high-performance application, via the Liqid Command Center API, to dictate the optimal bare-metal hardware configuration for each job. Large, flexible reservoirs of GPUs, NVMe storage and NICs are interconnected by a high-speed, low-latency PCIe switched fabric to banks of dual Intel® Xeon® Scalable processor server nodes in each rack. The Command Center management software then composes these resources into the optimal set of bare-metal servers. This multi-petaflop compute accelerator system is well suited to AI training, deep learning, weather modeling, financial simulations, and data science workloads requiring flexible access to GPU and storage resources.
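Fabric managers of this kind typically expose composition as a machine specification submitted over a REST interface. The sketch below shows what such a request could look like in Python; the endpoint, host name and payload fields are entirely hypothetical and are not the actual Liqid Command Center API.

# Purely illustrative REST-style composition request; the endpoint,
# payload fields and host are hypothetical, NOT Liqid's actual API.
import json
import urllib.request

machine_spec = {
    "name": "ai-train-node-07",
    "cpu_nodes": 1,     # one dual Xeon Scalable host from the rack
    "gpus": 8,          # NVIDIA Volta GPUs pulled from the pool
    "nics": 2,          # network adapters on the RDMA fabric
    "nvme_drives": 4,   # NVMe storage mapped over the PCIe fabric
}

req = urllib.request.Request(
    "http://fabric-manager.example/api/compose",  # hypothetical endpoint
    data=json.dumps(machine_spec).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)  # success once the bare-metal node is composed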
The power-optimized GPUltima-CI rack configuration features up to 32 dual Intel Xeon Scalable compute nodes, 64 network adapters, 48 NVIDIA Volta GPUs and 32 NVMe drives on a 128 Gb/s PCIe switched fabric, allowing a large number of composable server configurations per rack. Using one or many racks, the OSS solution contains the resources to compose the wide variety of GPU, NIC and storage combinations required in today's mixed-workload data center.
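A quick back-of-the-envelope check shows how a mix of composed server shapes fits within those per-rack totals. The requested configurations below are made-up examples; only the rack capacities come from the figures quoted above.

# Feasibility check against the per-rack totals quoted above.
RACK = {"nodes": 32, "nics": 64, "gpus": 48, "nvme": 32}

# Each entry: how many copies of a server shape to compose, and what
# each copy draws from the rack pool. The shapes are made-up examples.
requests = [
    (4, {"nodes": 1, "nics": 2, "gpus": 8, "nvme": 2}),  # 8-GPU AI training
    (8, {"nodes": 1, "nics": 2, "gpus": 1, "nvme": 2}),  # 1-GPU inference
    (8, {"nodes": 1, "nics": 1, "gpus": 1, "nvme": 1}),  # general compute
]

for resource, capacity in RACK.items():
    used = sum(count * shape[resource] for count, shape in requests)
    print(f"{resource}: {used}/{capacity} used")
    assert used <= capacity, f"rack cannot satisfy the {resource} request"

All twenty composed servers fit in a single rack, with the 48 GPUs fully utilized and headroom left on nodes and NICs for further jobs.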
Katie Rivera is marketing communications manager at One Stop Systems.