Date: 2020-08-17
Introduction
High-performance computing (HPC) uses large amounts of computing power and parallel processing techniques to solve complex computational problems and to support research through computer modeling, simulation, and analysis. Machine Learning (ML), Deep Learning (DL), and Artificial Intelligence (AI) use algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. Generally speaking, HPC and AI run on individual systems in separate environments. However, after many years of working with numerous customers, Quanta Cloud Technology (QCT) has found that converged HPC and AI environments can benefit customers while remaining flexible enough to meet their workload demands.
Converged HPC and AI environment
Based on its experience with AI and HPC workloads, QCT designs balanced HPC and AI environments with a flexible infrastructure for customers who may run different workloads. QCT’s advanced hardware and software stacks work synergistically in the converged HPC and AI environment. HPC focuses on more compute-intensive workflows, while AI must handle more data-intensive workflows. These workflows require different environment settings, so QCT identifies where they benefit each other and brings them together to resolve customers’ complex issues. For example, QCT improves AI training through HPC parallel architectures. Based on its experience with customers and continuous improvement, QCT developed the concept of the QCT Platform on Demand (POD), which is ideal for running diverse workloads on one system infrastructure and comes with optimized QCT management and monitoring systems. Below is a diagram of how QCT works with customers to continuously deliver QCT solutions. QCT starts by understanding the customer’s workload and then offers the corresponding hardware and software infrastructure. By tuning the system and optimizing the workload, QCT aims to build a workload-driven design environment for customers.
QCT Platform on Demand (QCT POD)
A proven approach to making customers’ HPC and AI systems productive quickly is to deliver all the components together, integrated and tested. The components should be selected based on the known and anticipated workloads and optimized for these requirements.
QCT Platform on Demand (QCT POD) is a concept for an on-premises, workload-driven, integrated system design. Under this concept, QCT provides various solutions with optimized hardware and software integration for specific workloads. The QCT POD system architecture includes a diverse set of common building blocks, which offer the flexibility and scalability to meet different customers’ demands. The QCT POD also provides management tools to simplify the deployment and management process. With a pre-validated and pre-configured rack-level system design, QCT POD solutions also include suitable power supplies and network cabling to shorten time to implementation for customers.
QCT POD Building Blocks
QCT POD is delivered with all of the building blocks installed, configured, and ready to go. The hardware components are all delivered connected with a high-speed network fabric. The components are all best in class and have been selected to meet the most demanding workload requirements. QCT uses Intel® Next Generation Server technologies that include the Intel® Xeon Scalable Processors, which are recognized as among the highest performing CPUs today, especially for HPC and AI workloads. The Intel® Xeon Platinum 8280 can deliver excellent performance on HPC and AI benchmarks on Quanta servers. Looking forward, when Intel® Optane persistent memory is bundled and the Intel® compiler is implemented, many HPC workloads are expected to get a performance boost from larger amounts of memory.
The software stack is implemented with open source software and specific pre-validated commercial software, selected for maximum performance, reliability, and flexibility. With various components and building blocks, the QCT POD is designed for flexibility and can be modified based on the customer’s workloads. Different CPUs or GPUs can be integrated, various storage options can be selected, and the network fabric can be changed as well. Below is a diagram of the components that can be integrated as part of the QCT POD offering.
Overall, QCT POD delivers excellent performance for a variety of workloads through its flexible building-block design and engineering domain know-how. Below is a diagram of the hardware design that leads customers to realize continuous success.
