Practical Hardware Design Strategies for Modern HPC Workloads


This special research report sponsored by Tyan discusses practical hardware design strategies for modern HPC workloads. As hardware has continued to develop, technologies like multi-core, GPU, NVMe, and others have made new application areas possible. These application areas include accelerator-assisted HPC, GPU-based Deep Learning, and Big Data Analytics systems. Unfortunately, implementing a general-purpose balanced system solution is not possible for these applications. To achieve the best price-to-performance in each of these application verticals, attention to hardware features and design is essential.

This technology guide, insideHPC Special Research Report: Practical Hardware Design Strategies for Modern HPC Workloads, shows how to get your results faster by partnering with Tyan.

Executive Summary

Many new technologies used in High Performance Computing (HPC) have made new application areas possible. Advances like multi-core, GPU, NVMe, and others have created application verticals that include accelerator-assisted HPC, GPU-based Deep Learning, fast storage and parallel file systems, and Big Data Analytics systems.

The verticals can be broken into three general design types:

  1. Accelerated HPC Computation – includes both traditional HPC and Deep Learning systems
  2. IO-Heavy HPC Computing – includes systems that provide fast NVMe implementations for local IO or as part of a parallel file system
  3. Big Data (Database) Computing – includes systems designed for high density bulk storage of large amounts of data

Various design goals and a discussion of balanced vs. centralized PCIe topology are explained.

IO-Heavy applications should consider solid state, U.2-connected NVMe devices that provide up to 4 GB/s of throughput. An excellent starting point for an IO-Heavy computing system is the TYAN Thunder SX GT62H-B7106 platform.
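To see where a figure of roughly 4 GB/s per device can come from, the short sketch below estimates the usable bandwidth of a single drive link. It assumes the U.2 device is attached over a PCIe 3.0 x4 link (8 GT/s per lane with 128b/130b encoding); the report does not state the PCIe generation, so treat that as an assumption, and note that the function and constant names are illustrative only.

```python
def pcie_link_bandwidth_gb_s(transfer_rate_gt_s: float,
                             lanes: int,
                             encoding_efficiency: float) -> float:
    """Return usable one-direction link bandwidth in GB/s (10^9 bytes/s).

    Only line encoding overhead is accounted for; protocol overhead
    (packet headers, flow control) lowers real-world throughput further.
    """
    bits_per_second = transfer_rate_gt_s * 1e9 * lanes * encoding_efficiency
    return bits_per_second / 8 / 1e9


# Assumed link parameters (not stated in the report): PCIe 3.0, four lanes.
GEN3_RATE_GT_S = 8.0          # giga-transfers per second, per lane
GEN3_ENCODING = 128 / 130     # 128b/130b line encoding
U2_LANES = 4                  # a U.2 NVMe drive typically uses four lanes

u2_bandwidth = pcie_link_bandwidth_gb_s(GEN3_RATE_GT_S, U2_LANES, GEN3_ENCODING)
print(f"Estimated PCIe 3.0 x4 bandwidth: {u2_bandwidth:.2f} GB/s")
```

Under these assumptions the link tops out just below 4 GB/s, which is consistent with the "up to 4 GB/s" figure quoted above; sustained throughput of an actual drive depends on the device and workload.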

Big Data (and database) computing requires both high performance and bulk storage using 3.5 inch spinning disk drives. The TYAN Thunder SX GT93-B7106 chassis provides a solid platform to create or grow a Big Data computing system.

In terms of accelerated HPC computing, the TYAN Thunder HX FT83-B7119 is a 10-GPU supercomputing system in a compact 4U rack-mount chassis. Depending on application needs, the Thunder HX FT83-B7119 can handle both HPC and Deep Learning applications and is available in four versions based on PCIe bus routing topology and storage options.

Introduction and Background

While High Performance Computing (HPC) hardware and software have become much more turn-key than in the past, the choice of hardware is still important for optimal performance. Traditional clustered HPC systems have been built from off-the-shelf servers using x86 processors and high speed networks. End users often contributed to designs based on their application needs. For instance, systems designed on a fixed budget had to strike a balance between the number of processors (cores), the amount of memory, and the quality of the interconnect. A high performance interconnect such as InfiniBand was more expensive than traditional Ethernet (usually 10 GbE), and thus its inclusion reduced the number of servers in the cluster (or the cores and/or amount of memory per server).
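The fixed-budget trade-off described above can be illustrated with a few lines of arithmetic. The sketch below is only an illustration: every price, the budget, and the core count are hypothetical placeholders chosen to show the effect, not real pricing for any vendor or product, and the helper name is made up for this example.

```python
def affordable_nodes(budget: int, server_cost: int,
                     interconnect_cost_per_node: int) -> int:
    """Number of whole nodes that fit in the budget.

    The per-node interconnect cost is assumed to bundle the adapter,
    cable, and that node's share of switch ports.
    """
    return budget // (server_cost + interconnect_cost_per_node)


# Hypothetical figures for illustration only (not real prices).
BUDGET = 500_000
SERVER_COST = 8_000
CORES_PER_SERVER = 32
INTERCONNECT_COST_PER_NODE = {
    "10 GbE Ethernet": 500,
    "InfiniBand": 2_000,
}

results = {}
for name, per_node_cost in INTERCONNECT_COST_PER_NODE.items():
    nodes = affordable_nodes(BUDGET, SERVER_COST, per_node_cost)
    results[name] = (nodes, nodes * CORES_PER_SERVER)
    print(f"{name}: {nodes} nodes, {nodes * CORES_PER_SERVER} cores")
```

With these placeholder numbers, the more expensive interconnect buys fewer servers and therefore fewer total cores for the same budget. Whether that is the better choice depends on how communication-bound the applications are.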

Often, a balanced system was designed that would provide reasonable performance across the spectrum of user applications. This approach generally worked, and many applications were successfully deployed on these systems. As hardware has continued to develop, technologies like multi-core, GPU, NVMe, and others have made new application areas possible. These application areas include accelerator-assisted HPC, GPU-based Deep Learning, and Big Data Analytics systems. Unfortunately, implementing a general-purpose balanced system solution is not possible for these applications. To achieve the best price-to-performance in each of these application verticals, attention to hardware features and design is essential.

Over the next few weeks, we will explore the following topics surrounding practical hardware design strategies for modern HPC workloads and how you can get your results faster by partnering with Tyan:

  • Executive Summary, Introduction and Background
  • Differentiation in Modern HPC Workloads
  • Working Design Strategies, Conclusion

Download the complete insideHPC Special Research Report: Practical Hardware Design Strategies for Modern HPC Workloads, courtesy of Tyan.