This is the second article in a series taken from the insideHPC Guide to Flexible HPC.
Organizations that implement high-performance computing (HPC) technologies have a wide range of requirements. From small manufacturing suppliers to national research institutions, using significant computing technologies is critical to creating innovative products and leading-edge research. No two HPC installations are the same.
For maximum return, budget, software requirements, performance and customization must all be considered before an HPC environment is installed and put into operation.
Requirements in HPC Environments
Organizations that require HPC technologies will have a wide range of requirements, depending on their workloads. While some applications can run on hundreds or even thousands of cores, other applications cannot take advantage of more than one core.
Various programming techniques allow applications to take advantage of more than one core. Within a server that is running a single OS, OpenMP directives can be used to distribute threads across cores. However, when an application requires more cores than are available in a single server, the Message Passing Interface (MPI) API is used to distribute the parts of the workload across servers and keep track of them. Within each server, either OpenMP or MPI can then be used.
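The split between MPI across servers and OpenMP within a server can be sketched as simple arithmetic. The function below is illustrative only: `plan_hybrid_layout` is a hypothetical helper, and it assumes one MPI rank per server with OpenMP threads filling that server's cores, which is one common layout rather than the only one.

```python
def plan_hybrid_layout(total_cores_needed: int, cores_per_server: int) -> dict:
    """Split a job into MPI ranks (one per server) and OpenMP threads per rank.

    Assumption for this sketch: exactly one MPI rank runs on each server.
    """
    if total_cores_needed < 1 or cores_per_server < 1:
        raise ValueError("core counts must be positive")

    # Servers needed, rounded up; each server hosts one MPI rank here.
    servers = (total_cores_needed + cores_per_server - 1) // cores_per_server

    # Spread the requested cores evenly over the ranks, rounded up.
    threads_per_rank = (total_cores_needed + servers - 1) // servers

    return {
        "mpi_ranks": servers,
        "omp_threads_per_rank": threads_per_rank,
        "uses_mpi": servers > 1,  # a single server can rely on OpenMP alone
    }


# A 16-core job fits in one 44-core server; a 200-core job does not.
small = plan_hybrid_layout(16, 44)
large = plan_hybrid_layout(200, 44)
print(small)
print(large)
```

In a real deployment the two numbers would typically be handed to the MPI launcher and to the `OMP_NUM_THREADS` environment variable; the exact launcher options depend on the MPI implementation and are not shown here.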
Single-socket systems suit applications that cannot easily take advantage of multi-core systems. Dual-socket systems are typically the most popular because the core count (up to 44 cores per system) is sufficient for many applications without being excessive. Four-socket systems are ideal for applications that can scale significantly yet still require a single OS installation. Eight-socket systems are available for the most demanding applications, but do come with a cost overhead. One-, two-, four- and eight-socket systems basically perform the same tasks, with more parallelism available as the socket count grows, but the amount of memory available using similar dual in-line memory modules (DIMMs) can push the decision toward a larger system. With more CPU sockets on the board, more memory slots are available, thus increasing the size of the data that an application can address.
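How core count and memory capacity grow with socket count can be made concrete with a small calculation. The 22 cores per socket follows from the article's figure of 44 cores in a dual-socket system; the DIMM slot count and DIMM size are assumptions chosen purely for illustration, as is the idea that every socket holds the same CPU.

```python
from typing import Tuple

CORES_PER_SOCKET = 22       # from the article: up to 44 cores per dual-socket system
DIMM_SLOTS_PER_SOCKET = 12  # assumption for illustration; varies by board
DIMM_SIZE_GB = 32           # assumption for illustration; varies by module


def system_capacity(sockets: int) -> Tuple[int, int]:
    """Return (maximum cores, maximum memory in GB) for a given socket count."""
    cores = sockets * CORES_PER_SOCKET
    memory_gb = sockets * DIMM_SLOTS_PER_SOCKET * DIMM_SIZE_GB
    return cores, memory_gb


# Compare the four system sizes discussed above, using identical DIMMs.
capacities = {s: system_capacity(s) for s in (1, 2, 4, 8)}
for sockets, (cores, memory_gb) in capacities.items():
    print(f"{sockets}-socket: up to {cores} cores, {memory_gb} GB")
```

Under these assumed numbers, memory scales linearly with sockets, so an application whose data set does not fit in a dual-socket system may justify a four- or eight-socket system even if it does not need the extra cores.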
Many HPC installations will need a variety of servers. With intelligent resource management software, a combination of one-, two-, four- and eight-socket systems can be put to maximum use. A large environment where multiple jobs run simultaneously, some low in parallelism and some high, is best served by installing servers with a variety of capabilities.
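A toy version of the placement decision that resource management software makes might look like the following. This is a sketch under stated assumptions, not how any particular scheduler works: the job names and core counts are hypothetical, 22 cores per socket is carried over from the article's dual-socket figure, and real schedulers also weigh memory, queue state and policy.

```python
CORES_PER_SOCKET = 22          # carried over from the 44-core dual-socket figure
SERVER_CLASSES = (1, 2, 4, 8)  # socket counts available in the mixed environment


def smallest_fitting_server(cores_needed: int):
    """Return the smallest socket count whose single-OS system holds the job.

    Returns None when no single server is large enough, in which case the
    job would have to span several servers (for example with MPI).
    """
    for sockets in SERVER_CLASSES:
        if cores_needed <= sockets * CORES_PER_SOCKET:
            return sockets
    return None


# Hypothetical mix of jobs: name -> cores requested.
jobs = {
    "serial_postprocess": 1,
    "cfd_solver": 40,
    "genome_assembly": 80,
    "crash_sim": 150,
    "climate_model": 500,
}

placement = {name: smallest_fitting_server(cores) for name, cores in jobs.items()}
for name, sockets in placement.items():
    target = f"{sockets}-socket server" if sockets else "multiple servers"
    print(f"{name}: {target}")
```

Choosing the smallest system that fits keeps the larger, more expensive systems free for the jobs that actually need them.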
Recently, a new class of computing hardware has become available that can greatly accelerate certain applications. While main CPUs now top out (as of May 2016) at 22 cores per socket, accelerators can contain more than 100 cores per system. The Intel® Xeon Phi™ coprocessor is an example of this new type of hardware, which benefits applications that are highly parallel and contain algorithms that can be spread over hundreds of threads. While not typical, some benchmarks show a more than 100-fold improvement in the performance of an application when using an accelerator. For very compute-intensive applications, it may be critical to house more than one of these accelerators in a single system. This would give an application access to between 200 and 300 computing elements in a single enclosure under a single operating system. A tight coupling of the CPUs with the coprocessor can enable significant acceleration of certain classes of applications.
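Why only highly parallel applications see large gains from hundreds of computing elements can be illustrated with Amdahl's law. The article does not name a performance model, so this is offered as one standard way to reason about it; the parallel fractions and the figure of 240 computing elements (chosen from within the 200 to 300 range mentioned above) are assumptions for illustration.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup under Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


COMPUTING_ELEMENTS = 240  # assumption: a value inside the 200-300 range

# Compare applications with different fractions of parallelizable work.
for p in (0.50, 0.90, 0.99, 0.999):
    speedup = amdahl_speedup(p, COMPUTING_ELEMENTS)
    print(f"parallel fraction {p:.3f}: ideal speedup {speedup:.1f}x")
```

The model ignores data transfer between CPU and coprocessor and other overheads, so measured results will generally be lower. The qualitative point stands, though: an application that is only half parallel can never run more than twice as fast, no matter how many computing elements it is given.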
In coming weeks, this series of articles will explore:
- Challenges in HPC Environments
- Requirements in HPC Environments (this article)
- TYAN® Solutions
- Successful Implementations