“At the heart of the Intel Xeon Phi coprocessor is a chip that does the heavy computation. The current version uses up to 16 channels of GDDR5 memory. An interesting note is that up to 32 memory devices can be used by placing memory on both sides of the motherboard. This doubles the effective memory availability compared to more conventional designs.”
With the release of high-wattage processors, liquid cooling is becoming a necessity for HPC data centers. Liquid cooling’s ability to remove heat directly from these high-wattage components within the servers is well established. However, facilities management sometimes has concerns that need to be addressed before liquid cooling is introduced to the data center.
“High performance systems now typically include a host processor and a coprocessor. The role of the coprocessor is to give the developer and the user the ability to significantly speed up simulations, provided the algorithm can run with a high degree of parallelization and can take advantage of a SIMD architecture. The Intel Xeon Phi coprocessor is an example of a coprocessor that is used in many HPC systems today.”
Advancements in video technology have slowly pushed applications like video editing, video rendering and video storage editing into the High Performance Computing world. There are many different video editing programs that can cut, trim, re-sequence, and add sound, transitions and special effects to video. But with the introduction of 4K/8K video, a simple laptop isn’t powerful enough on its own anymore, especially for online editing.
The ability to develop applications independent of the hardware available at run time is an important concept that lets developers take advantage of the latest processing and coprocessing power. Not having to make run-time checks on hardware availability is critical to a smooth-running HPC environment.
“Native execution is good for applications that perform operations that map to parallelism in either threads or vectors. However, running natively on the coprocessor is not ideal when the application must do a lot of I/O or runs large parts of its work serially. Offloading has its own issues. Asynchronous allocation, copying, and deallocation of data can be performed, but it is complex. Another challenge with offloading is that it requires memory blocking. Overall, it is important to understand the application, the workflow within the application, and how to use the Intel Xeon Phi coprocessor most effectively.”
For decades, Intel has been enabling insight and discovery through its technologies and contributions to parallel computing and High Performance Computing (HPC). Central to the company’s most recent work in HPC is a new design philosophy for clusters and supercomputers called Intel® Scalable System Framework (Intel® SSF), an approach designed to enable sustained, balanced performance as the community pushes towards the Exascale era.
Intel® Cilk™ Plus is an extension to C and C++ that offers a quick and easy way to harness the power of both multicore and vector processing. The three Intel Cilk Plus keywords provide a simple yet surprisingly powerful model for parallel programming, while runtime and template libraries offer a well-tuned environment for building parallel applications.
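The three keywords in question are `cilk_spawn`, `cilk_sync`, and `cilk_for`. As a rough illustration of the fork-join pattern that the first two express, here is a minimal sketch in standard C++ that uses `std::async` as a stand-in. This is an assumption made for portability, not Cilk Plus itself: Cilk Plus schedules spawned work with a work-stealing runtime, whereas `std::async` with `std::launch::async` simply starts a thread. The function names and the `depth` cutoff are hypothetical and chosen only for this example.

```cpp
#include <cstdint>
#include <future>

// Plain serial version, used once the parallel recursion is deep enough.
std::uint64_t fib_serial(unsigned n) {
    return n < 2 ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

// Fork-join Fibonacci. `depth` (a hypothetical tuning knob) limits how many
// levels of the recursion launch a new thread.
std::uint64_t fib_parallel(unsigned n, unsigned depth = 3) {
    if (n < 2) return n;
    if (depth == 0) return fib_serial(n);

    // Roughly "x = cilk_spawn fib(n - 1);": start one branch asynchronously.
    std::future<std::uint64_t> x = std::async(
        std::launch::async,
        [n, depth] { return fib_parallel(n - 1, depth - 1); });

    // The calling thread continues with the other branch in the meantime.
    std::uint64_t y = fib_parallel(n - 2, depth - 1);

    // Roughly "cilk_sync;": wait for the spawned branch before combining.
    return x.get() + y;
}
```

Calling `fib_parallel(20)` returns 6765, the same value as `fib_serial(20)`. The depth cutoff is there because starting a thread per call would swamp the useful work; a Cilk Plus runtime is designed to make that kind of manual throttling less necessary.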
“Tasks keep the CPUs busy. When a core is working, rather than waiting for work to be sent to it, the application progresses towards its conclusion. One caveat is that tasks and threads remain on the system they were created on. Tasks that use a shared memory space only work within the shared memory segment that the processing cores can reach. Shared memory on the CPU side of the system is separate from the shared memory on the coprocessor. Threads remain on the part of the system where they started.”
Sandia National Laboratories has already seen the benefits from a major Asetek liquid cooled HPC system that has been in use for over twelve months. The 600 teraflop Sky Bridge Supercomputer with 1,848 nodes was installed using Asetek D2C in a Cray CS300-LC supercomputer cluster. With RackCDU D2C, air heat-load was cut by more than 70%, making mechanical upgrade of data center cooling unnecessary and allowing more investment in compute.