The Open Compute Project, initiated by Facebook as a way to increase computing power while lowering the costs associated with hyper-scale computing, has gained a significant industry following. This guide to Open Computing, from the editors of insideHPC, is a series of articles designed to help organizations optimize their HPC environment to achieve higher performance at a lower operating cost. The complete ‘Guide to Open Computing’ is available for download from the insideHPC White Paper Library.
Many organizations struggle to implement technologies that solve their most complex computing challenges. Current algorithms, and those in development, often require tens, hundreds, or even thousands of computing elements. While computer processing power continues to increase each year, constraints remain for IT staff who must satisfy their users’ high-performance computing (HPC) requirements.
While a precise definition of HPC remains elusive, a common one refers to the practice of aggregating computing power greatly in excess of what an individual computer or workstation can deliver. To make this work, network technologies have been developed to link numerous computers (servers) together to handle specific workload requirements. In addition to the computational and networking capability, algorithms have been developed to ensure these distributed resources can be harnessed efficiently and at scale.
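The core idea of aggregating many computing elements can be illustrated on a single machine. The sketch below is a simplified stand-in for a distributed HPC job, assuming Python's standard `multiprocessing` module, with each worker process playing the role of one compute node; real HPC systems would use a framework such as MPI across many servers.

```python
from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker process (a stand-in for one compute node)
    # handles its own slice of the data.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    # Split the input into roughly equal chunks, one per worker.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with Pool(workers) as pool:
        # Farm the chunks out in parallel, then aggregate
        # the partial results -- the essence of scaling out.
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    n = 10_000
    result = parallel_sum_of_squares(list(range(n)))
    # Matches the serial computation.
    assert result == sum(x * x for x in range(n))
```

The decompose-compute-aggregate pattern shown here is the same one that HPC schedulers and message-passing libraries apply across thousands of networked servers.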
Many systems vendors work with end users to customize their high-end computing requirements. However, there are risks to this approach. If a solution is designed and implemented with one vendor, the customer may be locked into that vendor's solutions in the future, preventing a move to the best next-generation technology. Ideally, as HPC building-block technologies improve, customers should be able to capture efficiencies across the HPC landscape to create an optimized solution with higher performance at lower operating costs.
A modern HPC environment consists of (from the inside out) cores in the CPU chip itself, sockets from the major CPU vendors (Intel, AMD, ARM) that connect CPUs to printed circuit boards, metal sheets that back circuit boards so they can be rack-mounted and interconnected, and connections from the racks to any number of storage systems. Once in place, it is difficult to replace any of these components with newer or competing technologies. Replacing a server from one vendor with a server from a different vendor can be accomplished, but only if the servers are exactly the same size and have the same power draw, the same connectivity options, and the same cooling requirements, among other parameters.
Over the next few weeks we’ll explore the following areas of Open Computing:
- Open Computing: Opportunities for increased efficiency
- Open Computing: Density and Power
- Open Computing in Industry Segments
- Open Computing: Vendor Landscape