“The move away from the traditional single processor/memory design has fostered new programming paradigms that address multiple processors (cores). Existing single core applications need to be modified to use extra processors (and accelerators). Unfortunately there is no single portable and efficient programming solution that addresses both scale-up and scale-out systems.”
“With three primary network technology options widely available, each with advantages and disadvantages in specific workload scenarios, the choice of solution partner that can deliver the full range of choices together with the expertise and support to match technology solution to business requirement becomes paramount.”
The two methods of scaling processors are based on the method used to scale the memory architecture and are called scaling-out or scale-up. Beyond the basic processor/memory architecture, accelerators and parallel file systems are also used to provide scalable performance. “High performance scale-up designs for scaling hardware require that programs have concurrent sections that can be distributed over multiple processors. Unlike the distributed memory systems described below, there is no need to copy data from system to system because all the memory is globally usable by all processors.”
The TOP500 list is a very good proxy for how different interconnect technologies are being adopted for the most demanding workloads, which is a useful leading indicator for enterprise adoption. The essential takeaway is that the world’s leading and most esoteric systems are currently dominated by vendor specific technologies. The Open Fabrics Alliance (OFA) will be increasingly important in the coming years as a forum to bring together the leading high performance interconnect vendors and technologies to deliver a unified, cross-platform, transport-independent software stack.
To achieve high performance, modern computer systems rely on two basic methodologies to scale resources: scale-up or scale-out. The scale-up in-memory system provides a much better total cost of ownership and can provide value in a variety of ways. “If the application program has concurrent sections then it can be executed in a “parallel” fashion. Much like using multiple bricklayers to build a brick wall. It is important to remember that the amount and efficiency of the concurrent portions of a program determine how much faster it can run on multiple processors. Not all applications are good candidates for parallel execution.”
With the advent of heterogeneous computing systems that combine both main CPUs and connected processors that can ingest and process tremendous amounts of data and run complex algorithms, artificial intelligence (AI) technologies are beginning to take hold in a variety of industries. Massive datasets can now be used to drive innovation in industries such as autonomous driving systems, controlling power grids and combining data to arrive at a profitable decision, for example. Read how AI can now be used in various industries using the latest hardware and software.
Today, high performance interconnects can be divided into three categories: Ethernet, InfiniBand, and vendor specific interconnects. Ethernet is established as the dominant low level interconnect standard for mainstream commercial computing requirements. InfiniBand originated in 1999 to specifically address workload requirements that were not adequately addressed by Ethernet, and vendor specific technologies frequently have a time to market (and therefore performance) advantage over standardized offerings.
A survey conducted by insideHPC and Gabriel Consulting in Q4 of 2105 indicated that nearly 45% of HPC and large enterprise customers would spend more on system interconnects and I/O in 2016, with 40% maintaining spending at the same level as the prior year. For manufacturing, the largest subset representing approximately one third of the respondents, over 60% were planning to spend more and almost 30% maintaining the same level of spending going into 2016 implying the critical value of high performance interconnects.
SGI’s Data Management Framework (DMF) software – when used within personalized medicine applications – provides a large-scale, storage virtualization and tiered data management platform specifically engineered to administer the billions of files and petabytes of structured and unstructured fixed content generated by highly scalable and extremely dynamic life sciences applications.
Accelerated computing continues to gain momentum as the HPC community moves towards Exascale. Our recent Tesla P100 GPU review shows how these accelerators are opening up new worlds of performance vs. traditional CPU-based systems and even vs. NVIDIA’s previous K80 GPU product. We’ve got benchmarks, case studies, and more in the insideHPC Research Report on GPU Accelerators.