In this video from the HPC Advisory Council Spain Conference, Dan Olds from OrionX discusses the High Performance Interconnect (HPI) market landscape and provides ratings and rankings of today’s HPI choices. “In this talk, we’ll take a look at the technologies and performance of high-end networking technology and the coming battle between onloading vs. offloading interconnect architectures.”
The big data analytics market has seen rapid growth in recent years. Part of this trend is the increased use of machine learning (Deep Learning) technologies. Indeed, machine learning speed has been drastically increased through the use of GPU accelerators. The central issue facing the HPC market is the same one facing the analytics market: efficient use of the underlying hardware. A position paper from the third annual Big Data and Extreme Computing conference (2015) illustrates the power of co-design in the analytics market.
Achieving better scalability and performance at Exascale will require full data reach. Without this capability, onload architectures force all data to move to the CPU before allowing any analysis. The ability to analyze data everywhere means that every active component in the cluster will contribute to the computing capabilities and boost performance. In effect, the interconnect will become its own “CPU” and provide in-network computing capabilities.
The move to network offloading is the first step in co-designed systems. Servicing the huge number of packets required for modern data rates incurs a large amount of overhead, which can significantly reduce network performance. Offloading network processing to the network interface card helps relieve this bottleneck, along with some others.
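To give a rough sense of the packet load involved, the sketch below estimates how many packets per second a host must service to keep a link busy, and how much time that leaves per packet. The 100 Gb/s line rate and 1500-byte packet size are illustrative assumptions rather than figures from the text above, and framing overhead is ignored.

```python
def packets_per_second(line_rate_gbps: float, packet_bytes: int) -> float:
    """Approximate packets/s needed to saturate a link.

    Ignores protocol headers and inter-packet gaps, so this is only a
    back-of-the-envelope estimate.
    """
    bits_per_packet = packet_bytes * 8
    return line_rate_gbps * 1e9 / bits_per_packet


def ns_per_packet(line_rate_gbps: float, packet_bytes: int) -> float:
    """Time budget, in nanoseconds, available to handle each packet."""
    return 1e9 / packets_per_second(line_rate_gbps, packet_bytes)


# Assumed example values: a 100 Gb/s link carrying 1500-byte packets.
rate = packets_per_second(100, 1500)
budget = ns_per_packet(100, 1500)
print(f"{rate:,.0f} packets/s, about {budget:.0f} ns per packet")
```

Under these assumptions the host would have to handle roughly 8.3 million packets per second, or one every 120 ns or so, which suggests why handing that work to the network interface card can free up the CPU.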
Coming in the second half of 2016: The HPE Apollo 6500 System provides the tools and the confidence to deliver high performance computing (HPC) innovation. The system consists of three key elements: The HPE ProLiant XL270 Gen9 Server tray, the HPE Apollo 6500 Chassis, and the HPE Apollo 6000 Power Shelf. Although final configurations and performance are not yet available, the system appears capable of delivering over 40 teraflop/s double precision, and significantly more in single or half precision modes.
In this video from the 4th Annual MVAPICH User Group, DK Panda from Ohio State University presents: Overview of the MVAPICH Project and Future Roadmap. “This talk will provide an overview of the MVAPICH project (past, present and future). Future roadmap and features for upcoming releases of the MVAPICH2 software family (including MVAPICH2-X, MVAPICH2-GDR, MVAPICH2-Virt, MVAPICH2-EA and MVAPICH2-MIC) will be presented. Current status and future plans for OSU INAM, OEMT and OMB will also be presented.”
“When the history of HPC is viewed in terms of technological approaches, three epochs emerge. The most recent epoch, that of co-design systems, is new and somewhat unfamiliar to many HPC practitioners. Each epoch is defined by a fundamental shift in design, new technologies, and the economics of the day.” “A network co-design model allows data algorithms to be executed more efficiently using smart interface cards and switches. As co-design approaches become more mainstream, design resources will begin to focus on specific issues and move away from optimizing general performance.”
“The ExaFlash Platform is an historic achievement that will reshape the storage and data center industries,” said Thomas Isakovich, CEO and Founder of Nimbus Data. “It offers unprecedented scale (from terabytes to exabytes), record-smashing efficiency (95% lower power and 50x greater density than existing all-flash arrays), and a breakthrough price point (a fraction of the cost of existing all-flash arrays). ExaFlash brings the all-flash data center dream to reality and will help empower humankind’s innovation for decades to come.”
Today the Green500 released their listing of the world’s most energy efficient supercomputers. “Japan’s research institution RIKEN once again captured the top spot with its Shoubu supercomputer. With a rating of 6673.84 MFLOPS/Watt, Shoubu edged out another RIKEN system, Satsuki, the number 2 system that delivered 6195.22 MFLOPS/Watt. Both are “ZettaScaler” supercomputers, employing Intel Xeon processors and PEZY-SCnp manycore accelerators.”
Olaf Weber from SGI presented this talk at LUG 2016. “In collaboration with Intel, SGI set about creating support for multiple network connections to the Lustre filesystem, with multi-rail support. With Intel Omni-Path and EDR InfiniBand driving to 200Gb/s or 25GB/s per connection, this capability will make it possible to start moving data between a single SGI UV node and the Lustre file system at over 100GB/s.”
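The arithmetic behind those multi-rail figures can be sketched as follows. The per-connection rate comes from the quote above; the assumption that throughput scales linearly with the number of rails is an idealization made here for illustration, and the helper names are hypothetical.

```python
import math


def gbits_to_gbytes(gbits_per_s: float) -> float:
    """Convert a rate in Gb/s to GB/s (8 bits per byte)."""
    return gbits_per_s / 8


def rails_needed(target_gbytes_per_s: float, per_rail_gbits_per_s: float) -> int:
    """Smallest number of rails whose combined peak rate meets the target.

    Assumes ideal linear scaling across rails, which real systems
    will not fully achieve.
    """
    per_rail = gbits_to_gbytes(per_rail_gbits_per_s)
    return math.ceil(target_gbytes_per_s / per_rail)


per_rail = gbits_to_gbytes(200)   # 200 Gb/s per connection, as quoted
rails = rails_needed(100, 200)    # rails needed to reach 100 GB/s at peak
print(f"{per_rail:.0f} GB/s per rail, {rails} rails for 100 GB/s")
```

Under this idealized model, four connections reach 100 GB/s at peak, so sustaining more than that would take at least a fifth rail.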