In this special guest feature, Bill Mannel from Hewlett Packard Enterprise writes that the upcoming Intel HPC Developer Conference in Salt Lake City is a great opportunity to learn about code modernization for the next generation of high performance computing applications. “As computing systems grow increasingly complex and new architecture designs become mainstream, training developers to write code which runs on future HPC systems will require a collaborative environment and the expertise of the best and brightest in the industry.”
Today the Numerical Algorithms Group announces the latest version of its flagship software, the NAG Library, Mark 26. In this release, NAG has introduced an Optimization Modeling Suite for linear and nonlinear semidefinite programming and general nonlinear programming. It also features new routines in the important computational areas of Nearest Correlation Matrix and Quadrature.
In this video from the HPC Advisory Council Spain Conference, Martin Hilgeman from Dell Technologies gives a detailed overview of how to approach code optimization by increasing parallelism. “Martin Hilgeman brings perspectives of a system builder to the massively parallel performance discussion – examining the continuous advances in multi-core architectures and its impact on users and computational work.”
“Science problems are becoming increasingly complex in all areas from physics and bioinformatics to engineering,” said Siegfried Hoefinger, High Performance Computing Specialist at VSC. “Bigger is better, but inefficiency will always limit what you can achieve. The Allinea tools will enable us to quickly establish the root cause of bottlenecks and understand the markers for inefficient code. By doing so we’re helping to prove the case for modernization, can start to eliminate inefficiencies and exploit latent capacity to its full effect.”
Sure, your code seems fast, but how do you know if you are leaving potential performance on the table? Recognized HPC experts Georg Hager and Gerhard Wellein will teach a tutorial on Node-Level Performance Engineering at SC16. The session will take place from 8:30am to 5:00pm on Sunday, Nov. 13 in Salt Lake City.
This week Minimal Metrics announced an early-adopter program for PerfMiner, which uses lightweight, pervasive performance data collection technology, automates the collection, and mines the data for key performance indicators. These indicators were developed through Minimal Metrics’ extensive experience tuning HPC and enterprise application performance and are presented in an audience-specific, drill-down hierarchy that provides accountability for site productivity down to the performance of individual application threads.
Vectorization and threading are critical to using innovative hardware products such as the Intel Xeon Phi processor. Using tools early in the design and development process to identify where vectorization can be used or improved will lead to increased performance of the overall application. Modern tools can be used to determine what might be blocking compiler vectorization and the potential gain from the work involved.
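One common reason a compiler declines to vectorize a loop is a loop-carried dependency, where an iteration needs a value produced by the previous one. The sketch below is only an illustration of that data-dependence pattern and is not taken from the article: it is written in Python for readability (the compilers in question typically target C, C++ or Fortran), and the function names `scale` and `running_sum` are hypothetical. It runs each loop in forward and in reverse order; when the order does not change the result, the iterations are independent and could in principle be processed several at a time.

```python
def scale(a, b, order):
    """Independent iterations: out[i] depends only on a[i] and b[i]."""
    out = [0.0] * len(a)
    for i in order:
        out[i] = 2.0 * a[i] + b[i]
    return out


def running_sum(a, order):
    """Loop-carried dependency: out[i] reads out[i - 1] written by an earlier iteration."""
    out = list(a)
    for i in order:
        if i > 0:
            out[i] = out[i - 1] + a[i]
    return out


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
forward = range(len(a))
backward = range(len(a) - 1, -1, -1)

# Same result in either order: the iterations do not depend on one another.
independent_ok = scale(a, b, forward) == scale(a, b, backward)

# Different result when the order changes: each iteration depends on the last.
dependent_ok = running_sum(a, forward) == running_sum(a, backward)

print("independent loop is order-insensitive:", independent_ok)
print("running sum is order-insensitive:", dependent_ok)
```

Reordering is a rough stand-in for what a vectorizing compiler must prove before it can process several iterations at once. Real tools also report other blockers, such as possible pointer aliasing or non-unit-stride memory access, which this sketch does not cover.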
Six application development teams from NERSC gathered at Intel in early August for a marathon “dungeon session” designed to help tweak their codes for the next-generation Intel Xeon Phi Knights Landing manycore architecture and NERSC’s new Cori supercomputer. “We try to prepare ahead of time to bring the types of problems that can only be solved with the experts at Intel and Cray present: deep questions about the architecture and how applications use the Xeon Phi processor. It’s all geared toward optimizing the codes to run on the new manycore architecture and on Cori.”
“Parallel software and parallel hardware, used together, will give the best results for an application. If the application is serial in nature, and the processor is serial, then there will obviously not be a great gain in performance. When the application is parallelized, but the processor is serial, again, no great gain. A third combination is when the application is serial and the processing is parallel. Since the application cannot take advantage of the increased power of the hardware, there will not be a great performance boost. The best and really only solution is to modify the application to run in parallel, using high performing parallel hardware.”
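The four combinations described in the quote can be put into rough numbers with Amdahl's law, which the article does not mention by name and which is used here only as a simple model. In this sketch, `parallel_fraction` (the share of runtime that can run in parallel) and `workers` (the number of hardware execution units) are assumed parameters, and the values 0.95 and 64 are arbitrary illustrations, not measurements.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup from Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# (parallel_fraction, workers) for each combination; values are illustrative.
cases = {
    "serial app, serial hardware": (0.0, 1),
    "parallel app, serial hardware": (0.95, 1),
    "serial app, parallel hardware": (0.0, 64),
    "parallel app, parallel hardware": (0.95, 64),
}

for label, (fraction, workers) in cases.items():
    print(f"{label}: {amdahl_speedup(fraction, workers):.2f}x")
```

Under these assumptions only the last case shows a speedup above 1x, roughly 15x, and even then the remaining 5% of serial work keeps it well below the 64x that the hardware could offer.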
A research team at the Ohio Supercomputer Center (OSC) is beginning the task of modernizing a software package that uses large-scale 3-D modeling for fatigue and fracture analysis, primarily in metals. “The research is a result of OSC being selected as an Intel Parallel Computing Center. The Intel PCC program provides funding to universities, institutions and research labs to modernize key community codes used across a wide range of disciplines to run on current state-of-the-art parallel architectures. The primary focus is to modernize applications to increase parallelism and scalability through optimizations that leverage cores, caches, threads and vector capabilities of microprocessors and coprocessors.”