“In this presentation, we will discuss several important goals and requirements of portable standards in the context of OpenMP. We will also encourage audience participation as we discuss and formulate the current state-of-the-art in this area and our hopes and goals for the future. We will start by describing the current and next generation architectures at NERSC and OLCF and explain how the differences require different general programming paradigms to facilitate high-performance implementations.”
The High Performance Conjugate Gradients (HPCG) benchmark continues to gain traction in the high-performance computing community. “HPCG is designed to complement the traditional High Performance Linpack (HPL) benchmark used as the official metric for ranking the top 500 systems,” said Sandia National Laboratories researcher Mike Heroux, who developed the HPCG program in collaboration with Jack Dongarra and Piotr Luszczek from the University of Tennessee.
Scientists from the Heat and Mass Transfer Technological Center (CTTC) at the Technical University of Catalonia in Spain have harnessed the extreme performance of the Mira supercomputer with their in-house multi-physics CFD code. The achievement is the result of a collaboration between Allinea Software and Argonne National Laboratory on scalable debugging for the high-end system.
“The goal of each hackathon is for current or prospective user groups of large hybrid CPU-GPU systems to send teams of at least 3 developers along with either (1) a (potentially) scalable application that needs to be ported to GPU accelerators, or (2) an application running on accelerators which needs optimization. There will be intensive mentoring during this 5-day hands-on workshop, with the goal that the teams leave with applications running on GPUs, or at least with a clear roadmap of how to get there. Our mentors come from national laboratories, universities and vendors, and besides having extensive experience in programming with OpenACC/CUDA, many of them develop the GPU-capable compilers and help define the OpenACC standard.”
Today Allinea released version 6.0 of its HPC development tool suite, Allinea Forge and Performance Reports. Building on its commitment to serving the scientific HPC community, Allinea demonstrated the new features at SC15 last month in Austin.
“OpenHPC is a collaborative, community effort that initiated from a desire to aggregate a number of common ingredients required to deploy and manage High Performance Computing Linux clusters including provisioning tools, resource management, I/O clients, development tools, and a variety of scientific libraries. Packages provided by OpenHPC have been pre-built with HPC integration in mind with a goal to provide re-usable building blocks for the HPC community.”
“Just as representative benchmarks like HPCG are set to replace Linpack, so a focus on software is taking over. From industry analysts to users at SC15 we heard that software is the number one challenge and the number one opportunity to have world-class impact.”
“We’ve tailored our story for the HPC developers here, who are really worried about applications and performance of applications. What’s really happened traditionally is that the single-threaded applications had not really been able to take advantage of the multi-core processor-based server platforms. So they’ve not really been getting the optimized platform and they’ve been leaving money on the table, so to speak. Because when you can optimize your applications for parallelism, you can take advantage of these multi-processor server platforms. And you can get sometimes up to 10x performance boost, maybe sometimes 100x, we’ve seen some financial services applications, or 3x for chemistry types of simulations as an example.”
In this video from SC15, Rich Brueckner from insideHPC talks to contestants in the Student Cluster Competition. Using hardware loaners from various vendors and Allinea performance tools, nine teams went head-to-head to build the fastest HPC cluster.
Today Russia’s RSC Group announced that Team TUMuch Phun from the Technical University of Munich (TUM) won the Highest Linpack Award in the SC15 Student Cluster Competition. The enthusiastic students achieved 7.1 Teraflops on the Linpack benchmark using an RSC PetaStream cluster with computing nodes based on Intel Xeon Phi. The TUM student team took 3rd place overall among the 9 teams that participated in the SCC at SC15, and it was the only European representative in the challenge.