In this video from ISC 2016, Figen Ulgen from Intel describes the new Intel HPC Orchestrator. “Intel HPC Orchestrator simplifies the installation, management and ongoing maintenance of a high-performance computing system by reducing the amount of integration and validation effort required for the HPC system software stack. Intel HPC Orchestrator can help accelerate your time to results and value in your HPC initiatives. With Intel HPC Orchestrator, based on the OpenHPC system software stack, you can take advantage of the innovation driven by the open source community – while also getting peace of mind from Intel support across the stack.”
Today Allinea Software announced that the Joint Center for Advanced High Performance Computing (JCAHPC) in Japan will use the Allinea DDT debugger for its new supercomputer. Coming online in December 2016, the new supercomputer, known as Oakforest-PACS, will be the fastest supercomputer system in Japan with 25 PFLOPS on Intel’s Xeon Phi (Knights Landing) manycore processors and the Omni-Path architecture.
Vectorizing code for the Intel Xeon Phi coprocessor is similar in many ways to vectorizing code for the main CPU. The performance improvement from coding smartly and using the available tools can be tremendous, since the Intel Xeon Phi coprocessor can show very large gains in performance due to its extra-wide processing units. “Although it is time consuming to look at each and every loop in a large application, by doing so, and both telling the compiler what to do, and letting the compiler do its work, performance increases can be quite large, leading to shorter run times and/or more complete results.”
The Numerical Algorithms Group (NAG) has engineered NAG C Library algorithms to execute efficiently on Cavium ThunderX ARMv8-A based Workload Optimized Processors. Preliminary results, announced at ISC 2016, show excellent scaling across 96 cores of ThunderX in a dual socket configuration.
Today the Linux Foundation announced a set of technical, leadership and member investment milestones for OpenHPC, a Linux Foundation project to develop an open source framework for High Performance Computing environments. “The OpenHPC community has quickly paved a path of collaborative development that is highly inclusive of stakeholders invested in HPC-optimized software,” said Jim Zemlin, executive director, The Linux Foundation. “To see OpenHPC members include the world’s leading computing labs, universities, and hardware experts, illustrates how open source unites the world’s leading technologists to share technology investments that will shape the next 30+ years of computing.”
Today the Numerical Algorithms Group (NAG) announced the NAG Software Modernization Service. The new service addresses the porting and performance challenges faced by customers wishing to use the capabilities of modern computing systems, such as multi-core CPUs, GPUs and Xeon Phi. NAG HPC software engineering experts modernize the code to enable portability to appropriate architectures, optimize for performance and assure robustness.
SC16 has announced the winner of its Test of Time Award. This year the winning paper is “Automatically Tuned Linear Algebra Software” by Clint Whaley and Jack Dongarra. The paper, which has received hundreds of citations with new citations still appearing, is about ATLAS – an autotuning, optimized implementation of the Basic Linear Algebra Subprograms (BLAS).
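For readers unfamiliar with autotuning, the general idea can be illustrated with a toy sketch: run the same kernel with several candidate parameter values on the target machine, time each one, and keep the fastest. The example below is not ATLAS code and does not reproduce its search strategy; the function names `blocked_matmul` and `autotune`, the matrix size, and the candidate block sizes are all hypothetical choices made for illustration, and pure Python is used only to keep the sketch self-contained.

```python
import time


def blocked_matmul(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) using square tiles of size `block`."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Multiply one tile of `a` by one tile of `b` and accumulate into `c`.
                for i in range(ii, min(ii + block, n)):
                    row_c = c[i]
                    row_a = a[i]
                    for k in range(kk, min(kk + block, n)):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += aik * row_b[j]
    return c


def autotune(n=48, candidates=(4, 8, 16, 48), repeats=2):
    """Time each candidate block size (hypothetical values) and return (best_block, timings)."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, block)
            # Keep the best of several runs to reduce timing noise.
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    # The "tuned" parameter is simply the one that ran fastest on this machine.
    best_block = min(timings, key=timings.get)
    return best_block, timings
```

Calling `best_block, timings = autotune()` returns the fastest block size found along with the measured times. The winner depends on the machine and on timing noise, which is the point of an empirical search: the choice is measured rather than predicted.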
David Bolton from Slashdot shows how ‘embarrassingly parallel’ code can be sped up over 2000x (not percent) by utilizing Intel tools including the Intel Python compiler and OpenMP. “The Intel Distribution for Python* 2017 Beta program is now available. The Beta product adds new Python packages like scikit-learn, mpi4py, numba, conda, tbb (Python interfaces to Intel Threading Building Blocks) and pyDAAL (Python interfaces to Intel Data Analytics Acceleration Library).”
Intel is offering a 4-part summer series of developer training workshops at Stanford University to introduce high performance computing tools.
Chris Mason from Acceleware presented this talk at GTC 2016. “This session will focus on real life examples including an RF powered contact lens, a wireless capsule endoscopy, and a smart watch. The session will also outline the basics of the subgridding algorithm along with the GPU implementation and the development challenges. Performance results will illustrate the significant reduction in computation times when using a localized subgridded mesh running on an NVIDIA Tesla GPU.”