Intel to Distribute SUSE High Performance Computing Stack

“The SUSE and Intel collaboration on Intel HPC Orchestrator and OpenHPC puts this power within reach of a whole new range of industries and enterprises that need data-driven insights to compete and advance. This is an industry-changing approach that will rapidly accelerate HPC innovation and advance the state of the art in a way that creates real-world benefits for our customers and partners.”

Video: Announcing Intel HPC Orchestrator

In this video from ISC 2016, Figen Ulgen from Intel describes the new Intel HPC Orchestrator. “Intel HPC Orchestrator simplifies the installation, management and ongoing maintenance of a high-performance computing system by reducing the amount of integration and validation effort required for the HPC system software stack. Intel HPC Orchestrator can help accelerate your time to results and value in your HPC initiatives. With Intel HPC Orchestrator, based on the OpenHPC system software stack, you can take advantage of the innovation driven by the open source community – while also getting peace of mind from Intel support across the stack.”

Allinea DDT Debugger to be Used for 25 Petaflop Supercomputer at JCAHPC in Japan

Today Allinea Software announced that the Joint Center for Advanced High Performance Computing (JCAHPC) in Japan will use the Allinea DDT debugger for its new supercomputer. Coming online in December 2016, the new supercomputer, known as Oakforest-PACS, will be the fastest supercomputer system in Japan with 25 PFLOPS on Intel’s Xeon Phi (Knights Landing) manycore processors and the Omni-Path architecture.

Helping the Compiler Speed Intel Xeon Phi

Vectorizing code for the Intel Xeon Phi coprocessor is similar in many ways to vectorizing code for the main CPU. The performance improvement when coding smartly and using the tools available can be tremendous, since the Intel Xeon Phi coprocessor can show very large gains in performance due to its extra-wide processing units. “Although it is time consuming to look at each and every loop in a large application, by doing so, and both telling the compiler what to do, and letting the compiler do its work, performance increases can be quite large, leading to shorter run times and/or more complete results.”
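As a rough illustration of "telling the compiler what to do," the sketch below marks two simple loops for vectorization. It is not taken from the article: the function names `scaled_add` and `dot` are hypothetical, and the hints shown (the C99 `restrict` qualifier and the OpenMP `simd` directive) are just one common way to give a compiler this information. With GCC or Clang the directive takes effect when compiling with `-fopenmp-simd`; without that flag it is ignored and the code still runs correctly as scalar code.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * `restrict` promises the compiler that x and y never overlap,
 * and `#pragma omp simd` asks it to vectorize the loop. */
void scaled_add(size_t n, float a, const float *restrict x, float *restrict y)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical example: dot product of x and y.
 * The reduction clause tells the compiler that `sum` may be
 * accumulated in independent vector lanes and combined at the end. */
float dot(size_t n, const float *restrict x, const float *restrict y)
{
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Whether a given loop actually vectorizes, and how much it gains, depends on the compiler, the flags, and the target hardware, so the compiler's vectorization report is the place to check the result.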

NAG Optimizes C and C++ Algorithms for ARM-based Cavium ThunderX Processors

The Numerical Algorithms Group (NAG) has engineered NAG C Library algorithms to execute efficiently on Cavium ThunderX ARMv8-A based Workload Optimized Processors. Preliminary results, announced at ISC 2016, show excellent scaling across 96 cores of ThunderX in a dual socket configuration.

OpenHPC Establishes Leadership & Releases Initial Software Stack

Today the Linux Foundation announced a set of technical, leadership and member investment milestones for OpenHPC, a Linux Foundation project to develop an open source framework for High Performance Computing environments. “The OpenHPC community has quickly paved a path of collaborative development that is highly inclusive of stakeholders invested in HPC-optimized software,” said Jim Zemlin, executive director, The Linux Foundation. “To see OpenHPC members include the world’s leading computing labs, universities, and hardware experts, illustrates how open source unites the world’s leading technologists to share technology investments that will shape the next 30+ years of computing.”

NAG Rolls Out Software Modernization Service

Today the Numerical Algorithms Group (NAG) announced the NAG Software Modernization Service. The new service addresses the porting and performance challenges faced by customers wishing to use the capabilities of modern computing systems, such as multi-core CPUs, GPUs and Xeon Phi. NAG HPC software engineering experts modernize the code to enable portability to appropriate architectures, optimize for performance and assure robustness.

1998 ATLAS Paper Wins SC16 Test of Time Award

SC16 has announced the winner of its Test of Time Award. This year the winning paper is “Automatically Tuned Linear Algebra Software” by Clint Whaley and Jack Dongarra. The paper, which has received hundreds of citations with new citations still appearing, is about ATLAS – an autotuning, optimized implementation of the Basic Linear Algebra Subprograms (BLAS).

Video: Speeding Up Code with the Intel Distribution for Python

David Bolton from Slashdot shows how ‘embarrassingly parallel’ code can be sped up over 2000x (not percent) by utilizing Intel tools including the Intel Python compiler and OpenMP. “The Intel Distribution for Python* 2017 Beta program is now available. The Beta product adds new Python packages like scikit-learn, mpi4py, numba, conda, tbb (Python interfaces to Intel Threading Building Blocks) and pyDAAL (Python interfaces to Intel Data Analytics Acceleration Library).”

Intel Developer Summer Workshops Coming to Stanford

Intel is offering a 4-part summer series of developer training workshops at Stanford University to introduce high performance computing tools.