Today Allinea Software launched the first update to its well-established toolset for debugging, profiling and optimizing high performance code since being acquired by ARM in December 2016. “The V7.0 release provides new integrations for the Allinea Forge debugger and profiler and Allinea Performance Reports and will mean more efficient code development and optimization for users, especially those wishing to take software performance to new levels across Xeon Phi, CUDA and IBM Power platforms,” said Mark O’Connor, ARM Director, Product Management HPC tools.
“Writing and deploying software that exploits the ever increasing computing power of clusters and supercomputers is a demanding challenge – it needs to run fast, and run right, and that’s exactly what our suite of tools is designed to enable,” said David Lecomber, CEO, Allinea. “As part of ARM, we’ll continue to work with the HPC community, our customers and our partners to advance the development of our cross-platform technology, and take advantage of product synergies between ARM’s compilers, libraries and advisory tools and our existing and future debugging and analysis tools. Our combined expertise and understanding of the challenges this market faces will deliver new solutions to this growing ecosystem.”
“The majority of deep learning frameworks provide good out-of-the-box performance on a single workstation, but scaling across multiple nodes is still a wild, untamed borderland. This discussion follows the story of one researcher trying to make use of a significant compute resource to accelerate learning over a large number of CPUs. Along the way we note how to find good multiple-CPU performance with Theano* and TensorFlow*, how to extend a single-machine model with MPI and optimize its performance as we scale out and up on both Intel Xeon and Intel Xeon Phi architectures.”
Are supercomputers practical for Deep Learning applications? Over at the Allinea Blog, Mark O’Connor writes that a recent experiment with machine learning optimization on the Archer supercomputer shows that relatively simple models run at sufficiently large scale can readily outperform more complex but less scalable models. “In the open science world, anyone running a HPC cluster can expect to see a surge in the number of people wanting to run deep learning workloads over the coming months.”
“Being ready with full support for Intel Xeon Phi from day one has been a key strategy for Allinea and underpins our approach for supporting customers, such as Los Alamos National Laboratory on the Trinity system, Argonne National Laboratory on Theta and NERSC on Cori, where work is now underway to port code and get applications ready for more complex science on a larger scale.”
“Science problems are becoming increasingly complex in all areas from physics and bioinformatics to engineering,” said Siegfried Hoefinger, High Performance Computing Specialist at VSC. “Bigger is better, but inefficiency will always limit what you can achieve. The Allinea tools will enable us to quickly establish the root cause of bottlenecks and understand the markers for inefficient code. By doing so we’re helping to prove the case for modernization, can start to eliminate inefficiencies and exploit latent capacity to its full effect.”
Today Allinea Software announces availability of its new software release, version 6.1, which offers full support for programming parallel code on Nvidia’s Pascal GPU architecture and CUDA 8. “The addition of Allinea tools into the mix is an exciting one, enabling teams to accurately measure GPU utilization, employ smart optimization techniques and quickly develop new CUDA 8 code that is bug and bottleneck free,” said Mark O’Connor, VP of Product Management at Allinea.
Today, Allinea announced that the company will be exhibiting at XSEDE16, July 17-21, in Miami. The conference will attract an audience across industry and academia to discuss the key themes of diversity, big data and science at scale. “Our tools are used extensively across the XSEDE user base so we’re delighted to be extending the value they bring by giving practical advice for getting the best out of infrastructure capabilities through software tuning, especially given the addition of support for the full Intel Xeon Phi family in our new v6.1 software release,” said Rob Rick, VP Americas for Allinea.
“Our latest product enhancements will solidify our customers’ investment in the next generation Intel Xeon Phi processor,” said Mark O’Connor, VP Product Management at Allinea. “‘Knights Landing’ has the potential to unleash new capabilities for HPC code users and our new release brings a powerful debugger, profiler and performance reports for tackling the essential preparatory work needed to optimize legacy code and realize the processor’s true potential for reducing software run times.”
The Flemish Supercomputer Center (VSC) is planning the deployment of a new NEC cluster that will represent Belgium’s largest investment in HPC to date. To help VSC unleash the potential of the system, Allinea software tools will be used to speed up code performance. “We are delighted to be supporting VSC in providing better education to its users around code efficiency,” said David Lecomber, CEO and Founder of Allinea. “The fact of the matter is, without visibility of code performance, researchers cannot get the full value from HPC. By appreciating how their code makes a difference to project delivery, researchers can achieve more for less cost. By underlining this best practice, VSC’s approach is one that is refreshing and makes great economic sense.”