Interview: AutoTune – Automated Optimization and Tuning

Recently, I had the pleasure of conducting a virtual (email) interview with Professor Michael Gerndt of the Technische Universität München to learn more about the projects his group is involved in. Modernizing and optimizing applications is key to taking advantage of the latest hardware innovations, such as the growing number of cores on the main CPU and the hundreds to thousands of cores becoming available on various accelerators. Below is the discussion with Dr. Gerndt.

insideHPC: What is AutoTune?

Michael Gerndt, Technische Universität München

Michael Gerndt: AutoTune is a European Commission (EC) funded FP7 research project carried out by an international consortium of scientific institutions and industrial partners and coordinated by Technische Universität München (TUM), Germany. It started in October 2011 and ends in April 2015.

The main goal of AutoTune is the automatic optimization of HPC applications, targeting both performance and energy efficiency. AutoTune implemented the Periscope Tuning Framework, whose R1.1 release is available as open source.

Besides TUM, the partners in the project are the Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences, the Universitat Autònoma de Barcelona (UAB), the Irish Centre for High-End Computing (ICHEC) at the University of Galway, and the University of Vienna, with IBM as an associated partner.

insideHPC: What are the expected goals in terms of performance improvement, efficiency?

Michael Gerndt: Any manual tuning of an application has, at its core, some kind of search strategy. One first chooses an aspect to tune, and then tests different settings and optimization configurations until a preset target is reached or no further improvement can be found.
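The search loop described here can be illustrated with a minimal Python sketch. It is not AutoTune code: the function `tune`, the parameter names `opt_level` and `threads`, and the made-up cost function `fake_runtime` are all assumptions chosen for illustration. The sketch tries every configuration and stops early once a preset target is reached.

```python
from itertools import product


def tune(measure, search_space, target=None):
    """Try every configuration; stop early once `target` is reached.

    measure      -- callable mapping a configuration dict to a cost (lower is better)
    search_space -- dict mapping each parameter name to its candidate values
    target       -- optional cost at which the search stops early
    """
    names = list(search_space)
    best_config, best_cost = None, float("inf")
    for values in product(*(search_space[n] for n in names)):
        config = dict(zip(names, values))
        cost = measure(config)
        if cost < best_cost:
            best_config, best_cost = config, cost
        if target is not None and best_cost <= target:
            break  # preset target reached, no need to search further
    return best_config, best_cost


# Hypothetical stand-in for building, running and timing the application.
def fake_runtime(config):
    return 10.0 - 2.0 * config["opt_level"] + 0.5 * abs(config["threads"] - 8)


space = {"opt_level": [0, 1, 2, 3], "threads": [4, 8, 16]}
best, cost = tune(fake_runtime, space)
print(best, cost)
```

An exhaustive sweep like this grows quickly with the number of parameters, which is why expert knowledge that narrows the search space matters in practice.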

The goals of AutoTune are to

  • automate the tuning process
  • use expert knowledge to speed up tuning and to improve its results
  • target a variety of tuning aspects (see the examples below)
  • provide support for standard HPC programming interfaces
  • enable tuning for homogeneous as well as accelerated HPC systems.

Examples of tuning aspects include selecting the best combination of compiler flags and tuning the parameters of the MPI library. A typical MPI parameter is the so-called eager limit, the message-size threshold that determines which communication protocol is used. A larger threshold allows longer messages to be sent faster but requires considerable additional memory in the MPI library. The best value depends on the sizes of the messages the application sends. AutoTune recently demonstrated improvements of up to 38% from selecting the right compiler flags and up to 60% from tuning the MPI library parameters.
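The eager-limit trade-off can be made concrete with a toy model in Python. Everything in it is an assumption for illustration: the latency and bandwidth constants, the idea that a message above the limit pays one extra handshake round trip, and the buffer-memory formula are not taken from any particular MPI library or from AutoTune.

```python
def comm_time(message_sizes, eager_limit, latency=2e-6, bandwidth=5e9):
    """Toy estimate of total send time in seconds (all constants are assumptions)."""
    total = 0.0
    for size in message_sizes:
        total += latency + size / bandwidth
        if size > eager_limit:
            # Above the limit: assume one extra handshake round trip
            # before the data is transferred.
            total += 2 * latency
    return total


def buffer_memory(eager_limit, peers, buffers_per_peer=8):
    """Toy estimate of the eager buffer space reserved by the library, in bytes."""
    return eager_limit * peers * buffers_per_peer


def pick_eager_limit(message_sizes, candidates, peers, memory_budget):
    """Return the candidate with the lowest modelled time that fits the budget."""
    feasible = [c for c in candidates if buffer_memory(c, peers) <= memory_budget]
    if not feasible:
        raise ValueError("no candidate fits the memory budget")
    # Ties go to the smaller limit, since it costs less memory.
    return min(feasible, key=lambda c: (comm_time(message_sizes, c), c))


# Hypothetical message mix: many small, some medium, a few large messages.
sizes = [1024] * 1000 + [65536] * 200 + [1048576] * 5
limits = [4096, 16384, 65536, 262144, 1048576]
print(pick_eager_limit(sizes, limits, peers=1024, memory_budget=2**30))
```

With this message mix the model favours a limit that just covers the medium-sized messages. A tighter memory budget pushes the choice back down, which mirrors the dependence on message sizes described above.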

In addition to performance tuning, HPC centers are focusing more and more on energy efficiency. Besides techniques such as hot-water cooling, which reduces the energy spent on producing cold water, applications can be tuned to use less energy by selecting the right clock frequency for the processor. Tuning application energy efficiency is based on novel models that estimate energy consumption for different CPU frequencies and voltage settings. Evaluations show that, for selected applications, energy consumption can be reduced by as much as 40%.
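To illustrate why the best clock frequency depends on the application, here is a small Python sketch with a deliberately simple energy model. The model is an assumption and not the one developed in AutoTune: runtime is a frequency-independent memory part plus a compute part that scales with 1/f, power is a static part plus a term that grows with the cube of the frequency, and all constants are invented.

```python
def runtime(freq_ghz, compute_gcycles, memory_seconds):
    """Seconds: memory-bound time is assumed independent of the CPU clock."""
    return memory_seconds + compute_gcycles / freq_ghz


def power(freq_ghz, static_watts=40.0, dynamic_coeff=5.0):
    """Watts: static power plus a dynamic part assumed to grow with f**3."""
    return static_watts + dynamic_coeff * freq_ghz ** 3


def energy(freq_ghz, compute_gcycles, memory_seconds):
    """Joules: modelled power times modelled runtime."""
    return power(freq_ghz) * runtime(freq_ghz, compute_gcycles, memory_seconds)


def best_frequency(frequencies, compute_gcycles, memory_seconds):
    """Return the frequency with the lowest modelled energy."""
    return min(frequencies, key=lambda f: energy(f, compute_gcycles, memory_seconds))


freqs = [1.2, 1.6, 2.0, 2.4, 2.7]  # hypothetical available frequencies in GHz
compute_bound = best_frequency(freqs, compute_gcycles=100, memory_seconds=0)
memory_bound = best_frequency(freqs, compute_gcycles=100, memory_seconds=60)
print(compute_bound, memory_bound)
```

In this toy model the compute-bound case settles on a mid-range frequency, where static and dynamic energy balance, while the memory-bound case prefers the lowest frequency because a faster clock barely shortens the run.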

insideHPC: Which domains are you targeting? Any industries in particular?

Michael Gerndt: The diverse set of tuning aspects provided in AutoTune makes it applicable to a very broad range of application domains and industries. Our pool of applications includes bioinformatics applications relevant to medicine as well as industry codes from oil and gas companies and metal forming codes from the automotive industry.

insideHPC: What types of architectures are you working on? Distributed? SMP?

Michael Gerndt: In AutoTune we mainly target distributed-memory architectures with shared-memory nodes, including GPU-accelerated heterogeneous environments. We have successfully run tuning experiments on a range of machines, from supercomputers to computing clusters and commodity systems. Our main testing platform is the SuperMUC supercomputer at the Leibniz Supercomputing Centre (LRZ) in Garching near Munich, Germany.

insideHPC: Are there specific tools that you are using or promoting? Intel software tools? Others?

Michael Gerndt: AutoTune developed its own tool, the Periscope Tuning Framework (PTF), which is publicly available under the new BSD license. PTF has a plugin-based architecture and comprises the Periscope online analysis tool and several tuning plugins. Each plugin is specialized for a specific tuning aspect, for example compiler flag selection or energy efficiency.

PTF is highly flexible and extensible: new plugins can be developed on top of the Tuning Interface it provides, and the default set of performance properties can be extended with custom plugin properties when a particular tuning strategy needs them.
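The idea of a plugin-based tuner, with one plugin per tuning aspect, can be sketched in a few lines of Python. This does not reproduce PTF's Tuning Interface: the class names, the methods `scenarios` and `objective`, the function `run_plugin`, and the measurement values are all hypothetical and serve only to show the division of labour between framework and plugin.

```python
from abc import ABC, abstractmethod


class TuningPlugin(ABC):
    """Hypothetical plugin contract: propose scenarios, then score measurements."""

    @abstractmethod
    def scenarios(self):
        """Return the configurations this plugin wants evaluated."""

    @abstractmethod
    def objective(self, measurement):
        """Map a measurement dict to a cost (lower is better)."""


class CompilerFlagsPlugin(TuningPlugin):
    def scenarios(self):
        return [{"flags": f} for f in ("-O1", "-O2", "-O3")]

    def objective(self, measurement):
        return measurement["seconds"]


class EnergyPlugin(TuningPlugin):
    def scenarios(self):
        return [{"freq_ghz": f} for f in (1.2, 2.0, 2.7)]

    def objective(self, measurement):
        return measurement["joules"]


def run_plugin(plugin, execute):
    """Framework side: run each scenario via `execute` and keep the cheapest."""
    return min(plugin.scenarios(), key=lambda s: plugin.objective(execute(s)))


def fake_execute(scenario):
    # Made-up numbers standing in for a real instrumented run.
    if "flags" in scenario:
        return {"seconds": {"-O1": 12.0, "-O2": 9.5, "-O3": 10.1}[scenario["flags"]]}
    return {"joules": {1.2: 900.0, 2.0: 750.0, 2.7: 980.0}[scenario["freq_ghz"]]}


print(run_plugin(CompilerFlagsPlugin(), fake_execute))
print(run_plugin(EnergyPlugin(), fake_execute))
```

Adding a new tuning aspect in this sketch means writing one more subclass, while the framework loop stays unchanged.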

It is worth noting that PTF will receive continuous developer and user support through the AutoTune Demonstration Center (ADC). The ADC was established by the AutoTune partners to disseminate best practice guides for the AutoTune tools, to provide user training, and to further grow the PTF developer community.
