Modernizing Code with the Intel Vectorization Advisor

Threading and vectorizing an application are two techniques known to increase its performance on modern CPUs and coprocessors. However, deciding where to apply them, and rewriting portions of the application to take advantage of them, requires a deep understanding of the code. When the developer is not familiar with the code, an automated tool such as the Intel Vectorization Advisor can help.

Threading and vectorization together can increase the performance of an application more than either technique alone. Some benchmarks that can be both threaded and vectorized show a performance gain of over 10X across processors released from 2007 to 2014. There are several ways to add vectorization to an application:

  • Rely on the compiler
  • Add pragmas or directives to the application
  • Use libraries that are vectorized for certain functions
  • Hand code when applicable
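
As a rough illustration of the first two approaches in the list above, here is a minimal C sketch. The function names (saxpy_auto, dot_simd) are hypothetical and chosen only for this example. Whether either loop is actually vectorized depends on the compiler, its optimization level, and the target CPU, so treat this as a starting point to check with a vectorization report rather than a guaranteed result.

```c
#include <stddef.h>

/* Approach 1: rely on the compiler.
   A simple counted loop with no loop-carried dependence. `restrict`
   promises that x and y do not overlap, which removes one common
   obstacle to auto-vectorization. */
void saxpy_auto(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Approach 2: add a directive.
   The OpenMP `simd` construct asks the compiler to vectorize the loop,
   and the reduction clause tells it that `sum` may be accumulated in
   independent vector lanes. The pragma only takes effect when OpenMP
   SIMD support is enabled (for example -fopenmp-simd with GCC or Clang);
   otherwise it is ignored and the loop still computes the same result. */
float dot_simd(size_t n, const float *x, const float *y)
{
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

The other two approaches, vectorized libraries and hand coding, are left out here because they depend on a specific library or instruction set.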

Optimizing an application is usually an iterative process. A development cycle might alternate between two main activities, repeated as often as needed:

  • Profile and Diagnose – Build an optimized version, run the application, and look for hotspots.
  • Analyze and Advise – Look at the statistics for both CPU and memory access and rework parts of the code that are not optimal.
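
The "Profile and Diagnose" step can be approximated by hand when no profiler is available. The sketch below is a manual stand-in, not a description of how the Intel tools work: cpu_seconds, scale_job, and scale_array are hypothetical names invented for this example, and the standard clock() function only gives coarse CPU-time measurements.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: CPU seconds consumed by one call to fn(arg). */
double cpu_seconds(void (*fn)(void *), void *arg)
{
    clock_t start = clock();
    fn(arg);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Hypothetical candidate hotspot: scale an array in place. */
struct scale_job {
    float *data;
    size_t n;
    float factor;
};

void scale_array(void *p)
{
    struct scale_job *job = p;
    for (size_t i = 0; i < job->n; ++i)
        job->data[i] *= job->factor;
}
```

Wrapping each suspected region in a call such as cpu_seconds(scale_array, &job) and comparing the results gives a first, approximate ranking of hotspots. A dedicated profiler reports the same kind of information per loop without modifying the source.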

Tools are available to help the developer identify the hotspots and then point to the loops or sections of code that could benefit from further optimization. In the DL_MESO application, vectorizing the easy-to-identify hotspots improved performance by about 20%. In addition, some advisor tools can help estimate the performance gain achievable in specific areas of the application. It is important to use the available tools to assist with vectorization, as well as to understand where the application can be threaded. These techniques work in many cases and can produce faster run times, even with quite simple modifications.
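
To see why speeding up a single hotspot yields a bounded overall gain, a back-of-the-envelope estimate based on Amdahl's law can help. The sketch below is an assumption-laden illustration, not the method any particular advisor tool uses; overall_speedup is a hypothetical name, and the numbers in the note after it are made up rather than DL_MESO measurements.

```c
/* Amdahl's law: overall speedup when a fraction f of the run time
   (0 <= f <= 1) becomes s times faster (s > 0) and the rest of the
   program is unchanged. */
double overall_speedup(double f, double s)
{
    return 1.0 / ((1.0 - f) + f / s);
}
```

For example, if a loop accounts for 25% of the run time and vectorization makes it 4 times faster, overall_speedup(0.25, 4.0) is 1 / 0.8125, or roughly 1.23. The remaining 75% of the run time limits the total gain, which is why the profiling step matters before any code is rewritten.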

Source: STFC Daresbury, UK; Intel, Russia; Intel, USA
