Intel Compilers 18.0 Tune for AVX-512 ISA Extensions


Sponsored Post

The latest Intel® Xeon® Scalable processors feature extensions to the x86 instruction set for vectorization and other advanced optimizations. Vectorization – a single machine instruction operating on multiple data elements, or SIMD – can speed up applications in many fields, from scientific simulation and financial analysis to artificial intelligence, image processing, cryptography, and more. In fact, any problem that computes over large amounts of data can benefit from the Intel AVX-512 Instruction Set Architecture (ISA) supported by these processors.

However, these great new innovations in the hardware architecture are dormant unless the compiler you are using can generate optimized code that employs these new instructions.

That is why the latest Intel compilers and tuning software fully support the AVX-512 instructions.

With vector registers that are both wider and more numerous, the new instructions and added enhancements let the compiler squeeze more vector parallelism out of applications than before. Compiling with the -xCORE-AVX512 option generates an executable that uses the new instructions, but that executable will not run on non-Intel processors or on older Intel processors. The compilers also provide a way to generate a “fat binary” with alternative code paths, selected at run time, for processors that do not support AVX-512.
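As a minimal sketch of those options (assuming the Linux icc driver; the source file and function are placeholders), the same loop can be built either for AVX-512 only or as a fat binary with a runtime-selected fallback path:

/* saxpy.c – a simple loop the compiler can vectorize with AVX-512.
 *
 * AVX-512-only object code (requires an AVX-512-capable processor):
 *     icc -O3 -xCORE-AVX512 saxpy.c -c
 *
 * "Fat binary": an AVX-512 code path plus a baseline path chosen at run time:
 *     icc -O3 -axCORE-AVX512 saxpy.c -c
 */
void saxpy(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}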

Intel Compilers 18.0 support explicit programmer-controlled vectorization (and parallelization) using OpenMP* 4.0 and 4.5 simd directives in C/C++ and Fortran. The compilers also recognize certain programming patterns, or idioms, such as array compress and expand operations, which typically do not vectorize automatically. Using the new AVX-512 compress and expand instructions, the compilers can vectorize simple cases of these patterns on their own; in more complicated situations, highly optimized code can be generated when the programmer guides the compiler with explicit simd directives.
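The following is a minimal sketch of both mechanisms (function names are placeholders, and the pragma takes effect when the code is built with an option such as -qopenmp-simd or -qopenmp):

#include <math.h>

/* Explicit vectorization: the OpenMP simd directive tells the compiler
 * to execute the loop iterations as SIMD lanes. */
void scale(int n, float a, float *x)
{
    #pragma omp simd
    for (int i = 0; i < n; i++)
        x[i] = a * sqrtf(x[i]);
}

/* Array-compress idiom: only elements passing a test are stored, so the
 * output index advances by a data-dependent amount. On AVX-512 targets
 * the compiler may map simple cases like this to compress instructions. */
int compress_positive(int n, const float *in, float *out)
{
    int j = 0;
    for (int i = 0; i < n; i++)
        if (in[i] > 0.0f)
            out[j++] = in[i];
    return j;
}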

With the appropriate use of directives, programmers can identify other idioms in the code, such as histogram counting, conditional last-private loops, and loops with early exits, that are not easily vectorized or optimized. Guided this way, the compilers can now optimize these idioms and loops better than before. Several new simd extensions for C/C++ are also under consideration to take even greater advantage of AVX-512 through OpenMP and the C++ Parallel STL.
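For concreteness, here are hedged sketches of two of these idioms in plain C (names are placeholders); the histogram loop is a natural candidate for the AVX-512 conflict-detection instructions, and the search loop illustrates an early exit:

/* Histogram idiom: two SIMD lanes may hit the same bin, so a naive
 * vectorization would lose updates; conflict detection lets the compiler
 * handle the collisions and still vectorize. */
void histogram(int n, const int *bin, int *hist)
{
    for (int i = 0; i < n; i++)
        hist[bin[i]]++;
}

/* Early-exit (search) loop: returns the index of the first match. */
int find_first(int n, const float *x, float key)
{
    for (int i = 0; i < n; i++)
        if (x[i] == key)
            return i;
    return -1;
}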

The Intel Compilers 18.0 are an integral part of Intel Parallel Studio XE 2018, which also includes tools for performance analysis and tuning, along with highly optimized math libraries that support the new instruction sets. You can use Intel Advisor, one of the sophisticated analysis tools in Intel Parallel Studio XE 2018, to pinpoint where, and how, AVX-512 features can improve your application's performance.
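As a rough sketch of that workflow (assuming the advixe-cl command-line driver that ships with Intel Parallel Studio XE 2018; option spellings can vary between versions, and the application name is a placeholder):

advixe-cl -collect survey -project-dir ./advisor_results -- ./myapp
advixe-cl -report survey -project-dir ./advisor_results

The survey report shows which loops were vectorized and with which instruction set, and offers recommendations; the same results can be explored in the Advisor GUI.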

Developers concerned about application performance will want to keep up with the latest processor architectures, and to adapt their codes to future platforms as they become available. But keeping up is a huge challenge. The performance to be gained does not come automatically. In fact, some applications may run without any speedup, or even run slower on the new processors without the appropriate choice of compiler options and directives. The good news is that the compilers and analysis tools in Intel Parallel Studio XE 2018 help you meet that challenge.

Download your free 30-day trial of Intel® Parallel Studio XE 2018