Intel's OpenCL Autovectorization Boosts Performance without User Intervention

Intel’s Nadav Rotem writes that the company’s newly released OpenCL SDK version 1.5 features one improvement that is very important but not always visible to the user: the new Implicit CPU Vectorization module.

What are the benefits of the Implicit CPU Vectorization module? SIMD instructions expose a high level of parallelism and are used to accelerate data-parallel applications in many domains. The 2nd Generation Intel® Core Processor Family, codenamed “Sandy Bridge”, features the Intel® AVX instruction set, which provides 8-wide single-precision floating-point SIMD processing. Applications that take advantage of these SIMD instructions can run up to 8x faster. For example, the Intel® AVX instruction “vaddps” adds 8 floating-point numbers in parallel. The Implicit CPU Vectorization module seamlessly compiles your OpenCL kernels to make full use of this 8-wide floating-point SIMD hardware, boosting the performance of user code without any user intervention.
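To make the idea concrete, here is a minimal data-parallel OpenCL kernel (a hypothetical illustration, not taken from Intel’s post) of the kind the implicit vectorizer targets. Each work-item adds a single pair of floats; the vectorization module can pack 8 consecutive work-items into one 8-wide AVX operation such as vaddps, with no change to the kernel source:

    __kernel void vector_add(__global const float *a,
                             __global const float *b,
                             __global float *c)
    {
        /* One scalar addition per work-item as written; the implicit
           vectorizer may merge 8 consecutive work-items so that a
           single AVX instruction (e.g. vaddps) handles all of them. */
        size_t i = get_global_id(0);
        c[i] = a[i] + b[i];
    }

Without the module, each work-item would typically map to scalar instructions; with it, the same kernel can stride through the arrays 8 floats at a time on a Sandy Bridge CPU.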

Read the Full Story.