Over at Dr. Dobb's, Rob Farber continues his series of tutorials on OpenACC with an introduction to parallel regions and how the gang, worker, and vector clauses affect the execution model.
In a nutshell, OpenACC parallel regions are useful because they let programmers annotate code in a style that is conceptually very similar to OpenMP. Kernels regions instead let the compiler automatically generate CUDA-style kernels. The gang, worker, and vector clauses then give advanced programmers the ability to express any CUDA kernel launch configuration using portable, directive-based OpenACC syntax. Engineers at PGI have written an excellent description of the difference between the OpenACC parallel and kernels constructs and how they map to NVIDIA devices.
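To make the distinction concrete, here is a minimal sketch of the same SAXPY loop written both ways. It is not taken from Farber's tutorial: the function names `saxpy_parallel` and `saxpy_kernels` and the vector length of 128 are assumptions chosen purely for illustration, and how gangs and vectors actually map to CUDA blocks and threads depends on the compiler and the target device.

```c
#include <stddef.h>

/* Illustrative only: function names and vector_length(128) are assumptions,
 * not values taken from the tutorial. */

/* parallel construct: the programmer asserts that the loop is safe to run
 * in parallel and spells out how iterations are spread across gangs and
 * vector lanes, much like annotating a loop in OpenMP. */
void saxpy_parallel(int n, float a, const float *restrict x, float *restrict y)
{
    #pragma acc parallel loop gang vector vector_length(128) \
        copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* kernels construct: the compiler analyzes the region and decides how to
 * turn the loop into one or more device kernels. The independent clause
 * tells it that the iterations carry no dependences. */
void saxpy_kernels(int n, float a, const float *restrict x, float *restrict y)
{
    #pragma acc kernels loop independent copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

A compiler without OpenACC support simply ignores the pragmas and runs both loops serially, so the results are the same either way. To actually offload the loops, OpenACC has to be enabled at build time, for example with `-fopenacc` in GCC or `-acc` in the PGI/NVIDIA compilers.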
Read the Full Story.