Michael Wolfe from the Portland Group has posted a technical article on Optimizing Data Movement in the PGI Accelerator Programming Model. Introduced in 2009, this directive-based model targets NVIDIA GPUs.
Tuning and optimizing data traffic between the host and GPU has been, and continues to be, important to achieving maximum benefit from the massive performance of the GPU. This is true regardless of which model you use to program the GPU. We have been adding features to the PGI Accelerator programming model and the compilers themselves to allow you to manage and tune the data traffic more carefully. For PGI Accelerator Fortran users, the mirror and reflected directives let you effectively extend the range of a data region across procedure boundaries. The reflected directive will be available in PGI Accelerator C later this year. You can also combine CUDA Fortran extensions with PGI Accelerator Fortran to manually control data allocation on the GPU and data movement between the GPU and host while preserving the productivity benefits of PGI Accelerator directives for programming. You will soon be able to do the same for PGI Accelerator C as well.
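To make the idea of a data region concrete, here is a minimal C sketch of keeping an array resident on the GPU across several compute regions, so it crosses the host/GPU link once in each direction rather than once per kernel launch. It is not taken from Wolfe's article: the function name `relax` and the update formula are hypothetical, and the directive spellings (`data region`, `region`, `for`, and the `x[0:n-1]` bounds notation) are written from memory of the PGI Accelerator model and should be checked against the compiler documentation. A compiler without accelerator support simply ignores the pragmas and runs the loops on the host.

```c
#include <assert.h>

/* Hypothetical example: apply x[i] = 0.5*x[i] + 1 to every element,
 * `iters` times in a row. Directive spellings are assumptions based on
 * the PGI Accelerator model and may differ between compiler releases. */
void relax(float *restrict x, int n, int iters)
{
    assert(n > 0 && iters >= 0);

    /* Data region: x is intended to be copied to the GPU once on entry
     * and copied back to the host once on exit. */
    #pragma acc data region copy(x[0:n-1])
    {
        for (int it = 0; it < iters; ++it) {
            /* Compute region: intended to reuse the device copy of x,
             * so no host/GPU transfer happens per iteration. */
            #pragma acc region
            {
                #pragma acc for
                for (int i = 0; i < n; ++i)
                    x[i] = 0.5f * x[i] + 1.0f;
            }
        }
    }
}
```

Without the enclosing data region, each compute region would typically move `x` to the GPU and back on every pass of the outer loop, which is the kind of traffic the excerpt above is about reducing.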
Read the Full Story.