In this video from SC17 in Denver, Michael Klemm from the OpenMP ARB describes how the OpenMP programming community is moving forward to new levels of scalable performance. The OpenMP Architecture Review Board (ARB) is seeking feedback on the newly released Technical Report 6, the second preview for the future OpenMP API, version 5.0.
“Technical Report 6 demonstrates the importance of user feedback to the OpenMP specification,” says Bronis R. de Supinski, the Chair of the OpenMP Language Committee. “Users have indicated that several features are vitally important to them, such as multilevel memory support, deep copy, easy access to unified shared memory and a descriptive loop construct. As a result of that feedback, OpenMP 5.0 will include all of these major additions.”
New features in TR6 include:
- Support for multilevel memory systems: OpenMP 5.0 will include memory allocation mechanisms that intuitively place data in different kinds of memories;
- Support for deep copy of complex data structures: TR6 adds user-defined mappers, a composable mechanism that easily allows object-oriented data structures to be copied correctly to accelerator devices;
- Additional enhancements to OpenMP device constructs: Other major additions to OpenMP support of accelerators include a mechanism to require unified shared memory support, the ability to use device-specific function implementations, better control of implicit data mappings and the ability to override device offload at runtime;
- Support for descriptive loop optimizations: The concurrent construct asserts that all iterations of the associated loop nest may be executed concurrently in any order, which will enable many implementation-specific compiler optimizations;
- Support for improved debugging of OpenMP applications: TR6 adds OMPD, a third-party tool interface that enables intuitive debugging of OpenMP code and complements OMPT, the first-party tool interface that was added in TR4 to support deeper analysis of OpenMP performance;
- New forms of task dependencies: New mechanisms support (1) task sets that require mutual exclusion within the set and ordering with respect to other tasks and (2) tasks with dynamically determined dependency sets;
- Greater memory model flexibility: OpenMP 5.0 will include support for acquire/release semantics, which will allow optimization of low-level memory synchronization activities.
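To make the "task sets that require mutual exclusion" item more concrete, here is a minimal sketch in C. It assumes the `mutexinoutset` dependence type as spelled in the published OpenMP 5.0 specification; the TR6 preview syntax may differ, and the function name `sum_with_mutex_tasks` is purely illustrative. The tasks that update `total` may run in any order but never at the same time, and the final task waits for all of them.

```c
#include <assert.h>

/* Illustrative only: sums 1..n with one task per term.
   Assumes OpenMP 5.0 syntax; if OpenMP is disabled the pragmas are
   ignored and the code runs serially with the same result. */
long sum_with_mutex_tasks(int n)
{
    assert(n >= 0);
    long total = 0;
    long result = -1;

    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 1; i <= n; ++i) {
            /* Tasks in this set exclude one another on `total`,
               but have no ordering among themselves. */
            #pragma omp task depend(mutexinoutset: total) firstprivate(i) shared(total)
            total += i;
        }

        /* Ordered after every task in the mutually exclusive set. */
        #pragma omp task depend(in: total) shared(total, result)
        result = total;

        #pragma omp taskwait
    }
    return result;
}
```

Calling `sum_with_mutex_tasks(100)` returns 5050 whether the code is built with or without OpenMP support, since the dependences only constrain scheduling.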
TR6 also improves several features added in TR4 such as OMPT and task reductions.
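For readers unfamiliar with task reductions, the following is a rough sketch of what one looks like in C. It uses the `task_reduction` and `in_reduction` clauses as they appear in the published OpenMP 5.0 specification, which is an assumption about the final syntax rather than a quotation from TR6; the function name `sum_with_task_reduction` is hypothetical.

```c
#include <assert.h>

/* Illustrative only: adds up an array with one task per element.
   Each participating task accumulates into a private copy of `sum`;
   the copies are combined when the taskgroup ends. */
long sum_with_task_reduction(const int *values, int n)
{
    assert(n >= 0);
    long sum = 0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp taskgroup task_reduction(+: sum)
        {
            for (int i = 0; i < n; ++i) {
                #pragma omp task in_reduction(+: sum) firstprivate(i)
                sum += values[i];
            }
        }
        /* All tasks have finished and `sum` holds the combined value here. */
    }
    return sum;
}
```

Because the reduction removes the need for explicit locking around `sum`, the task bodies stay as simple as their serial equivalent.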