Clacc – Open Source OpenACC Compiler and Source Code Translation Project

By Rob Farber, contributing writer for the Exascale Computing Project

Clacc is a Software Technology development effort funded by the US Exascale Computing Project (ECP) PROTEAS-TUNE project to develop production OpenACC compiler support for Clang and the LLVM Compiler Infrastructure Project (LLVM). The Clacc project page notes, “OpenACC support in Clang and LLVM will facilitate the programming of GPUs and other accelerators in DOE applications, and it will provide a popular compiler platform on which to perform research and development for related optimizations and tools (e.g., static analyzers, debuggers, editor extensions).” [i] OpenACC continues to be the second most popular programming model for GPUs on the ORNL Summit supercomputer.

Joel Denny, Computer Scientist at ORNL and member of the ECP Software Technology Development Tools team, observed that “When the Clacc project began, NVIDIA was the dominant OpenACC compiler vendor. The Clacc project was initiated to provide the HPC and scientific communities with a new, production quality, open source OpenACC compiler option.” Denny further observed that “there has been a strong push in DOE toward LLVM. It makes sense to utilize that ecosystem to support DOE and the OpenACC users.” Currently, the Clacc project is focused on feature completeness. Even though compiler-based performance optimizations are not a current focus, preliminary benchmark results show Clacc can deliver acceptable GPU performance.

OpenACC is a relatively new programming standard, launched in 2011, that provides a portable, directive-based programming model for the C, C++, and Fortran languages. Jointly developed by Cray, CAPS, NVIDIA, and PGI, the OpenACC standard is designed to simplify parallel programming of heterogeneous CPU/GPU systems. [iii] [iv] The OpenACC organization notes that a goal of OpenACC is to help the research and developer communities advance science by expanding their accelerated and parallel computing skills. [v] Adding OpenACC support to the open source Clang and LLVM projects, described next, leverages the extensive effort these projects have invested over the past several years in a production-quality, open source parallel compiler and runtime system for the OpenMP standard.

There is a natural synergy between the OpenACC and OpenMP compiler front ends and runtime systems. The two standards differ in their details, but both are directive-based: programmers annotate their code with compiler directives (pragmas in C and C++) to create applications that exploit the parallel capabilities of multicore CPUs and massively parallel accelerators such as GPUs. The Clacc project highlights the generality of Clang and LLVM, because compiler writers can leverage the work of others when supporting either or both of the OpenACC and OpenMP programming standards.
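To make the similarity concrete, the following minimal sketch annotates the same loop once with an OpenACC directive and once with an OpenMP directive. The function names (saxpy_acc, saxpy_omp) and the choice of data clauses are illustrative assumptions, and the OpenMP form is only one common hand-written counterpart, not something prescribed by either standard. A compiler without OpenACC or OpenMP support enabled simply ignores the pragmas and runs the loops serially.

```c
#include <assert.h>

/* OpenACC: the directive asks the compiler to run the loop in parallel,
 * typically on an accelerator, copying x in and y in and out. */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    assert(x && y);
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* OpenMP: the same loop with a hand-written offload directive.
 * (Illustrative counterpart only; other directive combinations are valid.) */
void saxpy_omp(int n, float a, const float *x, float *y)
{
    assert(x && y);
    #pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

In both versions the loop body is untouched; only the directive differs, which is why much of the compiler front end and runtime work can be shared between the two models.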

Leveraging the LLVM Compiler and Toolchain Technologies

The LLVM Compiler Infrastructure Project (LLVM) is an open source collection of compiler and toolchain technologies. Doug Kothe, Director of the US Department of Energy’s (DOE’s) Exascale Computing Project, believes “LLVM compiler technology is becoming the nexus for vendor and community compiler development and evolution.”

LLVM is becoming so prevalent that Johannes Doerfert, a researcher at Argonne National Laboratory, observes, “Numerous companies and organizations are collaborating on LLVM, which is one of the many benefits of using the LLVM compiler infrastructure. LLVM-based compilers from the system manufacturers are common throughout the HPC community.” He adds, “Improvements in collaboration as well as improvements to LLVM benefit the entire HPC community, including system manufacturers, software suppliers and the end users.”

The PROTEAS-TUNE project complements and collaborates with the SOLLVE project, [vi],[vii] which is an ECP effort focused on standardizing HPC features in OpenMP and developing an efficient, portable, and complete implementation in the LLVM compiler framework. [viii]

These projects are possible because of the extremely permissive terms of the LLVM license, which lets the HPC community create and release software built on the LLVM compiler infrastructure. This includes profilers, parallel compilers, debuggers, domain-specific languages (DSLs), and new programming models. It also means that the HPC community does not have to file a bug report with a compiler or hardware vendor and then wait and hope for a fix. Instead, HPC developers can find bugs and submit fixes to the open source code base themselves.

Leveraging the Clang Compiler Front End to Support OpenACC

Clang is a compiler front end for the C family of computer languages, which includes both C and C++. Clang fully supports the OpenMP 4.5 standard. The ECP-funded SOLLVE project is working to bring the features of the OpenMP 5.1 specification to LLVM-based compilers. Analogous to OpenACC, the OpenMP 5.1 specification is intended to strengthen and optimize features that support the handling of accelerators such as GPUs. [ix]

Clacc Provides Two Paths to an Executable Binary

Figure 1: Two paths to create an executable (source: https://csmd.ornl.gov/project/clacc)

A key feature of the Clacc design is that it translates OpenACC to OpenMP, which leverages the extensive effort already invested in the LLVM OpenMP compiler and runtime support over the past several years. This design gives two paths to an executable, illustrated in Figure 1. Depending on what the programmer wants, Clacc can follow a direct path from source code to the LLVM intermediate representation (CodeGen), or it can be used as a source-to-source translator (RewriteOpenACC) that converts the OpenACC code to OpenMP source code.

There are benefits to both modes:

  • CodeGen: When selected, Clacc translates the OpenACC source directly to a binary executable, similar to the behavior of the NVIDIA and GCC compilers. The programmer sees only the resulting binary and is not exposed to OpenMP, which Clacc uses internally as an intermediate representation.
  • RewriteOpenACC: In this mode, Clacc translates OpenACC source to OpenMP source, which is then compiled with an OpenMP compiler to generate the executable. This mode has several potential use cases. It is intended for targeting OpenMP compilers and tools other than upstream Clang, and for porting applications to OpenMP. To better serve these use cases, the source-to-source translation avoids the preprocessor expansions and the loss of comments and formatting that sometimes occur when translating C-like languages.

According to Denny, Clacc currently takes a straightforward approach when mapping OpenACC code to OpenMP, whether for internal Clacc use or for later compilation by an OpenMP compiler. He explains that the three levels of OpenACC parallelism (gang, worker, and vector lane) are mapped to their OpenMP equivalents (teams, threads, and SIMD lanes, respectively). Denny notes that alternative approaches are also being explored.
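The mapping Denny describes can be sketched on a three-level loop nest. The example below is hand-written for illustration and is not output generated by Clacc; the function names, array dimensions, and data clauses are assumptions. The first function marks the loops with OpenACC gang, worker, and vector clauses, and the second expresses the same nest with OpenMP teams, threads, and SIMD lanes.

```c
#include <assert.h>

enum { NI = 4, NJ = 8, NK = 16 };   /* illustrative array dimensions */

/* OpenACC: gang -> outer loop, worker -> middle loop, vector -> inner loop. */
void scale_acc(float *a, float s)
{
    assert(a);
    #pragma acc parallel loop gang copy(a[0:NI*NJ*NK])
    for (int i = 0; i < NI; ++i) {
        #pragma acc loop worker
        for (int j = 0; j < NJ; ++j) {
            #pragma acc loop vector
            for (int k = 0; k < NK; ++k)
                a[(i * NJ + j) * NK + k] *= s;
        }
    }
}

/* OpenMP: teams -> outer loop, threads -> middle loop, SIMD lanes -> inner loop.
 * (Hand-written counterpart; the directives Clacc emits may differ.) */
void scale_omp(float *a, float s)
{
    assert(a);
    #pragma omp target teams distribute map(tofrom: a[0:NI*NJ*NK])
    for (int i = 0; i < NI; ++i) {
        #pragma omp parallel for
        for (int j = 0; j < NJ; ++j) {
            #pragma omp simd
            for (int k = 0; k < NK; ++k)
                a[(i * NJ + j) * NK + k] *= s;
        }
    }
}
```

Both functions compute the same result; when the directives are not enabled at compile time, the pragmas are ignored and the loops run serially.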

Leveraging Profiling Infrastructure

Profiling support is a requirement for any parallel programming model, particularly one being developed for the various hardware platforms that are used, or slated to be used, in the DOE complex.

Clacc supports the OpenACC Profiling Interface, a critical component of the OpenACC specification that standardizes an interface that profiling tools and libraries can depend upon across OpenACC implementations. [x] Profiling data exposed through this interface can be gathered and viewed with powerful tools such as the Tuning and Analysis Utilities (TAU) Performance System; [xi] TAU is also funded by ECP’s PROTEAS-TUNE project.

A 2020 IEEE paper [xii] by the Clacc and TAU teams discusses Clacc profiling support for OpenACC in greater detail and presents example visualizations for several SPEC ACCEL OpenACC benchmarks with the TAU performance tool. The paper reports that the associated performance overhead is negligible.

TAU gives HPC programmers the ability to measure performance and see bottlenecks via one profiling tool that works well across a broad spectrum of widely differing HPC systems, architectures, languages, and software/hardware execution models.

Figure 2: The TAU profiling system. (Source: https://www.alcf.anl.gov/sites/default/files/2020-05/CompWorkshop_TAU_2020.pdf.)

TAU can be installed via Spack and is distributed in the Extreme-Scale Scientific Software Stack (E4S). Install TAU with the back end that matches the target platform, such as CUDA, ROCm, or Level Zero (L0) for oneAPI.

Summary

The OpenACC directive-based programming model is designed to provide a simple yet powerful way to program accelerators without significant programming effort. The Clacc project is working to bring an open source OpenACC compiler and source code translation capability to the HPC and scientific communities.

Rob Farber is a global technology consultant and author with an extensive background in HPC and in developing machine learning technology that he applies at national laboratories and commercial organizations. Rob can be reached at info@techenablement.com.

[i] https://csmd.ornl.gov/project/clacc

[ii] As of 3/1/2021

[iii] https://en.wikipedia.org/wiki/OpenACC

[iv] https://www.elsevier.com/books/parallel-programming-with-openacc/farber/978-0-12-410397-9

[v] https://www.openacc.org/

[vi] https://www.exascaleproject.org/wp-content/uploads/2020/02/ECP_ST_SOLLVE.pdf

[vii] https://github.com/SOLLVE/llvm-project

[viii] https://www.exascaleproject.org/highlight/sollve-openmp-for-hpc-and-exascale/

[ix] https://www.openmp.org/

[x] https://ieeexplore.ieee.org/abstract/document/9308080

[xi] https://www.exascaleproject.org/highlight/ecp-provides-tau-a-cpu-gpu-mpi-profiler-for-all-hpc-and-exascale-machines/

[xii] OpenACC Profiling Support for Clang and LLVM using Clacc and TAU, Camille Coti, Joel E. Denny, Kevin Huck, Seyong Lee, Allen D. Malony, Sameer Shende, and Jeffrey S. Vetter, ProTools, GA, USA (November 2020)