Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from March 19 to April 1, 2021.

We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.

Industry News and Conference Talks

LLVM and Clang

Discussions

  • Discussion on the ‘Abstracting over SSA form IRs to implement generic analyses’ RFC has seen some new activity. Sameer Sahasrabuddhe shared their perspective and identified the main issue: LLVM IR/MIR basic blocks do not explicitly track their successors and predecessors. Nicolai Hähnle clarified which decisions are most important for moving the proposal forward. In addition, Nicolai noted that changing the in-memory representation of basic blocks to contain predecessor and successor vectors would allow terminator instructions to refer to those vectors, and could potentially reduce memory usage.
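
To make the idea in that last point concrete, here is a minimal C++ sketch of a block that owns explicit predecessor and successor vectors, with a terminator that refers to slots in its parent's successor vector instead of storing its own block pointers. All names here (Block, Terminator, addEdge) are hypothetical and chosen for illustration only. This is not LLVM's BasicBlock API, and the RFC thread may settle on a different design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not LLVM's BasicBlock API.
struct Block {
  int id;
  std::vector<Block *> succs; // explicit successor list owned by the block
  std::vector<Block *> preds; // explicit predecessor list owned by the block
};

// A terminator that names its targets by slot in the parent's successor
// vector instead of holding its own block pointers.
struct Terminator {
  Block *parent;
  std::vector<std::size_t> targetSlots;

  Block *target(std::size_t i) const {
    assert(i < targetSlots.size() && "target index out of range");
    return parent->succs[targetSlots[i]];
  }
};

// Record the edge on both endpoints and return the successor slot in `from`.
inline std::size_t addEdge(Block &from, Block &to) {
  from.succs.push_back(&to);
  to.preds.push_back(&from);
  return from.succs.size() - 1;
}
```

With this layout, a generic analysis can walk `succs` and `preds` directly without inspecting the terminator, which is the kind of uniform CFG access the discussion is about. Whether it actually saves memory depends on details the sketch does not model.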

Commits

MLIR

Discussions

Commits

  • Conversion to NVVM/ROCDL now uses a data layout entry to specify the bitwidth of the index type.

OpenMP (Target Offloading)

Discussions

  • Nader Al Awar asked about using the -fembed-bitcode Clang option with OpenMP target offload for CUDA. There are no replies as of this writing.
  • Asynchronous offloading bugs were discovered and are being discussed on the mailing list and the bugtracker.
  • The device runtime for LLVM 12 shows performance regressions ([1] and [2]) that will be addressed in the 12.1 release.
  • A rewrite of the device runtime is currently being tested. The first results look promising with regard to performance and memory usage.
  • Issues with Clang’s device code generation were detected ([1], [2]) and will be resolved soon.

Commits

External Compilers

LLPC

Mesa

SYCL