Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from February 19 to March 4, 2021.

We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.

Industry News and Conference Talks

LLVM and Clang

Discussions

  • Konrad Trifunovic of Intel proposed upstreaming a SPIR-V backend for LLVM. The implementation would be based primarily on GlobalISel and produce the kernel flavor of SPIR-V (for OpenCL), with the possibility of later extending it to the shader flavor (for Vulkan). A long discussion followed the RFC, mostly revolving around the question of whether this should be a new LLVM backend or be implemented on top of MLIR, and how to eventually unify the two to avoid duplication. The existing SPIR-V support in MLIR mostly targets the shader flavor, although there is community interest in, and contributions toward, growing support for the kernel flavor too. The big hurdle for reusing the MLIR implementation is that it's not currently possible to emit MLIR directly from the LLVM infrastructure and Clang.
  • Sebastian Neubauer of AMD described the current state of register spilling, function calls, and related problems in SIMT targets such as AMDGPU. The problems start with LLVM IR expressing a single thread of execution rather than multiple threads executing the same instructions in lockstep. In Machine IR, the multiple execution threads are represented only implicitly. This causes issues for operations that involve more than a single vector lane. Sebastian suggests that the long-term solution for some of these problems would be to track the live ranges of VGPRs in other lanes.
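
To make the cross-lane issue a bit more concrete, here is a small host-side C++ sketch. It is only an illustration, not how AMDGPU or LLVM models registers: the four-lane wave, the `VectorReg` and `ExecMask` types, and both helper functions are hypothetical names invented for this example. It shows that a per-lane operation only touches a lane's own value, while a cross-lane read makes an active lane depend on the register contents of a lane that may be inactive.

```cpp
#include <array>
#include <cstddef>

// Hypothetical tiny "wave" of 4 lanes; real hardware waves are much wider.
constexpr std::size_t kLanes = 4;

// One vector register holds one value per lane (illustrative model only).
using VectorReg = std::array<int, kLanes>;
// Execution mask: true means the lane is currently active.
using ExecMask = std::array<bool, kLanes>;

// Per-lane operation: each active lane only reads and writes its own slot,
// which matches the single-thread view that the IR expresses.
VectorReg add_one(const VectorReg &v, const ExecMask &exec) {
  VectorReg out = v;
  for (std::size_t lane = 0; lane < kLanes; ++lane)
    if (exec[lane])
      out[lane] = v[lane] + 1;
  return out;
}

// Cross-lane operation: each active lane reads the value held by the next
// lane, whether or not that neighbouring lane is active.
VectorReg read_next_lane(const VectorReg &v, const ExecMask &exec) {
  VectorReg out = v;
  for (std::size_t lane = 0; lane < kLanes; ++lane)
    if (exec[lane])
      out[lane] = v[(lane + 1) % kLanes];
  return out;
}
```

In this model, the result of `read_next_lane` for an active lane depends on the value stored in a neighbouring lane that may be inactive, so that value is still live even though no active thread "owns" it. A per-thread view of liveness has no way to express this, which is one way to read the motivation for tracking live ranges across lanes.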

Commits

MLIR

Discussions

Commits

  • A few patches landed into the SPIR-V dialect to improve op naming consistency.

OpenMP (Target Offloading)

Discussions

  • We are working towards the optimization of “globalized” locals in OpenMP target regions (D97680). This is intended to deliver -fopenmp-cuda-mode performance while preserving OpenMP semantics.
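
As a rough illustration of what a "globalized" local can look like, consider the sketch below. The function `count_even` and its parameters are hypothetical and are not taken from D97680. The variable `hits` is local to the target region but is shared with the threads of the nested parallel region, so a conforming implementation cannot simply keep it in one thread's private stack; -fopenmp-cuda-mode, by contrast, assumes locals are not shared among threads.

```cpp
// Hypothetical example: counts the even values in data[0..n).
// Without OpenMP enabled the pragmas are ignored and the code runs serially.
int count_even(const int *data, int n) {
  int result = 0;
#pragma omp target map(to : data[0:n]) map(tofrom : result)
  {
    // 'hits' is declared by the initial thread of the target region, but the
    // worker threads of the parallel region below must also be able to
    // reach it. This is the kind of local that may need to be "globalized".
    int hits = 0;
#pragma omp parallel for reduction(+ : hits)
    for (int i = 0; i < n; ++i)
      if (data[i] % 2 == 0)
        hits += 1;
    result = hits;
  }
  return result;
}
```

If a compiler can prove that such a local is never actually accessed by another thread, it can keep the variable in thread-private storage, which is the kind of saving the optimization work above is after.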

Commits

External Compilers

LLPC

Mesa

SYCL