Issue #13
Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from May 21 to June 3 2021.
We welcome your feedback and suggestions. Let us know if we missed anything interesting, or want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.
Industry News and Conference Talks
- LLVM-CTH: The Second Workshop on LLVM Compiler and Tools for HPC features multiple GPU centric talks.
LLVM and Clang
Discussions
- Sameer Sahasrabuddhe posted an RFC for making function calls
convergent
by default. On GPU targets, “A convergent operation involves inter-thread communication or synchronization that occurs outside of the memory model, where the set of threads which participate in communication is implicitly affected by control flow”. The RFC proposed to conservatively make callsconvergent
by default, and to add thenoconvergent
attribute to indicate that a call is not sensitive to the set of threads executing any dynamic instance of that call. This way, “Frontends are no longer required to add aconvergent
ornoconvergent
attribute if all they seek is correct execution on GPUs”. - ggllvm is seeking help with OpenCL kernel compilation for AMDGPU on Discourse. The only suggestion so far did not resolve their problem.
Commits
- CUDA/HIP compilation now defaults to C++14 instead of C++98. This change makes it match the default C++ standard version used by Clang and nvcc, while GCC already defaults to C++17.
- HIP now supports ThinLTO compilation for the AMDGPU target. This is controlled by new Clang flags:
-[no-]offload-lto
and-foffload-lto=[thin,full]
. - HIP device library detection has been fixed for the Spack package manager.
MLIR
Discussions
- Thomas Raoux is asking about using the new
GPU_MMAMatrix
type with Standard ops without having the Standard dialect depend on the GPU dialect. Geoffrey Martin-Noble pointed out that “standard ops only operate on (a subset of) builtin types, but perhaps we could consider turningShapedType
into an interface”.
Commits
- GPU MMA store ops are relaxed to allow chain of MMA ops.
- SPIR-V dialect starts to see better assembly result names.
spv.module
now hasSingleBlock
andNoTerminator
traits;spv.endmodule
is retired.
OpenMP (Target Offloading)
Discussions
- An OpenMP offloading buildbot is available in the LLVM buildbot staging area.
- OMPD upstreaming continues with discussions: D100181.
- The patchsets for reworked globalization (D97680 and following) and custom state machines (D101976 and following) are almost ready.
- AMD are reworking the
__shared__
/__local
implementation for higher efficiency. - Linking of static archives with offload code is being worked on again: D93525.
Commits
- We are replacing libelf with the ELF facilities in LLVM core to cut down on dependencies and work towards offloading on non-Linux platforms: D103545.
- Multiple refactoring commits to the amdgpu plugin, mainly removing global state (D103600, D103509, …).
- Various bug fixes: D103642, D103646.
External Compilers
LLPC
- The number of allowed InstCombine iterations was tightened. This improves compilation times by 3% on average.
Mesa
- llvmpipe, a software OpenGL renderer implementation, dropped support for old LLVM versions and now requires at least LLVM 3.9.