Issue #40
Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from August 6 to August 19, 2022.
We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.
Industry News and Community Events
LLVM and Clang
Discussions
- Sameer Sahasrabuddhe gave an overview of his Uniformity Analysis RFC during the LLVM GPU Working Group Meeting on August 19. The presentation was recorded. Sameer is looking for reviewers and clarified that ‘the current intent of the RFC is to invite attention to how we are defining convergence and uniformity in irreducible CFGs’.
Commits
- Added support for capabilities and extensions to the SPIR-V backend. D131221
- A solver for non-trivial `SchedGroup` pipelines was added to the AMDGPU backend. The solver consists of an exponential-time exact algorithm and a greedy algorithm, with user-controlled knobs to select between the two. D130797
MLIR
Discussions
- Jakub Kuderski posted an RFC on adding an integer add-with-carry op to the `arith` dialect. The goal is to support emulating 64-bit operations with 32-bit ones when lowering to both SPIR-V and LLVM dialects. Most mobile GPUs are examples of targets without 64-bit instructions. The initial set of patches landed. D131893, D131908
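To illustrate the idea behind the RFC, here is a minimal C++ sketch of how a 64-bit addition can be emulated with 32-bit additions plus a carry bit. The names `addWithCarry32`, `U64Parts`, and `add64` are hypothetical and chosen for this example only. They are not the op or API proposed in the RFC, which works at the MLIR level.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: 32-bit add that also reports the carry-out bit,
// mirroring the (sum, carry) result pair of an add-with-carry operation.
std::pair<uint32_t, uint32_t> addWithCarry32(uint32_t a, uint32_t b) {
  uint32_t sum = a + b;                 // unsigned add wraps modulo 2^32
  uint32_t carry = (sum < a) ? 1u : 0u; // wrapped iff the sum is below an operand
  return std::make_pair(sum, carry);
}

// A 64-bit value represented as two 32-bit halves (illustrative type).
struct U64Parts {
  uint32_t lo;
  uint32_t hi;
};

// Emulated 64-bit add: add the low halves, then feed the carry into the
// sum of the high halves. Overflow out of the high half is discarded,
// matching ordinary wrapping 64-bit arithmetic.
U64Parts add64(U64Parts x, U64Parts y) {
  std::pair<uint32_t, uint32_t> low = addWithCarry32(x.lo, y.lo);
  uint32_t hi = x.hi + y.hi + low.second;
  return U64Parts{low.first, hi};
}
```

For instance, adding 1 to a value whose low half is `0xFFFFFFFF` produces a low half of 0 and bumps the high half by 1, which is the carry propagation a target without 64-bit instructions has to perform explicitly.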
Commits
- Added support for Intel joint matrix ops to the SPIR-V dialect. D131586
- Made SPIR-V conversion passes `OperationPass`es, so that downstream compilers can put them in a nested pass manager. D131591
- Refactored SPIR-V memory space mappings to better serve multiple client APIs (incl. Vulkan and OpenCL). D131409, D131410, D131411
OpenMP (Target Offloading)
Discussions
- Tom Deakin noticed a segfault at the end of `taskgroup` when waiting for all tasks, when targeting nvptx64. There are no replies at the time of writing.
Commits
- Fixed detecting CUDA compute capability between images (e.g., `sm_60` image and `sm_62` GPU). D131567
- `libelf` will be replaced with LLVM’s ELF handling. D131401
- `dl` will be replaced with LLVM’s dynamic library handling. D131401
- Fixed the driver crashing when trying to output multiple files in device-only mode. D132248
- Added features to extract images from offloading images with `clang-offload-packager`. D129507