Issue #23
Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from October 29 to November 11 2021.
We welcome your feedback and suggestions. Let us know if we missed anything interesting, or want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.
Industry News and Conferences
- 2021 LLVM Developers’ Meeting is on November 17-19. The agenda features a number of GPU-related sessions:
LLVM and Clang
Discussions
- Johannes Doerfert is gathering feedback on starting a ‘GPU working group’ which would focus on ‘development of GPU-specific optimizations, organization of driver and backend code, dealing with issues like convergent functions,’ etc. The thread received a number of positive replies with some ideas for additional topics of interest.
Commits
- Two new architectures,
spirv32
andspirv64
, were ported from the LLVM SPIR-V backend Khronos repository. This initially reuses the target information and code generation from the existing SPIR support. D109144 - The latest supported CUDA version was bumped to 11.5. D113249
- Debug locations will not be added to code inside AMDGPU function prologue as it is entirely compiler-generated. D113100
- The
AMDGPU_Gfx
calling convention now definess[4:29]
SGPR registers as callee-saved. D111637 - CUDA landed support for assumptions predicates for pointer address spaces, e.g.,
__builtin_assume(__isGlobal(p))
. These predicates are portable across Clang and NVCC. D112041 - Special NVPTX registers were marked as reserved to avoid machine verifier errors. D113008
MLIR
Discussions
Commits
- SPIR-V to LLVM conversion now supports
shufflevector
with static indices. D112161 arith.bitcast
,std.br
, andstd.cond_br
to SPIR-V conversions were added. D112819
OpenMP (Target Offloading)
Discussions
Commits
External Compilers
LLPC
- Middle-end now supports the New Pass Manager but still default to the Legacy Pass Manager. LLPC#1459
oneAPI DPC++
CUDA/HIP support
- Added initial support for the Tensorcore matrix extension.
- CUDA and HIP plugins were renamed to align with the SYCL 2020 specification.
- Fixed preprocessor-only compilation mode for the NVPTX target.
SYCL 2020 support
- Implemented property traits.
- Fixed interoperability kernel API returning invalid kernel bundle.
- Improved performance of
sycl::queue::submit()
for commands without dependencies.
Non-standard extensions
- Fixed a bug in SVM gather/scatter emulation on host for ESIMD extension.
- Implemented
no_offset
compile-time accessor property. - Enhanced matrix extension specification with JIT support for AMX and DPAS ISA.
- The
SPV_INTEL_hw_thread_queries
extension support was enabled by default.
Misc
- Release notes for September were published.