Issue #37
Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from June 18 to July 8, 2022.
We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.
Industry News and Community Events
- The call for proposals for the 2022 LLVM Developers’ Meeting is open until August 30.
LLVM and Clang
Discussions
- The discussion on representing pointers to special resource types continued on Discourse thanks to Joshua Cranmer, who put together an RFC. Alex Bradbury noticed similarities to the problem of supporting WebAssembly GC types. The discussion is scheduled to continue during the upcoming LLVM GPU Working Group meeting on July 15.
- Nicolai Hähnle proposed to ‘make LLVM play nice(r) when used as a shared library in a plugin setting’. The two primary use cases are plugin dynamic loading/unloading and dynamic linking of the LLVM shared object used by one or more plugins. The initial change stack includes the removal of `MagicStatic`s and unregistering `cl::Option`s upon destruction.
Commits
- A large number of patches for the gfx11 AMDGPU in-development target landed.
- `--offload-arch=` now supports multiple comma-separated values. D128206
- Improved the binary handling of the offloading section by adding a new ELF section. D129052
- Introduced `!exclude` metadata to make globals use the `SHF_EXCLUDE` section flag to better support the offloading section. D129151
- Added the `--offloading` option to `llvm-objdump` to display embedded device code in the offloading section, similar to `cuobjdump`. D126904
- Introduced SPIR-V global entity tracking and deduplication infrastructure. D128471
- Added thread/group ID DXIL operations. D127990
- Added support for opaque pointers for `ValueAsMetadata` in `DXILBitcodeWriter`. D127705
- Added a new HIP option: `-fhip-kernel-arg-name`. D128022
MLIR
Discussions
- Bruno Cardoso Lopes and Nathan Lanza proposed an MLIR-based Clang IR. The discussion summary clarified that the initial goal is for it to become part of the regular compilation pipeline, both for codegen and static analysis. A new MLIR C/C++ Frontend Working Group is being formed.
Commits
- Added a shared memory access optimization pass. D127457
- Defined MLIR wrappers around new MFMA intrinsics. D128079
- Added a `--chipset` option to AMDGPUToROCDL. D129228
- Added conversion from `math.round` to SPIR-V GLSL/OpenCL ops. D129236
- Added more comparison directions in `arith.cmpi` to SPIR-V conversion. D128692
- Added InferIntRangeInterface to gpu.launch. D129036
OpenMP (Target Offloading)
Discussions
Commits
- `atomic compare` and `atomic compare capture` now support floating-point variables. D127041, D127042
- Fixed an issue where peer-to-peer memory copy did not work on Nvidia GPUs. D122764
- Heap2Stack (also used to remove globalized locals) is now loop-aware, which often allows placing new `alloca`s in the entry block. This can improve performance and avoid issues with `stacksave`/`restore` intrinsics introduced by the inliner. commit 1, commit 2
- Implemented a unified interface for kernel launches in libomptarget. D128549, D128817
- Improved link times and temporary file handling in the linker wrapper by writing to disk only when necessary. D127246
- Added an extension to `omp variant begin` that mangles function declarations as well. D124624
- Reworked argument handling in the linker wrapper to make adding new arguments easier. commit
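To illustrate the `atomic compare` item above, here is a minimal sketch of a floating-point minimum computed with the OpenMP 5.1 conditional-update form. The helper name `atomic_min` is hypothetical and does not come from the patches. The sketch assumes a compiler with OpenMP 5.1 `atomic compare` support enabled; without OpenMP the pragmas are ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the referenced patches): returns the
// smallest element of `values`, updating a shared float via `atomic compare`.
float atomic_min(const std::vector<float> &values) {
  assert(!values.empty() && "need at least one value");
  float best = values[0];
  const long n = static_cast<long>(values.size());

  // `best` is shared between threads; each update goes through the atomic.
  #pragma omp parallel for shared(best)
  for (long i = 1; i < n; ++i) {
    const float candidate = values[i];
    // Conditional-update form: if (expr < x) { x = expr; }
    #pragma omp atomic compare
    if (candidate < best) { best = candidate; }
  }
  return best;
}
```

Calling `atomic_min({3.5f, 7.25f, 1.5f, 9.0f})` yields the smallest input value regardless of how iterations are distributed across threads, because each comparison and store is performed as a single atomic operation.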
External Compilers
LLPC
- Transition to opaque pointers continues. LLPC#1839