Issue #5
Welcome to the fifth issue of LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from January 22 to February 4, 2021.
We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.
Industry News and Conference Talks
- Alyssa Rosenzweig published a second blog post in the series dedicated to investigating the Apple M1 GPU. The project reached a new milestone: being able to draw a triangle with some hand-written machine code. The code is available on GitHub.
LLVM and Clang
Discussions
- Sameer Sahasrabuddhe is looking into enabling divergence analysis in the new pass manager and its interaction with loop unswitching passes.
- Lowering of memory intrinsics (`memcpy`, `memmove`, `memset`) will be moved from `Combiner` to `Legalizer` in `GlobalISel`. This is motivated by the needs of the AMDGPU backend.
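
To make "lowering a memory intrinsic" concrete: for a small copy whose size is known at compile time, a backend can replace the call with a fixed sequence of loads and stores. The C++ sketch below mimics that expansion at the source level for a 16-byte copy. The function name `copy16_expanded` and the choice of two 8-byte accesses are illustrative assumptions, not what GlobalISel or the AMDGPU backend actually emits.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical illustration: a 16-byte copy expanded into two 8-byte
// load/store pairs, the kind of sequence a lowering step may produce for a
// small, constant-size memcpy. Non-overlapping buffers are assumed.
inline void copy16_expanded(void *dst, const void *src) {
  const unsigned char *s = static_cast<const unsigned char *>(src);
  unsigned char *d = static_cast<unsigned char *>(dst);

  std::uint64_t lo, hi;
  // "Loads": fixed-size memcpy into locals is the portable way to express
  // possibly unaligned 8-byte loads in C++.
  std::memcpy(&lo, s, sizeof lo);
  std::memcpy(&hi, s + 8, sizeof hi);
  // "Stores": write both halves back out to the destination.
  std::memcpy(d, &lo, sizeof lo);
  std::memcpy(d + 8, &hi, sizeof hi);
}
```

The result is byte-for-byte identical to `std::memcpy(dst, src, 16)`; only the shape of the operation changes, which is the property a lowering step has to preserve.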
Commits
- A new intrinsic `llvm.set.rounding` for setting the floating-point rounding mode has finally landed. AMDGPU targets support changing the rounding mode at runtime and can make use of it.
- Fixes around CUDA/HIP static variables in host, device, and global functions.
- (In-review) Initial plumbing for CUDA sm_86 GPU targets.
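
For readers unfamiliar with dynamic rounding modes: the LLVM Language Reference describes `llvm.set.rounding` as similar to the C library function `fesetround`. The C++ sketch below uses `std::fesetround` to show the observable effect of switching modes at runtime on a host CPU. The helper name `divide_with_rounding` is hypothetical, and the example does not demonstrate whether any particular front end or target lowers `fesetround` to the new intrinsic.

```cpp
#include <cfenv>

// Hypothetical helper: computes a / b under the requested rounding mode
// (for example FE_UPWARD or FE_DOWNWARD), then restores the previous mode.
// The volatile operands keep the compiler from folding the division at
// compile time, where the runtime rounding mode would have no effect.
inline double divide_with_rounding(double a, double b, int mode) {
  const int saved = std::fegetround();  // remember the caller's mode
  std::fesetround(mode);                // switch mode at runtime
  volatile double x = a;
  volatile double y = b;
  volatile double q = x / y;            // rounded according to `mode`
  std::fesetround(saved);               // restore the caller's mode
  return q;
}
```

Because 1/3 is not exactly representable in binary floating point, `divide_with_rounding(1.0, 3.0, FE_UPWARD)` returns a value strictly greater than the same call with `FE_DOWNWARD`.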
MLIR
Discussions
- Weiwei expressed interest in contributing to the SPIR-V dialect for graphics use cases and has started sending out patches for it.
Commits
- Integration tests using the CUDA runner were added to demonstrate async lowering.
- Serialization and deserialization for image types are now supported in the SPIR-V dialect.
- A few new patterns were added to lower vector ops to their SPIR-V counterparts.
OpenMP (Target Offloading)
Discussions
Commits
- An initial AMDGPU toolchain for OpenMP offloading has landed.
External Compilers
LLPC
- The standalone SPIR-V to GCN compiler `amdllpc` can now compile individual shaders into relocatable ELF files. These ELFs can then be bundled into Vulkan Pipeline Cache files, all offline, i.e., on a system without a GPU. The produced caches cannot be loaded by the AMDVLK driver yet.