Issue #5

Welcome to the fifth issue of LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from January 22 to February 4, 2021.

We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.

Industry News and Conference Talks

Alyssa Rosenzweig published a second blog post in the series dedicated to investigating the Apple M1 GPU. The project reached a new milestone: being able to draw a triangle with some hand-written machine code. The code is available on GitHub.

LLVM and Clang

Discussions

Sameer Sahasrabuddhe is looking into enabling divergence analysis in the new pass manager and its interaction with loop unswitching passes.
Lowering of memory intrinsics (memcpy, memmove, memset) will be moved from the Combiner to the Legalizer in GlobalISel. This is motivated by the needs of the AMDGPU backend.

Commits

A new intrinsic, llvm.set.rounding, for setting the floating-point rounding mode has finally landed. AMDGPU targets support changing the rounding mode at runtime and can make use of it.
Fixes around CUDA/HIP static variables in host, device, and global functions.
(In-review) Initial plumbing for CUDA sm_86 GPU targets.
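
To illustrate what changing the rounding mode at runtime means in practice, here is a minimal host-side sketch. It uses the standard std::fesetround from <cfenv>, not the llvm.set.rounding intrinsic itself; how a given frontend or target maps one onto the other is not covered here. The helper name divide_with_rounding is hypothetical and exists only for this example.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: divide a by b under the given rounding mode
// (one of the FE_* macros), then restore the previously active mode.
double divide_with_rounding(double a, double b, int mode) {
    const int saved = std::fegetround();

    // volatile keeps the compiler from folding the division at compile
    // time, where the runtime rounding mode would have no effect.
    volatile double x = a;
    volatile double y = b;

    const int rc = std::fesetround(mode);
    assert(rc == 0);  // a non-zero result means the mode is unsupported
    (void)rc;

    volatile double q = x / y;  // rounded according to `mode`

    std::fesetround(saved);
    return q;
}
```

For an inexact quotient such as 1.0 / 3.0, calling the helper with FE_DOWNWARD and with FE_UPWARD yields two results that differ in the last bit, while an exact quotient such as 1.0 / 4.0 is unaffected by the mode.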

MLIR

Discussions

Weiwei expressed interest in contributing to the SPIR-V dialect for graphics use cases and started sending out patches for it.

Commits

Integration tests using the CUDA runner were added to demonstrate async lowering.
Serialization and deserialization for image types are now supported in the SPIR-V dialect.
A few new patterns were added to lower vector ops to their SPIR-V counterparts.

OpenMP (Target Offloading)

Discussions

Commits

An initial AMDGPU toolchain for OpenMP offloading has landed.

External Compilers

LLPC

The standalone SPIR-V to GCN compiler amdllpc can now compile individual shaders into relocatable ELF files. These ELFs can then be bundled as Vulkan Pipeline Cache files, all offline, i.e., on a system without a GPU. The produced caches cannot be loaded by the AMDVLK driver yet.