Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from July 23 to August 5, 2021.

We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.

Industry News and Conferences

LLVM and Clang

Discussions

  • Luke Kenneth Casson Leighton posted an RFC: “Vector/SIMD ISA Context Abstraction”. Luke is working on SVP64 Cray-like Vector Extensions for the Power ISA, which is being designed for Hybrid CPU, VPU and 3D GPU workloads. One of the problems mentioned is that some ISA designs may lead to a combinatorial explosion in the number of intrinsics, which can be avoided by “separating out ‘scalar base’ from ‘augmentation’ throughout the IR”. Renato Golin replied that, historically, LLVM has tried to express as many operations as possible as native IR instructions to avoid an explosion of intrinsics. However, intrinsics tend to reduce the number of program instructions, so there needs to be some balance.
  • 席致寧 asked about a quick way to add a new instruction to generated PTX files, without having to implement full support for the new instruction in the backend. There were no replies as of this writing.
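
To make the “combinatorial explosion” concern from the first discussion item more concrete, here is a rough counting sketch. The operation names, augmentation categories, and their sizes are hypothetical and chosen only for illustration; they are not taken from the RFC or from SVP64. The sketch compares the number of names needed when every combination gets its own intrinsic with the number of entities needed when the scalar base and each augmentation are described separately.

```python
from itertools import product

# Hypothetical numbers chosen only for illustration; not taken from the RFC.
scalar_ops = ["add", "sub", "mul", "fma"]
augmentations = {
    "element_width": ["i8", "i16", "i32", "i64"],
    "predication": ["none", "masked"],
    "saturation": ["none", "signed", "unsigned"],
}


def flat_intrinsic_names(ops, augs):
    """Return one name per (op, augmentation combination)."""
    names = []
    for op in ops:
        for combo in product(*augs.values()):
            names.append(".".join((op,) + combo))
    return names


def separated_entity_count(ops, augs):
    """Count base ops plus augmentation values when described independently."""
    return len(ops) + sum(len(values) for values in augs.values())


flat = flat_intrinsic_names(scalar_ops, augmentations)
print(len(flat))                                          # 4 * 4 * 2 * 3 = 96
print(separated_entity_count(scalar_ops, augmentations))  # 4 + (4 + 2 + 3) = 13
```

The flat scheme grows with the product of the category sizes, while the separated scheme grows with their sum, which is the kind of gap the RFC's “scalar base” versus “augmentation” split appears to be aimed at.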

Commits

  • HIP switched to using DWARF version 5 by default. D107190
  • It is now possible to force-enable MemCpyOpt with a new LLVM flag -enable-memcpyopt-without-libcalls. For now, only the CUDA frontend opts into it, to better exercise this optimization. D106401
  • A new Attributor pass for deducing AMDGPU-specific attributes was added. D104997
  • The NVPTX matrix operation intrinsics were extended with the ldmatrix.sync.aligned warp-level matrix load instructions introduced in PTX 6.5. D107046
  • Clang learned to preserve ASAN library functions when targeting HIP. D106315
  • A number of GlobalISel enhancements for AMDGPU.

MLIR

Discussions

Commits

  • A populateGpuToLLVMConversionPatterns entry point was added for collecting all GPU to LLVM conversion patterns. D107218
  • Two boolean loading/storing issues were fixed in SPIR-V conversion.
  • A few issues in the SPIR-V module combiner were fixed. D106886
  • MemRef/Math to SPIR-V conversions were split into their own directories and files.

OpenMP (Target Offloading)

Discussions

Commits

External Compilers

LLPC

Mesa