Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from May 14 to June 3 2022.

We welcome your feedback and suggestions. Let us know if we missed anything interesting, or want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.

Industry News and Community Events

LLVM and Clang

Discussions

  • The LLVM GPU Working Group met on May 20 and discussed: the impact of opaque pointer os GPU code generation targets, memory kind operand on fence instructions, new driver linking CUDA upstream with LTO and applying to HIP / SYCL, and address-space-aware alias analysis. The meeting notes are available in the agenda doc.
  • The discussion on ‘Setting version, environment, and extensions for SPIR-V target’ continued. Jaebaek Seo and Nicolai Hähnle mentioned the problem of features enabled by either vendor-specific and cross-vendor extensions.

Commits

  • The SPIR-V target now supports setting the SPIR-V version via target triple, e.g. spirv64v1.4-unknown-unknown. D124776
  • Added the -Xoffload-linker flag to forward linker arguments to the device link job. D126226
  • Added the --offload-link flag to instruct the driver to use the linker-wrapper. D126398
  • The new offloading driver now only builds the offloading actions with a source input. D125705
  • Introduced a fallback scheme for externalizing static device variables in CUDA / HIP. D125904
  • New HLSL and DirectX features:
    • Vector types were enabled for HLSL. D125052
    • HLSL target profile parsing was fixed. D125585
    • Added DXContainer (object file format for DXIL shaders) parsing code. D124804
    • Initial support for YAML representation of DXContainer files. D123944
  • Added new AMDGPU intrinsics: llvm.amdgcn.{raw|struct}.buffer.load.lds and llvm.amdgcn.global.load.lds intrinsic. D124884, D125279

MLIR

Discussions

Commits

  • GpuParallelLoopMapping was exposed as a non-test pass. D126199
  • Fixed capability check for 64-bit element types in the SPIR-V dialect. D12656
  • Added an option to lower NvGpu dialect ops during the VectorToGPU conversion pass. D122940

OpenMP (Target Offloading)

Discussions

  • A discussion on whether or not OpenMP should support updates to static variables on the device.

Commits

  • Added a new target to build the device runtime as a static library. D125315
  • Changed device-LTO to use a static library version of the OpenMP runtime. D125333
  • atomic compare and atomic compare capture are now supported. D120290, D120007, D118632, D120200, D116261, D118547, D116637. Only integer variables are supported for now, floating point will be supported soon.

External Compilers

LLPC