Issue #41
Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from August 20 to September 9, 2022.
We welcome your feedback and suggestions. Let us know if we missed anything interesting, or want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.
Industry News and Community Events
- Lei Zhang posted a blog post about the MLIR Vector Dialect and Patterns.
- The Khronos group released the Mesh Shading extension for Vulkan (
VK_EXT_mesh_shader
).
LLVM and Clang
Discussions
Commits
- HLSL/DirectX-related changes:
- Added support for SPIR-V builtin functions, types, and
ExtInst
selection. D123024, D132648 - Large number of changes improving AMDGPU GFX11 support.
MLIR
Discussions
- Diego Caballero posted an RFC on ‘Vector Masking Representation in MLIR’. The proposal includes a new op,
vector.mask
, and two new interfaces:MaskableOp
andMaskingOp
. - gxiaotian asked about an
gpu.all_reduce
operation among threads in a subgroup, as opposed to a work group. There are no replies at the time of writing.
Commits
- For AMDGPU, defined
mfma
operation. D132956 - For AMDGPU, fixed signed/unsigned comparison for
abid
/cbsz
comparison. D133061 - For NVGPU, added Support for
cp_async_zfill
via inline assembly. D132269 - For SPIR-V, added definitions for non-uniform group ops and supported lowering
gpu.shuffle
to them. D133041, D133054 - For SPIR-V, added patterns and utility functions to help lower ops and map memory space to OpenCL. D132424, D132428
- Introduced more folders for SPIR-V ops and handled more corner cases for lowering vector ops to SPIR-V ops. D133167, D133168, D133183
- For
arith
, added initial patterns to emulate wide integer operations with narrower integers supported by the target. D133135, D133136, DD133137
OpenMP (Target Offloading)
Discussions
- Discussions on whether or not
libomptarget
should guarantee backwards compatibility. D133277 - Discussed improving OpenMP device reductions, which should yield 2x-10x performance once complete.
Commits
- OpenMP 5.2 semantics for absent mapping items was implemented, causing unmapped pointers to behave like device pointers. D133447
- Fixed a bug causing
-fsyntax-only
crashing with the new driver. D133161 - Fixed a bug causing the
omp_get_wtime
function to be optimized out. D133360 - Fixed a bug preventing users from compiling with
assert
in the device. D133594 - Added the ability to extract offloading images from other file types. D132607
External Compilers
LLPC
- Made multiple changes advancing support for mesh shaders.