WMMA support for GEMM reduce (#2823)

Added a GEMM + reduce instance library for RDNA4. This includes:

- A new device implementation that runs the GEMM and reduction kernel
- Instances for WMMA (parity with XDL)
- Examples for WMMA (parity with XDL)
- Tests for both the existing XDL and the WMMA implementations
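
For readers unfamiliar with the operation, the following is a minimal CPU sketch of what "GEMM + reduce" computes. It is not code from this commit: the name `reference_gemm_reduce`, the `GemmReduceResult` struct, the row-major layout, and the choice of a row-wise sum as the reduction are all assumptions made for illustration. The actual device implementation and its supported reductions may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not part of the commit.
// Computes C = A * B (all matrices row-major) and, in the same pass,
// the sum of each row of C.
struct GemmReduceResult
{
    std::vector<float> c;       // M x N output matrix, row-major
    std::vector<float> row_sum; // M values, one per row of C
};

GemmReduceResult reference_gemm_reduce(const std::vector<float>& a, // M x K
                                       const std::vector<float>& b, // K x N
                                       std::size_t M,
                                       std::size_t N,
                                       std::size_t K)
{
    // Precondition: the buffers match the stated dimensions.
    assert(a.size() == M * K);
    assert(b.size() == K * N);

    GemmReduceResult out{std::vector<float>(M * N, 0.0f), std::vector<float>(M, 0.0f)};

    for(std::size_t m = 0; m < M; ++m)
    {
        for(std::size_t n = 0; n < N; ++n)
        {
            // Dot product of row m of A with column n of B.
            float acc = 0.0f;
            for(std::size_t k = 0; k < K; ++k)
                acc += a[m * K + k] * b[k * N + n];

            out.c[m * N + n] = acc;
            // Reduction over the N dimension, accumulated alongside the GEMM.
            out.row_sum[m] += acc;
        }
    }
    return out;
}
```

A reference of this shape is the kind of thing a test could compare device output against, for XDL and WMMA alike, since both are expected to produce the same results.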

[ROCm/composable_kernel commit: b25d4d684a]
Wojciech Laskowski
2025-09-12 21:36:43 +02:00
committed by GitHub
parent 8c0cdebe63
commit f2edb06bb0
27 changed files with 1911 additions and 89 deletions

@@ -248,6 +248,7 @@ add_subdirectory(gemm_universal)
add_subdirectory(gemm_b_scale)
add_subdirectory(gemm_universal_streamk)
add_subdirectory(gemm_reduce)
add_subdirectory(gemm_universal_reduce)
add_subdirectory(batched_gemm)
add_subdirectory(batched_gemm_reduce)
add_subdirectory(batched_gemm_gemm)