[rocm-libraries] ROCm/rocm-libraries#5050 (commit 033dad7)

[CK TILE] Skip work if any of Grouped GEMM groups M/N/K are
 zero. (#5050)

## Motivation

In MoE workloads it is common for some experts to receive zero tokens,
which leaves some GEMM dimensions equal to zero. Currently we handle this
case only for non-persistent kernels, where all GEMM information is
available on the host beforehand and is validated when the kernel
arguments are created. For the "dynamic" input path (persistent kernel),
this information is not available before kernel launch, so it has to be
validated during kernel execution. The goal of this change is to add that
validation.

## Technical Details

On the persistent kernel path, skip a group's work when any of its M, N,
or K dimensions is zero.
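
The skip condition can be modeled on the host as a minimal sketch. This is not the CK Tile kernel code: `GroupDims`, `is_empty_group`, and `count_output_tiles` are hypothetical names introduced only for illustration, and the loop merely imitates how a persistent kernel might walk over groups and ignore the empty ones.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-group problem description; not the CK Tile argument struct.
struct GroupDims
{
    std::int64_t M;
    std::int64_t N;
    std::int64_t K;
};

// A group has no work when any of its GEMM dimensions is zero.
inline bool is_empty_group(const GroupDims& g)
{
    return g.M == 0 || g.N == 0 || g.K == 0;
}

// Host-side model of a persistent loop: walk every group, skip empty ones,
// and count the output tiles that would be computed.
inline std::int64_t count_output_tiles(const std::vector<GroupDims>& groups,
                                       std::int64_t tile_m,
                                       std::int64_t tile_n)
{
    assert(tile_m > 0 && tile_n > 0);
    std::int64_t tiles = 0;
    for(const auto& g : groups)
    {
        if(is_empty_group(g))
            continue; // e.g. an expert that received zero tokens
        const std::int64_t tiles_m = (g.M + tile_m - 1) / tile_m; // ceil(M / tile_m)
        const std::int64_t tiles_n = (g.N + tile_n - 1) / tile_n; // ceil(N / tile_n)
        tiles += tiles_m * tiles_n;
    }
    return tiles;
}
```

In this model the check runs per group inside the loop rather than once up front, which mirrors the motivation above: on the dynamic path the dimensions are only known once the kernel is already running.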

## Test Plan

Add unit tests that cover "dynamic" inputs with zero dimensions on the
persistent kernel execution path.

## Test Result

All tests pass.

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
Adam Osewski
2026-03-12 13:29:14 +00:00
committed by assistant-librarian[bot]
parent 2c3f9bfa52
commit b09ce811d5
6 changed files with 186 additions and 52 deletions

@@ -1,7 +1,14 @@
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
# Currently ck_tile is only built on gfx9
if(GPU_TARGETS MATCHES "gfx9|gfx11|gfx12")
add_gtest_executable(test_ck_tile_grouped_gemm test_grouped_gemm.cpp)
add_custom_target(test_ck_tile_grouped_gemm)
add_gtest_executable(test_ck_tile_grouped_gemm_f16 test_grouped_gemm_f16.cpp)
add_gtest_executable(test_ck_tile_grouped_gemm_bf16 test_grouped_gemm_bf16.cpp)
add_dependencies(test_ck_tile_grouped_gemm
test_ck_tile_grouped_gemm_f16
test_ck_tile_grouped_gemm_bf16)
endif()