[CK][CK Tile] Fix kbatch check in grouped conv and gemm kernels (#5555)

## Motivation

Fix the kbatch check in the grouped conv and GEMM kernels so that it
allows a K tail.

## Technical Details

Round up K / KPerXdl and compare the result against k_batch, instead of
comparing K against KPerXdl * k_batch, so that a K tail is allowed.
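
The difference between the two checks can be sketched in isolation. This is a minimal standalone illustration, not the kernel code: `div_ceil` is a hypothetical stand-in for `ck_tile::integer_divide_ceil`, the function names are made up for this example, and the value 16 for KPerXdl is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ck_tile::integer_divide_ceil:
// ceil(a / b) for positive integers.
constexpr std::int64_t div_ceil(std::int64_t a, std::int64_t b)
{
    return (a + b - 1) / b;
}

// Old check: rejects unless K covers a full KPerXdl step per k_batch slice.
constexpr bool old_check_rejects(std::int64_t K, std::int64_t k_per_xdl, std::int64_t k_batch)
{
    return K < k_per_xdl * k_batch;
}

// New check: counts KPerXdl steps with rounding up, so a partial (tail)
// step still counts as work for one k_batch slice.
constexpr bool new_check_rejects(std::int64_t K, std::int64_t k_per_xdl, std::int64_t k_batch)
{
    return div_ceil(K, k_per_xdl) < k_batch;
}

inline void run_examples()
{
    // K = 40, KPerXdl = 16 (assumed), k_batch = 3:
    // old: 40 < 48, rejected; new: ceil(40 / 16) = 3, not < 3, accepted.
    assert(old_check_rejects(40, 16, 3));
    assert(!new_check_rejects(40, 16, 3));

    // K = 32: only 2 steps for 3 slices, so both checks still reject.
    assert(old_check_rejects(32, 16, 3));
    assert(new_check_rejects(32, 16, 3));

    // K = 48: exactly 3 full steps, both checks accept.
    assert(!old_check_rejects(48, 16, 3));
    assert(!new_check_rejects(48, 16, 3));
}
```

The two checks only disagree when the last KPerXdl step is partial, which is the tail case this change is meant to admit.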

## Test Plan

test_grouped_convnd_bwd_weight_tile

## Test Result

passed locally

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
This commit is contained in:
Bartłomiej Kocot
2026-03-21 23:55:24 +01:00
committed by GitHub
parent dd05904d20
commit 0b7ae0ac65
4 changed files with 10 additions and 7 deletions

@@ -418,7 +418,8 @@ struct UniversalGemmKernel
}
}
-if(kargs.K < GemmPipeline::BlockGemmShape::WarpTile::at(number<2>{}) * kargs.k_batch)
+if(integer_divide_ceil(kargs.K, GemmPipeline::BlockGemmShape::WarpTile::at(number<2>{})) <
+   kargs.k_batch)
{
if(ck_tile::EnvIsEnabled(CK_TILE_ENV(CK_TILE_LOGGING)))
{

@@ -574,7 +574,9 @@ struct GroupedConvolutionBackwardWeightKernel
}
}
-if(kargs.GemmK < TilePartitioner::BlockGemmShape::WarpTile::at(number<2>{}) * kargs.k_batch)
+if(integer_divide_ceil(kargs.GemmK,
+                       TilePartitioner::BlockGemmShape::WarpTile::at(number<2>{})) <
+   kargs.k_batch)
{
LogInfo("KBatch is too large, part of GPU wouldn't be utilized! GemmK: ",
kargs.GemmK,