composable_kernel/include/ck_tile/ops/gemm/kernel
Aviral Goel 7d604924b2 [rocm-libraries] ROCm/rocm-libraries#8531 (commit 6851169)
[CK_TILE] Use launched block size for GEMM occupancy query (#8531)

The grouped, grouped-quant, and stream-k GEMM kernels were passing
`kBlockSize` to the occupancy query, but on wave32 (gfx1250) we actually
launch `kBlockSize/2` threads per block. The query therefore returned an
occupancy that was too low, and the persistent/stream-k grid ended up
undersized.

Just pass `BlockSize().x` like the universal and flatmm kernels already
do. No-op on wave64.
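
To make the effect concrete, here is a rough sketch of why the queried block size matters. It is a toy model, not the CK_TILE code or the HIP occupancy API: every function name below is hypothetical, and the hardware numbers (64 compute units, 1024 resident threads per compute unit, `kBlockSize` of 256) are made-up assumptions chosen only to keep the arithmetic easy to follow.

```cpp
#include <cassert>

// Hypothetical toy occupancy model: how many blocks of `block_size` threads
// fit on one compute unit that can hold `max_threads_per_cu` resident threads.
// The real kernels ask the runtime for this; the model only shows the trend.
constexpr int toy_blocks_per_cu(int block_size, int max_threads_per_cu)
{
    return max_threads_per_cu / block_size;
}

// Persistent / stream-k style grid sizing: one block per available slot.
constexpr int toy_persistent_grid(int num_cus, int blocks_per_cu)
{
    return num_cus * blocks_per_cu;
}

// Hypothetical stand-in for the launched block size described above:
// half of kBlockSize on wave32, kBlockSize itself on wave64.
constexpr int launched_block_size(int k_block_size, bool is_wave32)
{
    return is_wave32 ? k_block_size / 2 : k_block_size;
}

// Grid size that results from querying occupancy with a given block size.
inline int toy_grid_for_query(int num_cus, int max_threads_per_cu, int queried_block_size)
{
    assert(queried_block_size > 0 && "block size must be positive");
    return toy_persistent_grid(num_cus,
                               toy_blocks_per_cu(queried_block_size, max_threads_per_cu));
}

// Assumed numbers: kBlockSize = 256, 64 CUs, 1024 resident threads per CU.
// Querying with kBlockSize on wave32 sees half the slots that actually exist.
static_assert(toy_blocks_per_cu(256, 1024) == 4, "query with kBlockSize");
static_assert(toy_blocks_per_cu(launched_block_size(256, true), 1024) == 8,
              "query with the launched wave32 block size");
// On wave64 both queries use the same block size, so the change is a no-op.
static_assert(launched_block_size(256, false) == 256, "wave64 is unchanged");
```

Under these assumed numbers, `toy_grid_for_query(64, 1024, 256)` gives a grid of 256 blocks, while querying with the launched wave32 size, `toy_grid_for_query(64, 1024, 128)`, gives 512.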

Verified that it builds and runs correctly on gfx1250 (grouped gemm) and
builds on gfx950 (stream-k).
2026-07-01 16:42:59 +00:00