composable_kernel/include/ck_tile/ops/smoothquant
linqunAMD 2713d3df13 [CK_TILE][REGRESSION] Correct blockSize in Generic2dBlockShape (c254f… (#2837)
* [CK_TILE][REGRESSION] Correct blockSize in Generic2dBlockShape (797b107dd90ce)

WarpPerBlock_M * WarpPerBlock_N is not equal to ThreadPerBlock_M * ThreadPerBlock_N / warpSize, so BlockSize should be calculated from WarpPerBlock_M * WarpPerBlock_N.

For compatibility with wave32, a GetBlockSize function is added to calculate the correct size on the host side.

* fix blocksize for all kernels related to Generic2dBlockShape

* remove constexpr for blocks
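
The block-size reasoning above can be sketched as follows. This is a minimal illustration, not the actual ck_tile code: `BlockShapeSketch`, `WarpsFromThreads`, and the numbers in the usage note are hypothetical, and only the name `GetBlockSize` is taken from the commit message (its real signature is not shown there, so the `warp_size` parameter is an assumption).

```cpp
// Hypothetical stand-in for a 2D block shape; NOT the real
// ck_tile::Generic2dBlockShape, whose template parameters differ.
template <int WarpPerBlock_M_, int WarpPerBlock_N_,
          int ThreadPerBlock_M_, int ThreadPerBlock_N_>
struct BlockShapeSketch
{
    static constexpr int WarpPerBlock_M   = WarpPerBlock_M_;
    static constexpr int WarpPerBlock_N   = WarpPerBlock_N_;
    static constexpr int ThreadPerBlock_M = ThreadPerBlock_M_;
    static constexpr int ThreadPerBlock_N = ThreadPerBlock_N_;

    // Block size derived from the warp counts. The warp size is passed in
    // as an argument (assumed signature) so the host can supply 32 or 64.
    static constexpr int GetBlockSize(int warp_size)
    {
        return WarpPerBlock_M * WarpPerBlock_N * warp_size;
    }

    // Warp count implied by the thread counts. This can disagree with
    // WarpPerBlock_M * WarpPerBlock_N when the thread counts were chosen
    // for a different warp size.
    static constexpr int WarpsFromThreads(int warp_size)
    {
        return ThreadPerBlock_M * ThreadPerBlock_N / warp_size;
    }
};
```

With the illustrative shape `BlockShapeSketch<1, 4, 1, 256>`, the thread counts imply 4 warps at a warp size of 64 but 8 warps at a warp size of 32, while the warp counts stay at 1 * 4 = 4. Deriving the block size from the warp counts gives 256 threads for wave64 and 128 for wave32.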

[ROCm/composable_kernel commit: b7a806f244]
2025-09-16 08:47:55 -07:00