Files
composable_kernel/example
Qun Lin c3228aaf0d [ck] support remap 32x32 warp tile to 16x16
This PR remaps the 32x32 warp tile to a 16x16 warp tile for all CK kernels in wave32. The logic is the same as in ROCm/composable_kernel#3421, and most of the changes are in the device classes.

To reduce the instance build time, VGPR estimation is implemented in ~10 gridwise classes. To pass all tests in CI, several tests received minor adjustments.
2026-01-12 14:30:09 +08:00