mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 10:09:41 +00:00
* manual control of MAC cluster for improved 2-wave performance
ensure setprio ordering; ensure inner loop size >= local read size
synchronize when there is only a single MAC cluster
* format
* use value field from ck::integral_constant
* roll out inter-wave loop scheduler to c-shuffle gemm variants
will gradually roll out to other applicable device ops once the occasional register spill is resolved
* additional comments
* format
* fix mismatch between inter-wave pipeline and interwave blockwise gemm
* address review feedback
* amend
[ROCm/composable_kernel commit: 76764d8c92]