mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-04 21:51:28 +00:00
fix transpose_vectors logic for 2x2 8-bit tiles
add a test which goes through this code path.
factor out constexpr'd cases into smaller functions.
add inline docs about the data movement
impact: GEMMs with 8-bit inputs in non-RCR layouts on gfx942
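
For reference, the data movement for a 2x2 tile of 8-bit elements can be sketched in plain C++ as below. This is a minimal illustration, not the code in this change: `transpose_2x2_u8` is a hypothetical name, and the packing (two elements per 16-bit word, element 0 in the low byte) is an assumption made for the example.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper for illustration only; not the composable_kernel
// implementation of transpose_vectors.
//
// Assumed packing: each uint16_t holds one row of the 2x2 tile,
// element 0 in the low byte and element 1 in the high byte.
//
//   row0 = [a0, a1]          col0 = [a0, b0]
//   row1 = [b0, b1]   --->   col1 = [a1, b1]
inline std::array<std::uint16_t, 2> transpose_2x2_u8(std::uint16_t row0,
                                                     std::uint16_t row1)
{
    // Low bytes of both rows form the first output vector.
    const auto col0 = static_cast<std::uint16_t>(
        (row0 & 0x00FFu) | ((row1 & 0x00FFu) << 8));

    // High bytes of both rows form the second output vector.
    const auto col1 = static_cast<std::uint16_t>(
        ((row0 >> 8) & 0x00FFu) | (row1 & 0xFF00u));

    return {col0, col1};
}
```

Applying the helper twice returns the original rows, which is a cheap property to check in a unit test for this kind of code path.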