mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-04-19 22:39:03 +00:00
fix transpose_vectors logic for 2x2 8-bit tiles
add a test which goes through this code path.
factor out constexpr'd cases into smaller functions.
add inline docs about the data movement
impact: gemms with 8-bit non-rcr inputs on gfx942