mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 02:02:46 +00:00
- Add support for direct store in epilogue instead of cshuffle
- Add padding support for wave transfer without transpose
- Add wave transfer with interleaved layout to support direct store
- Enable new functionalities on GEMMs
- Add optional new functionality support for grouped convolution fwd
- Add some fast instances for grouped convolution fwd with new functionalities (proper tuning needed)
[ROCm/composable_kernel commit: 693ff3bbb3]