Mirror of https://github.com/ROCm/composable_kernel.git, synced 2026-05-14 10:09:41 +00:00
* change CShuffle size
* added MXFP4 MoE async buffer loading without B preshuffle
* added MX MoE B shuffling and scale shuffling (async loads)
* minor fix
---------
Co-authored-by: mtgu0705 <mtgu@amd.com>
[ROCm/composable_kernel commit: 7998ae8969]