mirror of https://github.com/ROCm/composable_kernel.git (synced 2026-05-14 10:09:41 +00:00)
* added MPerBlock=32 for MXFP4 GEMM decode
* added two instances for the M>128 scenario
* added one instance
* format
---------
Co-authored-by: mtgu0705 <mtgu@amd.com>
Co-authored-by: felix <felix.li@amd.com>
[ROCm/composable_kernel commit: 0198257d79]