mirror of https://github.com/ROCm/composable_kernel.git
synced 2026-05-11 08:50:17 +00:00
* added MPerBlock=32 for MXFP4 GEMM decode
* added two instances for the M>128 scenario
* added 1 instance
* format

---------

Co-authored-by: mtgu0705 <mtgu@amd.com>
Co-authored-by: felix <felix.li@amd.com>