mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 18:17:44 +00:00
* Preshuffle AQ matrix in block scale gemm
* turns the output to fp16. Increase the repetition time.
---------
Co-authored-by: ThomasNing <thomas.ning@amd.com>
[ROCm/composable_kernel commit: 452791a3ba]