* simplify karg in device/grid split-k op
* fix mk_kn_mn instances
* add more instances
* B2C with 3D grid for KSplit
* Remove unused code.
* Use default B2C (3D grid) in grid gemm v2r4r2.
* Device gemm splitk use B2C map.
* Device GroupedGemmXdlSplitKCShuffle
* Example for GroupedGemm Xdl SplitK
* Introduce Device GroupedGemmSplitK
* Fix updating kbatch size.
* Add instance mk-nk-mn
* Enable setting kbatch in profiler.
* Add GGemmSplitK mk-kn-mn instances
* Add more instances & split into multiple files.
* minor fix
* tuning
* clean
* disable failing instances
* use pipe v2
* Ignore arg on unsupported arch.
* fix warning
---------
Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: zjing14 <zhangjing14@gmail.com>
Co-authored-by: Jing Zhang <jizhan@amd.com>
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
[ROCm/composable_kernel commit: 8bb2bb4a05]