mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-17 19:40:04 +00:00
Add Gemm instances for performance improvement (#1018)
* improve kpad
* more tuning parameters
* f16_f8_fp16
* cut test time
* add f16_f8_fp16
* add f16_f8_f16
* testing instances for skinny cases
* format
* clean
* add fp16_f8_fp16
* clang-format
* add grouped gemm instalces
* fixed profile grouped_gemm
* clean
* clean
* clean
* clean
* clean
* add missing instance func
* fixed inferface
---------
Co-authored-by: Jing Zhang <jizha@amd.com>
Co-authored-by: root <root@sh5-1e707-rc06-38.mkm.dcgpu>
[ROCm/composable_kernel commit: 98fd41f597]
This commit is contained in:
@@ -108,6 +108,10 @@ TEST_F(TestGGemmSplitKInterface_MKNKMN, KLoops)
|
||||
|
||||
// kloops % 2
|
||||
Ks = std::vector<int>{256, 512, 320, 768};
|
||||
EXPECT_FALSE(
|
||||
DefaultGGemmInstance{}.IsSupported(Ms, Ns, Ks, StrideAs, StrideBs, StrideCs, kbatch));
|
||||
|
||||
Ks = std::vector<int>{256, 512, 384, 768};
|
||||
EXPECT_TRUE(
|
||||
DefaultGGemmInstance{}.IsSupported(Ms, Ns, Ks, StrideAs, StrideBs, StrideCs, kbatch));
|
||||
|
||||
|
||||
Reference in New Issue
Block a user