mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-06-29 19:28:33 +00:00
[rocm-libraries] ROCm/rocm-libraries#8259 (commit df03f10)
Add cluster launch in test ck_tile mx gemm tdm wmma

## Motivation

Add a cluster launch test to test_ck_tile_mx_gemm_pipeline_tdm_wmma on gfx1250, so that performance can be checked on gfx1250 hardware.

## Technical Details

- Added an out-of-bounds guard in RunGemm of MxGemmKernel to skip blocks padded by cluster alignment.
- Added ClusterEnable/ClusterDisable aliases and extended the tuple in test_mx_gemm_pipeline_kernel_types.hpp with two kernel types that use ClusterEnable, one for F8 CompTDMV1 and one for F8 CompTDMV2. The existing F4 non-ClusterLaunch kernel types have an issue that still needs to be fixed, so this PR does not include F4 cases.
- Read ClusterLaunch from the tuple in test_mx_gemm_pipeline_util.hpp.
- Updated invoke_mx_gemm to branch on ClusterLaunch: added cluster size constants, and switched the GemmShape type, the TilePartitioner type, and the kernel launch call.

## Test Plan

Tested the changes on gfx1250 FFM.

## Test Result

The added kernel types (instances) passed the tests on gfx1250 FFM.

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
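
The ClusterLaunch branching described under Technical Details can be pictured with a small standalone sketch. Everything here is an assumption made for illustration: the aliases are modelled as `std::true_type`/`std::false_type`, the tag structs and the tuple layout are hypothetical, and the cluster size of 2 is a placeholder. The real definitions live in the test headers named above and may differ.

```cpp
#include <tuple>
#include <type_traits>

// Hypothetical stand-ins for the aliases named in the commit message.
using ClusterEnable  = std::true_type;
using ClusterDisable = std::false_type;

// Hypothetical tags; the real kernel-type tuples carry more entries.
struct F8Tag {};
struct CompTDMV1Tag {};

// Assumed layout: <element type, pipeline, cluster flag>.
using WithCluster    = std::tuple<F8Tag, CompTDMV1Tag, ClusterEnable>;
using WithoutCluster = std::tuple<F8Tag, CompTDMV1Tag, ClusterDisable>;

// Read the flag from the tuple and branch at compile time.
// The value 2 is a placeholder cluster size, not taken from the PR.
template <typename KernelTypes>
constexpr int cluster_size_m()
{
    constexpr bool ClusterLaunch = std::tuple_element_t<2, KernelTypes>::value;
    if constexpr(ClusterLaunch)
        return 2;
    else
        return 1;
}
```

Because the flag is a type in the tuple, the non-cluster instances compile exactly as before and only the instances tagged with the enable alias pick up the cluster-specific types and launch path.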
committed by assistant-librarian[bot]
parent 359f664b25
commit 276863ca87
@@ -231,6 +231,13 @@ struct MxGemmKernel
             bs_scale_ptr[i] = reinterpret_cast<const int32_t*>(kargs.bs_scale_ptr[i]);
         });
 
+        // cluster launch pads grid to cluster boundaries; skip out-of-bound blocks
+        if constexpr(BaseKernel::ClusterLaunch)
+        {
+            if(block_idx_m >= kargs.M || block_idx_n >= kargs.N)
+                return;
+        }
+
         const auto& as_block_window = BaseKernel::MakeABlockWindows(
             as_ptr, kargs, splitk_batch_offset.splitted_k, block_idx_m);
         const auto& bs_block_window = BaseKernel::MakeBBlockWindows(
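
To see why the guard in this hunk is needed, the host-side sketch below models a grid that is rounded up to whole clusters and counts which blocks still have work. It is an illustration, not code from composable_kernel: all helper names are hypothetical, and it assumes `block_idx_m` and `block_idx_n` are element offsets (tile index times tile size), which is what the comparison against `kargs.M` and `kargs.N` suggests.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of composable_kernel.
constexpr std::int64_t ceil_div(std::int64_t a, std::int64_t b) { return (a + b - 1) / b; }

// Round x up to the next multiple of m.
constexpr std::int64_t round_up(std::int64_t x, std::int64_t m) { return ceil_div(x, m) * m; }

// Tiles launched along one dimension once the grid is padded to whole clusters.
constexpr std::int64_t padded_tiles(std::int64_t extent, std::int64_t tile, std::int64_t cluster)
{
    return round_up(ceil_div(extent, tile), cluster);
}

// Same predicate as the guard: a block whose starting element offset is at or
// beyond the problem extent exists only because of cluster padding.
constexpr bool is_padding_block(std::int64_t block_idx_m,
                                std::int64_t block_idx_n,
                                std::int64_t M,
                                std::int64_t N)
{
    return block_idx_m >= M || block_idx_n >= N;
}

// Count the blocks of the padded grid that pass the guard and do real work.
constexpr std::int64_t count_active_blocks(std::int64_t M, std::int64_t N,
                                           std::int64_t tile_m, std::int64_t tile_n,
                                           std::int64_t cluster_m, std::int64_t cluster_n)
{
    std::int64_t active = 0;
    for(std::int64_t tm = 0; tm < padded_tiles(M, tile_m, cluster_m); ++tm)
        for(std::int64_t tn = 0; tn < padded_tiles(N, tile_n, cluster_n); ++tn)
            if(!is_padding_block(tm * tile_m, tn * tile_n, M, N))
                ++active;
    return active;
}
```

With M = 300, N = 256, 128x128 tiles and a 2x2 cluster, the M dimension needs 3 tiles but is padded to 4, so the grid holds 8 blocks and only 6 of them cover part of the output. Without the early return, the two extra blocks would build block windows that start past the end of the matrices.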