Introduces the new partitioner to implement the reduction StreamK kernel. (#3107)

* Introduces the new partitioner to implement the reduction StreamK kernel

* Add more doc text to functions

* Add persistent-dp option to streamk example

* Update example/ck_tile/40_streamk_gemm/README.md

[ROCm/composable_kernel commit: 5abe4109e0]
Cong Ma
2025-11-04 10:32:17 -07:00
committed by GitHub
parent 1a8f824938
commit 0343c4e1fe
8 changed files with 298 additions and 75 deletions


@@ -110,6 +110,10 @@ CK_TILE_HOST double timing_loop_impl(TimerType timer,
 {
     for(int i = 0; i < s.cold_niters_; i++)
     {
+        if constexpr(!std::is_same_v<PreprocessFunc, std::nullptr_t>)
+        {
+            preprocess();
+        }
         callables_func();
     }
     // Only profile preprocess if it's provided
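
The hunk above makes the warm-up (cold) iterations call `preprocess()` before each launch, but only when a preprocess callable was actually supplied. A minimal standalone sketch of that compile-time dispatch follows. It assumes C++17, and `warmup_config` and `run_cold_iterations` are hypothetical names introduced for illustration; they are not part of the composable_kernel API, and the real `timing_loop_impl` takes additional arguments (such as the timer) that are omitted here.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the stream configuration object; only the
// field that the hunk reads (cold_niters_) is modeled here.
struct warmup_config
{
    int cold_niters_ = 3;
};

// Runs the untimed warm-up iterations. Passing nullptr as `preprocess`
// removes the preprocess call at compile time, mirroring the
// `if constexpr` check in the hunk.
template <typename PreprocessFunc, typename CallableFunc>
void run_cold_iterations(const warmup_config& s,
                         [[maybe_unused]] PreprocessFunc preprocess,
                         CallableFunc callables_func)
{
    for(int i = 0; i < s.cold_niters_; i++)
    {
        if constexpr(!std::is_same_v<PreprocessFunc, std::nullptr_t>)
        {
            preprocess(); // only instantiated when a real callable is passed
        }
        callables_func();
    }
}
```

Calling `run_cold_iterations(cfg, nullptr, kernel)` compiles the loop without any preprocess call, while passing a lambda runs it once per warm-up iteration. Presumably this matters for kernels whose correctness depends on state being reset before every launch, though the commit message does not state the motivation.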