[CK][CK Tile] Improvements for grouped conv fwd tile profiling (#5114)

## Motivation

Improve profiling of grouped convolution forward to allow a better
comparison between CK and CK Tile.

## Technical Details

- Include preprocessing time in the CK Tile timing
- Add cache flushing to the conv fwd profiler
- Switch configs to builder reflect
- Add KPerXdl deduction
- Add non-grouped ported instances
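
To illustrate why cache flushing matters for a fair timing comparison, here is a minimal host-only sketch of a timing loop with cold iterations, timed repeats, and an optional flush before each timed run. It is not the ck_tile implementation: `TimingConfig`, `TimingResult`, and `time_with_optional_flush` are hypothetical names, and only the default values (5 cold iterations, 50 repeats) are taken from the diff in this commit.

```cpp
#include <chrono>
#include <functional>

// Hypothetical stand-in for the timing-related fields of a stream config.
// The defaults mirror the values passed in this commit's diff.
struct TimingConfig
{
    int cold_iters   = 5;     // warm-up runs, not timed
    int nrepeat      = 50;    // timed runs
    bool flush_cache = false; // evict cached data before each timed run
};

struct TimingResult
{
    double avg_ms    = 0.0;
    int kernel_calls = 0;
    int flush_calls  = 0;
};

// Runs `kernel` cold_iters times untimed, then nrepeat times timed.
// When flush_cache is set, `flush` runs before each timed iteration and
// outside the timed region, so every timed run starts from a cold cache
// without the flush cost being counted.
inline TimingResult time_with_optional_flush(const TimingConfig& cfg,
                                             const std::function<void()>& kernel,
                                             const std::function<void()>& flush)
{
    using clock = std::chrono::steady_clock;
    TimingResult result;

    for(int i = 0; i < cfg.cold_iters; ++i)
    {
        kernel();
        ++result.kernel_calls;
    }

    std::chrono::duration<double, std::milli> total{0.0};
    for(int i = 0; i < cfg.nrepeat; ++i)
    {
        if(cfg.flush_cache)
        {
            flush();
            ++result.flush_calls;
        }
        const auto start = clock::now();
        kernel();
        total += clock::now() - start;
        ++result.kernel_calls;
    }

    if(cfg.nrepeat > 0)
    {
        result.avg_ms = total.count() / cfg.nrepeat;
    }
    return result;
}
```

With the defaults and `flush_cache = true`, the kernel callable runs 55 times and the flush callable 50 times. Without flushing, repeated runs on the same tensors can be served from cache and report a lower time than a cold run would.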

## Test Plan

test_grouped_convnd_fwd_tile

## Test Result

pass

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

AICK-786
Bartłomiej Kocot
2026-03-11 23:38:15 +01:00
committed by GitHub
parent 622122155a
commit 1972d39410
24 changed files with 2375 additions and 1874 deletions


@@ -97,7 +97,16 @@ int call_profiler(const ckt::Args<SIGNATURE>& args, bool time_kernel)
     std::string op_name;
     bool valid;
     std::tie(valid, avg_time, op_name) = ckp::run_grouped_conv_forward_tile_algs(
-        args, inputs.get(), outputs.get(), ck_tile::stream_config{nullptr, time_kernel, 0, 5, 50});
+        args,
+        inputs.get(),
+        outputs.get(),
+        ck_tile::stream_config{nullptr,
+                               time_kernel,
+                               0 /*log_level*/,
+                               5 /*cold_iters*/,
+                               50 /*nrepeat_*/,
+                               true /*is_gpu_timer_*/,
+                               time_kernel /*flush_cache*/});
     if(time_kernel)
     {
         std::cout << "Best configuration parameters:" << "\nname: " << op_name