Commit Graph

14 Commits

Author SHA1 Message Date
Johannes Graner
17b93a34b3 [CK Profiler] Instance selection for grouped conv profilers (#4800)
## Motivation

This PR adds instance selection support for ckProfiler grouped
convolution operations (forward, backward data, backward weight),
allowing users to run specific kernel instances rather than sweeping all
available instances.

When profiling or debugging convolution kernels, users often need to
test specific kernel configurations without running the full instance
sweep. This is particularly useful for:
- Debugging a specific failing instance
- Profiling a known-best configuration
- Quick validation during development

## Technical Details

**Features added**:
- `--instance <id>` flag to run only the N-th valid instance (0-indexed)
- `--list-instances` flag to list all valid instances without running
any kernels
- Named arguments can appear anywhere on the command line
- Best instance index is now printed with results for reference
- Python script support via `-ii` / `--instance_index` arguments

**Design decisions**:
- Named arguments (`--instance`, `--list-instances`) instead of
positional to avoid conflicts with existing parameters
- Instance index refers to the N-th valid instance (0-indexed), not the
global instance index
- Auto-disable verification when `--list-instances` is used for fast
enumeration
- Shared utilities in `profiler_arg_utils.hpp` to deduplicate parsing
logic

## Test Plan

Manual testing with various scenarios:

List all valid instances:
```bash
./bin/ckProfiler grouped_conv_fwd <usual args> --list-instances
```

Run only instance 5:
```bash
./bin/ckProfiler grouped_conv_fwd <usual args> --instance 5
```

Test cases:
- Single instance selection
- List instances mode
- Out-of-bounds instance index (verified warning messages)
- No instance flag (runs all instances - default behavior)
- All three operations (fwd, bwd_data, bwd_weight)

## Test Result

All test scenarios passed:
- Instance selection correctly filters kernel executions
- List mode enumerates valid instances without running kernels
- Invalid indices produce appropriate warnings without crashing
- Default behavior (all instances) unchanged when flags not provided
- Consistent behavior across all three grouped convolution operations

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
2026-03-03 07:31:47 -08:00
Aviral Goel
9ec4b67288 chore(copyright): update copyright header for script directory (#3184)
* chore(copyright): update copyright header for tile_engine directory

* chore(copyright): update copyright header for script directory

---------

Co-authored-by: Vidyasagar Ananthan <vanantha@amd.com>

[ROCm/composable_kernel commit: ab68c9d384]
2025-11-11 11:26:01 -08:00
Bartłomiej Kocot
1d9ec09cf2 Grouped Conv Bwd Data out index calculation optimizations (#2917)
* Grouped Conv Bwd Data index calculation optimizations

* fixes

* refactor instances

* gfx12 fixes

* temporary disable splitK for gfx12

[ROCm/composable_kernel commit: 5477811670]
2025-09-29 15:59:11 +02:00
Bartłomiej Kocot
b5b0797513 Grouped convolution forward with clamp (#2334)
* Grouped convolution forward with clamp

* Optimize clamp

* unary fixes

* test gk bias

* Revert "test gk bias"

This reverts commit 8e42e29d7b.

* Revert "Revert "test gk bias""

This reverts commit e73c0550ce.

* workaround comment

[ROCm/composable_kernel commit: f6c2ff9dce]
2025-06-16 15:36:53 +02:00
Bartłomiej Kocot
7942bb905b Integrate universal gemm with conv bwd data and add SplitK (#1315)
* Integrate universal gemm with conv bwd data

* Fix multi d kernel

* Add splitK support

* instances refactor

* instances refactor

* refactor

* fixeS

* fixes

* 16x16 instnaces

* Fixes

* Fix

* Fix

* Fix

* Fix

* Fix

* Fixes

* fix

* fix

[ROCm/composable_kernel commit: 4094ad158a]
2025-04-28 23:54:49 +02:00
Bartłomiej Kocot
49565538fe Add support for GKCYX grouped conv weight (#2023)
* Grouped conv bwd weight GKCYX support

* fix and changelog

* fix

* fix

* fixes

* comments

* fix

[ROCm/composable_kernel commit: 2ccf914888]
2025-04-02 23:59:49 +02:00
Bartłomiej Kocot
50fb390d6f Grouped conv backward data GKCYX support (#2029)
* Grouped conv backward data GKCYX support

* profiler

* Converter

* split instances

[ROCm/composable_kernel commit: 8c0ab61ece]
2025-04-01 13:24:38 -07:00
Bartłomiej Kocot
f967fd7296 Add support for GKCYX grouped conv fwd (#2015)
* Add support for GKCYX grouped conv fwd

* fixes

* fix

* changelog

* Fixes

[ROCm/composable_kernel commit: 54c81a1fcf]
2025-03-26 21:13:38 +01:00
Bartłomiej Kocot
36f9cc5fb0 Grouped conv bwd data NGCHW (#1967)
* Grouped conv bwd data NGCHW

* fixes

* fix

* Improvements

* Fix

* Fix

* add client example

[ROCm/composable_kernel commit: c2e4898b4b]
2025-03-17 13:32:00 +01:00
Bartłomiej Kocot
5f8fefae24 Enable grouped conv bwd wei bf16 NGCHW (#1589)
* Enable grouped conv bwd wei bf16 NGCHW

* fixes

* fixes

* Fixes

* fixes

* fixes

* Fixes

[ROCm/composable_kernel commit: 82fc53835a]
2024-10-22 16:18:28 +02:00
Bartłomiej Kocot
9d56280a0a Add support for NGCHW in grouped conv fwd (#1499)
* Support NGCHW in grouped conv fwd

* Remove not needed variable

* Fixes

[ROCm/composable_kernel commit: 4ba52b35dc]
2024-09-20 10:45:46 +02:00
Bartłomiej Kocot
950165c6fb Add support for NGCHW in grouped conv bwd wei (#1491)
* Add support for NGCHW in grouped conv bwd wei

* Comments fixes

* navi fixes

* Update function names

[ROCm/composable_kernel commit: 73b67f290f]
2024-09-03 10:52:03 +02:00
Bartłomiej Kocot
c7b7771a91 Convert MIOpen driver to ckProfiler script typos fix (#1476)
[ROCm/composable_kernel commit: dc82daa86e]
2024-08-20 19:04:14 +02:00
Bartłomiej Kocot
6e6277ca02 Add script to convert MIOpen driver to ckProfiler (#1472)
* Add script to convert MIOpen driver to ckProfiler

* Fix

[ROCm/composable_kernel commit: a6a7966505]
2024-08-19 08:24:56 -07:00