Aviral Goel
2ff7ac5abc
CK: Remove 41 commented-out dead code blocks (~200 lines) ( #6302 )
...
Depends on #6300
## Summary
Remove 41 commented-out code blocks across 33 files in Composable
Kernel, totaling ~200 lines.
The blocks were identified using an automated dead code scanning skill (`ck-dead-code`)
with a calibrated two-stage pipeline:
1. **Pre-filter**: A keyword-based scan found 1,338 `//`-commented blocks.
Calibrated heuristics (trained on a 50-sample expert classification)
reduced these to 89 high-confidence candidates, a 93% noise reduction.
2. **Expert triage**: An LLM expert classified each block in context as
CODE_REMOVE, CODE_KEEP, or NOT_CODE.
| Classification | Count |
|---------------|-------|
| Removed (this PR) | 41 |
| Kept (debug helpers, alt configs, reference impls) | 32 |
| Not code (false positives) | 16 |
Removed blocks include: superseded implementations, old test data,
abandoned stubs, unreachable code, and buggy dead code.
2026-04-10 11:17:11 -04:00
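The pre-filter stage described in #6302 can be pictured with a minimal sketch. This is not the `ck-dead-code` implementation, which the commit message does not show. The heuristic here (a line that ends in `;`, `{`, or `}` counts as code-like), the 0.5 threshold, and all function names are assumptions chosen for illustration only. The last helper simply re-derives the 93% figure from the counts given in the summary.

```python
import re

# Hypothetical heuristic: a commented line "looks like code" when it ends
# with ';', '{' or '}'. The real calibrated heuristics are not published
# in the commit message.
CODE_HINT = re.compile(r"[;{}]\s*$")


def find_commented_blocks(source):
    """Group consecutive `//` lines into (start_line, [texts]) blocks."""
    blocks, current, start = [], [], 0
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("//"):
            if not current:
                start = number
            current.append(stripped[2:].strip())
        elif current:
            blocks.append((start, current))
            current = []
    if current:
        blocks.append((start, current))
    return blocks


def looks_like_code(lines, threshold=0.5):
    """Flag a block when at least `threshold` of its lines are code-like."""
    if not lines:
        return False
    hits = sum(1 for text in lines if CODE_HINT.search(text))
    return hits / len(lines) >= threshold


def prefilter(source):
    """Return only the commented blocks that pass the code-likeness check."""
    return [block for block in find_commented_blocks(source)
            if looks_like_code(block[1])]


def noise_reduction(total, kept):
    """Fraction of scanned blocks discarded by the pre-filter."""
    return 1.0 - kept / total
```

With the numbers from the summary, `noise_reduction(1338, 89)` is roughly 0.933, and the three classification counts (41, 32, 16) add up to the 89 candidates. A second stage would then review each surviving block in context, as the PR describes.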
Bartłomiej Kocot
1972d39410
[CK][CK Tile] Improvements for grouped conv fwd tile profiling ( #5114 )
...
## Motivation
Improve profiling for grouped convolution forward for better comparison
between CK and CK Tile
## Technical Details
- Include preprocessing time for ck tile
- Add flush cache for conv fwd profiler
- Switch configs to builder reflect
- Add KPerXdl deduce
- Add non-grouped ported instances
## Test Plan
test_grouped_convnd_fwd_tile
## Test Result
pass
## Submission Checklist
- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests .
AICK-786
2026-03-11 23:38:15 +01:00
Aviral Goel
0cfa802f89
chore(copyright): update copyright header for include directory ( #3224 )
...
* chore(copyright): update copyright header for tile_engine directory
* chore(copyright): update copyright header for script directory
* chore(copyright): update copyright header for test_data directory
* chore(copyright): update copyright header for python directory
* chore(copyright): update copyright header for profiler directory
* chore(copyright): update copyright header for library directory
* chore(copyright): update copyright header for include directory
[ROCm/composable_kernel commit: f5ac3ee359 ]
2025-11-18 10:17:18 -08:00
Max Podkorytov
a1681b077e
[CK][host] limit the rotating count to prevent oom ( #3089 )
...
* [CK][host] limit the rotating count to prevent oom
* add numeric header for accumulate
[ROCm/composable_kernel commit: f39626fcf7 ]
2025-10-24 08:55:54 -07:00
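The idea behind #3089 (capping the rotating count so the rotating buffers do not exhaust memory) can be sketched as follows. This is an illustrative model only, not the CK host code: the function names, the memory budget parameter, and the policy of always keeping at least one copy are assumptions. Summing the buffer sizes mirrors the mention of `accumulate` in the commit message.

```python
def clamp_rotating_count(requested, buffer_sizes_bytes, memory_budget_bytes):
    """Limit the number of rotating copies so they fit in the budget.

    Assumption: one "copy" means one full set of input buffers, and at
    least one copy is always kept so the benchmark can still run.
    """
    per_copy = sum(buffer_sizes_bytes)  # total bytes needed for one copy
    if per_copy <= 0:
        return max(1, requested)
    max_copies = max(1, memory_budget_bytes // per_copy)
    return max(1, min(requested, max_copies))


def rotating_indices(iterations, count):
    """Which buffer copy each timed iteration would use (round robin)."""
    return [i % count for i in range(iterations)]
```

For example, a request for 8 copies of two 256-byte buffers under a 2048-byte budget would be reduced to 4 copies, and the timed iterations would then cycle through copies 0 to 3.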
Enrico Degregori
12225ce645
Wmma support for multiple ABD GEMM ( #2803 )
...
* multi_abd wmma support:
- Add multiple A and B support to multiple D implementation (gridwise level)
- Add multi_abd GEMM (device level)
- Add instances (xdl parity)
- Add tests (both xdl and wmma)
- Add examples
- Add ckProfiler support (both xdl and wmma)
* Fix bug in device print function
* Fix unused template parameter
* Fix batched gemm for multiABD gridwise implementation
* Fix gemm_universal_reduce with multiABDs gridwise implementation
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com >
[ROCm/composable_kernel commit: 3d29bff2f0 ]
2025-09-22 18:49:06 -07:00
Illia Silin
ada1b5f341
Split env.hpp header from the ck.hpp header. ( #2049 )
...
* split env.hpp out of main headers
* fix namespace logic
[ROCm/composable_kernel commit: 572cd820ce ]
2025-04-03 15:30:21 -07:00
valarLip
59e7fe3ac8
add int8 gemm multiply multiply a8w8 ( #1591 )
...
* add int8 gemm multiply multiply a8w8
* uncomment
* clang-format-12
* Add example_gemm_multiply_multiply_xdl_int8
* Remove shell scripts
* update preprocess number for mi308; bring back printout in ckprofiler
* format
---------
Co-authored-by: chenjun <junchen2@amd.com >
Co-authored-by: Haocong WANG <haocwang@amd.com >
Co-authored-by: carlushuang <carlus.huang@amd.com >
[ROCm/composable_kernel commit: 37f7afed1e ]
2024-10-26 16:39:34 +08:00
zjing14
87e7be2845
Add rotating buff for gemm_multi_d ( #1411 )
...
* add rotating_buff for gemm_multi_d
* format
* Update flush_cache.hpp
* Update gtest.cmake
---------
Co-authored-by: Jing Zhang <jizhan@fb.com >
Co-authored-by: Haocong WANG <haocwang@amd.com >
[ROCm/composable_kernel commit: 105bd708c7 ]
2024-07-25 23:21:21 +08:00
Bartłomiej Kocot
b4b436d29a
Optimize grouped conv bwd weight for small M and N ( #1303 )
...
* Optimize grouped conv bwd weight for small M and N
* Fixes
[ROCm/composable_kernel commit: fd72380aeb ]
2024-05-22 21:01:01 +02:00
Illia Silin
0003dce849
replace the ENV macro with CK_ENV ( #1296 )
...
[ROCm/composable_kernel commit: 1274861a9d ]
2024-05-17 10:42:51 -07:00
Illia Silin
ffe52d2d30
fix the output formatting ( #1282 )
...
[ROCm/composable_kernel commit: fdbf8ccbd7 ]
2024-05-08 16:11:54 -07:00
Illia Silin
e88d576926
Enable logging in CK with environment variable. ( #1278 )
...
* enable logging using environment variable
* update ck.hpp header
* fix typo
* fix clang format
* Update include/ck/utility/env.hpp
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com >
---------
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com >
[ROCm/composable_kernel commit: bf42097646 ]
2024-05-07 16:26:43 -07:00
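Environment-variable-gated logging, as introduced in #1278, follows a common pattern that can be sketched in a few lines. The variable name `EXAMPLE_LOGGING`, the accepted truthy values, and the function names below are hypothetical; the commit message does not state the actual variable or macro names used in `env.hpp`.

```python
import os

# Assumed set of values that switch logging on; the real parsing rules
# are not described in the commit message.
TRUTHY = {"1", "true", "on", "yes"}


def env_flag_enabled(name, environ=None):
    """Return True when the named environment variable holds a truthy value."""
    environ = os.environ if environ is None else environ
    return environ.get(name, "").strip().lower() in TRUTHY


def log(message, name="EXAMPLE_LOGGING", environ=None, sink=print):
    """Emit `message` through `sink` only when the flag is enabled."""
    if env_flag_enabled(name, environ):
        sink(message)
        return True
    return False
```

Passing `environ` and `sink` explicitly keeps the sketch testable without touching the real process environment or standard output.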
Illia Silin
ba9ffb86c7
add missing vector header ( #1275 )
...
[ROCm/composable_kernel commit: 08d51d9bc4 ]
2024-05-02 11:27:59 -07:00
ltqin
b4f3b8e693
Universal gemm flush cache ( #1251 )
...
* add flush cache to device op
* add flush cache parameter to ckProfiler
* change the method for calculating the sizes of a and b
* change evaluation time method from AVERAGE to MEDIAN
* format code
* adjust some code
* fix core dump
* remove loop call flush icache in kernel
* remove loop(outer) call flush icache
---------
Co-authored-by: letaoqin <letaoqin@amd.com >
[ROCm/composable_kernel commit: f448d179b7 ]
2024-04-25 15:07:14 -05:00
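The switch from AVERAGE to MEDIAN timing in #1251 is easier to appreciate with a small numeric illustration. The sample values below are made up, and the helper name is hypothetical. The sketch only shows the general property that a single slow iteration shifts the mean far more than the median.

```python
import statistics


def summarize_timings(samples_ms):
    """Return both the mean and the median of per-iteration timings."""
    return {
        "mean": statistics.fmean(samples_ms),
        "median": statistics.median(samples_ms),
    }


# Four steady iterations and one outlier (for instance a cold-cache run).
summary = summarize_timings([1.0, 1.0, 1.0, 1.0, 11.0])
```

Here the mean is 3.0 ms while the median stays at 1.0 ms, which is presumably why a median is the more robust summary when occasional iterations are disturbed.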