Aviral Goel
0cfa802f89
chore(copyright): update copyright header for include directory (#3224)
...
* chore(copyright): update copyright header for tile_engine directory
* chore(copyright): update copyright header for script directory
* chore(copyright): update copyright header for test_data directory
* chore(copyright): update copyright header for python directory
* chore(copyright): update copyright header for profiler directory
* chore(copyright): update copyright header for library directory
* chore(copyright): update copyright header for include directory
[ROCm/composable_kernel commit: f5ac3ee359]
2025-11-18 10:17:18 -08:00
Max Podkorytov
a1681b077e
[CK][host] limit the rotating count to prevent oom (#3089)
...
* [CK][host] limit the rotating count to prevent oom
* add numeric header for accumulate
[ROCm/composable_kernel commit: f39626fcf7]
2025-10-24 08:55:54 -07:00
Enrico Degregori
12225ce645
Wmma support for multiple ABD GEMM (#2803)
...
* multi_abd wmma support:
- Add multiple A and B support to multiple D implementation (gridwise level)
- Add multi_abd GEMM (device level)
- Add instances (xdl parity)
- Add tests (both xdl and wmma)
- Add examples
- Add ckProfiler support (both xdl and wmma)
* Fix bug in device print function
* Fix unused template parameter
* Fix batched gemm for multiABD gridwise implementation
* Fix gemm_universal_reduce with multiABDs gridwise implementation
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
[ROCm/composable_kernel commit: 3d29bff2f0]
2025-09-22 18:49:06 -07:00
Illia Silin
ada1b5f341
Split env.hpp header from the ck.hpp header. (#2049)
...
* split env.hpp out of main headers
* fix namespace logic
[ROCm/composable_kernel commit: 572cd820ce]
2025-04-03 15:30:21 -07:00
valarLip
59e7fe3ac8
add int8 gemm multiply multiply a8w8 (#1591)
...
* add int8 gemm multiply multiply a8w8
* uncomment
* clang-format-12
* Add example_gemm_multiply_multiply_xdl_int8
* Remove shell scripts
* update preprocess number for mi308; bring back printout in ckprofiler
* format
---------
Co-authored-by: chenjun <junchen2@amd.com>
Co-authored-by: Haocong WANG <haocwang@amd.com>
Co-authored-by: carlushuang <carlus.huang@amd.com>
[ROCm/composable_kernel commit: 37f7afed1e]
2024-10-26 16:39:34 +08:00
zjing14
87e7be2845
Add rotating buff for gemm_multi_d (#1411)
...
* add rotating_buff for gemm_multi_d
* format
* Update flush_cache.hpp
* Update gtest.cmake
---------
Co-authored-by: Jing Zhang <jizhan@fb.com>
Co-authored-by: Haocong WANG <haocwang@amd.com>
[ROCm/composable_kernel commit: 105bd708c7]
2024-07-25 23:21:21 +08:00
Bartłomiej Kocot
b4b436d29a
Optimize grouped conv bwd weight for small M and N (#1303)
...
* Optimize grouped conv bwd weight for small M and N
* Fixes
[ROCm/composable_kernel commit: fd72380aeb]
2024-05-22 21:01:01 +02:00
Illia Silin
0003dce849
replace the ENV macro with CK_ENV (#1296)
...
[ROCm/composable_kernel commit: 1274861a9d]
2024-05-17 10:42:51 -07:00
Illia Silin
ffe52d2d30
fix the output formatting (#1282)
...
[ROCm/composable_kernel commit: fdbf8ccbd7]
2024-05-08 16:11:54 -07:00
Illia Silin
e88d576926
Enable logging in CK with environment variable. (#1278)
...
* enable logging using environment variable
* update ck.hpp header
* fix typo
* fix clang format
* Update include/ck/utility/env.hpp
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
---------
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
[ROCm/composable_kernel commit: bf42097646]
2024-05-07 16:26:43 -07:00
Illia Silin
ba9ffb86c7
add missing vector header (#1275)
...
[ROCm/composable_kernel commit: 08d51d9bc4]
2024-05-02 11:27:59 -07:00
ltqin
b4f3b8e693
Universal gemm flush cache (#1251)
...
* add flush cache to device op
* add flush cache parameter to ckProfiler
* change the method for calculating the sizes of a and b
* change evaluation time method from AVERAGE to MEDIAN
* format code
* adjust some code
* fix core dumped
* remove loop call flush icache in kernel
* remove loop(outer) call flush icache
---------
Co-authored-by: letaoqin <letaoqin@amd.com>
[ROCm/composable_kernel commit: f448d179b7]
2024-04-25 15:07:14 -05:00