coderfeli
c8d9660f3b
using develop branch timer
2024-12-27 06:47:36 +00:00
aska-0096
55cb3bdee5
clean the flush_cache api
2024-11-05 10:10:11 +00:00
aska-0096
f20e48f1f4
Merge branch 'develop' of https://github.com/ROCm/composable_kernel into update_cka8w8
2024-11-05 07:03:42 +00:00
aska-0096
b97c68764e
update ck_a8w8 library, update flush cache timing api
2024-11-05 06:57:48 +00:00
aska-0096
b3e5048f12
tempsave
2024-10-30 07:38:59 +00:00
valarLip
37f7afed1e
add int8 gemm multiply multiply a8w8 (#1591)
* add int8 gemm multiply multiply a8w8
* uncomment
* clang-format-12
* Add example_gemm_multiply_multiply_xdl_int8
* Remove shell scripts
* update preprocess number for mi308; bring back printout in ckprofiler
* format
---------
Co-authored-by: chenjun <junchen2@amd.com>
Co-authored-by: Haocong WANG <haocwang@amd.com>
Co-authored-by: carlushuang <carlus.huang@amd.com>
2024-10-26 16:39:34 +08:00
aska-0096
e8c19535f7
update preprocess number for mi308; bring back printout in ckprofiler
2024-10-25 04:29:34 +00:00
chenjun
1670bba95f
clang-format-12
2024-10-21 23:16:04 +08:00
chenjun
7fb0b3223c
add int8 gemm multiply multiply a8w8
2024-10-21 21:57:41 +08:00
zjing14
105bd708c7
Add rotating buff for gemm_multi_d (#1411)
* add rotating_buff for gemm_multi_d
* format
* Update flush_cache.hpp
* Update gtest.cmake
---------
Co-authored-by: Jing Zhang <jizhan@fb.com>
Co-authored-by: Haocong WANG <haocwang@amd.com>
2024-07-25 23:21:21 +08:00
Bartłomiej Kocot
fd72380aeb
Optimize grouped conv bwd weight for small M and N (#1303)
* Optimize grouped conv bwd weight for small M and N
* Fixes
2024-05-22 21:01:01 +02:00
Illia Silin
1274861a9d
replace the ENV macro with CK_ENV (#1296)
2024-05-17 10:42:51 -07:00
Illia Silin
fdbf8ccbd7
fix the output formatting (#1282)
2024-05-08 16:11:54 -07:00
Illia Silin
bf42097646
Enable logging in CK with environment variable. (#1278)
* enable logging using environment variable
* update ck.hpp header
* fix typo
* fix clang format
* Update include/ck/utility/env.hpp
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
---------
Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
2024-05-07 16:26:43 -07:00
Illia Silin
08d51d9bc4
add missing vector header (#1275)
2024-05-02 11:27:59 -07:00
ltqin
f448d179b7
Universal gemm flush cache (#1251)
* add flush cache to device op
* add flush cache parameter to ckProfiler
* change calculate size a and b method
* change evaluation time method from AVERAGE to MEDIAN
* format code
* adjust some code
* fix core dumped
* remove loop call flush icache in kernel
* remove loop(outer) call flush icache
---------
Co-authored-by: letaoqin <letaoqin@amd.com>
2024-04-25 15:07:14 -05:00