ltqin
b4f3b8e693
Universal gemm flush cache (#1251)
...
* add flush cache to device op
* add flush cache parameter to ckProfiler
* change calculate size a and b method
* change evaluation time method from AVERAGE to MEDIAN
* format code
* adjust some code
* fix core dumped
* remove loop call flush icache in kernel
* remove loop(outer) call flush icache
---------
Co-authored-by: letaoqin <letaoqin@amd.com>
[ROCm/composable_kernel commit: f448d179b7]
2024-04-25 15:07:14 -05:00
Haocong WANG
ec7e5b1331
[GEMM] Optimization for MI200/300. (#1135)
...
* Optimize GEMM on MI200/300:
1. Add new blockwise gemm pipeline
2. Add irregular splitk instances
* clang format + typo fix
* Fix a bug
[ROCm/composable_kernel commit: bb63b9732c]
2024-01-19 07:02:22 -06:00
zjing14
04675fdf63
recover default niter (#1064)
...
[ROCm/composable_kernel commit: ae5e5181aa]
2023-11-28 12:18:42 -08:00
zjing14
b88a739b88
Improve 4k gemm perf (#1047)
...
* improve 4k gemm perf
* add f8 instances
* format
---------
Co-authored-by: Jing Zhang <jizha@amd.com>
[ROCm/composable_kernel commit: e8cddfdc3b]
2023-11-17 07:06:24 -06:00
Illia Silin
b57fbee2f1
update copyright headers (#726)
...
[ROCm/composable_kernel commit: b94fd0b227]
2023-05-31 18:46:57 -05:00
Chao Liu
c1cfd4c894
disable print for group conv multiple D (#421)
...
[ROCm/composable_kernel commit: 43c898f6ff]
2022-09-16 09:46:32 -05:00
Chao Liu
31706d4896
add license in file (#303)
...
[ROCm/composable_kernel commit: d3051d7517]
2022-06-24 23:32:43 -05:00
JD
69d5f78b16
Add host API (#220)
...
* Add host API
* manually rebase on develop
* clean
* manually rebase on develop
* exclude tests from all target
* address review comments
* update client app name
* fix missing lib name
* clang-format update
* refactor
* refactor
* refactor
* refactor
* refactor
* fix test issue
* refactor
* refactor
* refactor
* update cmake and readme
Co-authored-by: Chao Liu <chao.liu2@amd.com>
[ROCm/composable_kernel commit: cec69bc3bc]
2022-05-12 09:21:01 -05:00