Haocong WANG
|
bb63b9732c
|
[GEMM] Optimization for MI200/300. (#1135)
* Optimize GEMM on MI200/300:
1. Add new blockwise gemm pipeline
2. Add irregular splitk instances
* clang format + typo fix
* Fix a bug
|
2024-01-19 07:02:22 -06:00 |
|
zjing14
|
ae5e5181aa
|
recover default niter (#1064)
|
2023-11-28 12:18:42 -08:00 |
|
zjing14
|
e8cddfdc3b
|
Improve 4k gemm perf (#1047)
* improve 4k gemm perf
* add f8 instances
* format
---------
Co-authored-by: Jing Zhang <jizha@amd.com>
|
2023-11-17 07:06:24 -06:00 |
|
Illia Silin
|
b94fd0b227
|
update copyright headers (#726)
|
2023-05-31 18:46:57 -05:00 |
|
Chao Liu
|
43c898f6ff
|
disable print for group conv multiple D (#421)
|
2022-09-16 09:46:32 -05:00 |
|
Chao Liu
|
d3051d7517
|
add license in file (#303)
|
2022-06-24 23:32:43 -05:00 |
|
JD
|
cec69bc3bc
|
Add host API (#220)
* Add host API
* manually rebase on develop
* clean
* manually rebase on develop
* exclude tests from all target
* address review comments
* update client app name
* fix missing lib name
* clang-format update
* refactor
* refactor
* refactor
* refactor
* refactor
* fix test issue
* refactor
* refactor
* refactor
* update cmake and readme
Co-authored-by: Chao Liu <chao.liu2@amd.com>
|
2022-05-12 09:21:01 -05:00 |
|