Commit Graph

10 Commits

Author SHA1 Message Date
logicat
fec81109f1 Remove unnecessary hip_fp16 include from stream_config (#3549) 2026-01-16 10:40:05 -08:00
Aviral Goel
f5ac3ee359 chore(copyright): update copyright header for include directory (#3224)
* chore(copyright): update copyright header for tile_engine directory

* chore(copyright): update copyright header for script directory

* chore(copyright): update copyright header for test_data directory

* chore(copyright): update copyright header for python directory

* chore(copyright): update copyright header for profiler directory

* chore(copyright): update copyright header for library directory

* chore(copyright): update copyright header for include directory
2025-11-18 10:17:18 -08:00
ltqin
f448d179b7 Universal gemm flush cache (#1251)
* add flush cache to device op

* add flush cache parameter to ckProfiler

* change the method for calculating the sizes of a and b

* change evaluation time method from AVERAGE to MEDIAN

* format code

* adjust some code

* fix core dumped

* remove loop call flush icache in kernel

* remove loop(outer) call flush icache

---------

Co-authored-by: letaoqin <letaoqin@amd.com>
2024-04-25 15:07:14 -05:00
Haocong WANG
bb63b9732c [GEMM] Optimization for MI200/300. (#1135)
* Optimize GEMM on MI200/300:
1. Add new blockwise gemm pipeline
2. Add irregular splitk instances

* clang format + typo fix

* Fix a bug
2024-01-19 07:02:22 -06:00
zjing14
ae5e5181aa recover default niter (#1064) 2023-11-28 12:18:42 -08:00
zjing14
e8cddfdc3b Improve 4k gemm perf (#1047)
* improve 4k gemm perf

* add f8 instances

* format

---------

Co-authored-by: Jing Zhang <jizha@amd.com>
2023-11-17 07:06:24 -06:00
Illia Silin
b94fd0b227 update copyright headers (#726) 2023-05-31 18:46:57 -05:00
Chao Liu
43c898f6ff disable print for group conv multiple D (#421) 2022-09-16 09:46:32 -05:00
Chao Liu
d3051d7517 add license in file (#303) 2022-06-24 23:32:43 -05:00
JD
cec69bc3bc Add host API (#220)
* Add host API

* manually rebase on develop

* clean

* manually rebase on develop

* exclude tests from all target

* address review comments

* update client app name

* fix missing lib name

* clang-format update

* refactor

* refactor

* refactor

* refactor

* refactor

* fix test issue

* refactor

* refactor

* refactor

* update cmake and readme

Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-05-12 09:21:01 -05:00