Commit Graph

19 Commits

Author SHA1 Message Date
Yanxing-Shi
926bd2b985 fix conflict 2025-05-28 09:08:42 +00:00
Khushbu Agarwal
99857e10e6 [CK_tile] Add rotating buffer feature for universal gemm (#2200)
* Add rotating buffer feature for universal gemm

* adding changes in tile_engine

* Updated code to merge kernel_launch

* removing comments

* Enable rotating buffer changes to flatmm

* Created diff launch_kernel function for rotating buffer

* Simplified calculation using macros

* merge code with new changes in tile_engine

* clang formatted

* Redefine macros
2025-05-27 23:00:58 -07:00
Yanxing-Shi
b88be7fff3 merge upstream 2025-05-27 09:31:20 +00:00
Casey-Shi
128f5a1eab [Tile Engine] Add benchmark for tile engine gemm. (#2193)
* initial commit -m benchmark

* only support profile

* fix

* fix doc

* add default config

* add ci

* fix cmake

* tmp save for gen blobs

* fix bug

* merge

* range config

* test success

* fix

* fix

* move struct

* remove config property

* fix config

* remove comment

* add cmake option & modify

* add changelog

* fix

* format

* add pydantic module to the docker image

* fix

* add benchmark for cold and warm up

* python format

* add asm cache control

* fix README

* remove pydantic module

* modify changelog

* fix config

* recover benchmark_gemm and fix

* format python

* refactor profiler

* fix csv bug

* fix codegen bug

* add kernel instance object

* add benchmark gemm executable

* fix jenkins & delete extra header

* disable warning output & enable default config

* Disable sparsity for invalid warp tile combinations

* fix gemm host template func

* refactor gemm profiler

* filter out some instances

* default config test & fix codegen bug

* add sparse flag to gen more instances

---------

Co-authored-by: illsilin <Illia.Silin@amd.com>
Co-authored-by: khuagarw <khuagarw@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-05-26 22:32:36 -07:00
Yanxing-Shi
7549e2b2e6 fix readme 2025-05-26 06:45:36 +00:00
Yanxing-Shi
3506722e6a add benchmark gemm executable 2025-05-20 15:41:19 +00:00
Yanxing-Shi
9897410acf refactor profiler 2025-05-19 10:42:57 +00:00
Yanxing-Shi
012c77125a recover benchmark_gemm and fix 2025-05-16 10:37:59 +00:00
Yanxing-Shi
fc092038f7 fix README 2025-05-15 12:37:00 +00:00
Yanxing-Shi
3140659357 fix 2025-05-13 14:18:16 +00:00
Yanxing-Shi
a8a19be1b0 merge 2025-05-13 07:39:51 +00:00
Yanxing-Shi
2d3dc763f8 merge 2025-05-13 06:27:16 +00:00
Yanxing-Shi
267eb410cc tmp save for gen blobs 2025-05-12 07:06:15 +00:00
Yanxing-Shi
1ccecf9a11 add default config 2025-05-07 10:59:36 +00:00
Yanxing-Shi
bc72ec4cfb fix doc 2025-05-06 08:29:11 +00:00
Yanxing-Shi
d3d32843b5 only support profile 2025-05-01 11:05:27 +00:00
Aviral Goel
1aea51d34e [Tile Engine] Improved README.md (#2134)
* improved tile_engine readme

* changed ck tile explanation and json

* further improved readme

* fixed typo
2025-04-29 17:37:07 -07:00
Khushbu Agarwal
768c99eca9 [TileEngine] Support for sparsity in codegen (#2128)
* Added sparsity flag in codegen

* remove comments

* clang formatted

* added sparsity as runtime argument

* updated README

* updated stream config variable

* fix typo for tail_num in hot loop
2025-04-28 18:19:23 -07:00
Khushbu Agarwal
7cadf187e2 multi instance generation for CkTileEngine (#2080)
* Add support for multi-instance verification, print detail for each instance, documentation fix

* clang formatted

* Added Readme file

* updated readme

* Addressing review comments

* clang formatted

* Updated ReadMe and GPU reference code

* simplified dispatch kernel code

* indentation
2025-04-21 08:39:45 -07:00