Yanxing-Shi
926bd2b985
fix conflict
2025-05-28 09:08:42 +00:00
Khushbu Agarwal
99857e10e6
[CK_tile] Add rotating buffer feature for universal gemm (#2200)
* Add rotating buffer feature for universal gemm
* adding changes in tile_engine
* Updated code to merge kernel_launch
* removing comments
* Enable rotating buffer changes to flatmm
* Created diff launch_kernel function for rotating buffer
* Simplified calculation using macros
* merge code with new changes in tile_engine
* clang formatted
* Redefine macros
2025-05-27 23:00:58 -07:00
Yanxing-Shi
dd0248aedb
fix cmake for sqlite3
2025-05-27 10:24:52 +00:00
Yanxing-Shi
cd89d1746d
add header
2025-05-27 09:50:23 +00:00
Yanxing-Shi
b88be7fff3
merge upstream
2025-05-27 09:31:20 +00:00
Casey-Shi
128f5a1eab
[Tile Engine] Add benchmark for tile engine gemm. (#2193)
* initial commit -m benchmark
* only support profile
* fix
* fix doc
* add default config
* add ci
* fix cmake
* tmp save for gen blobs
* fix bug
* merge
* range config
* test success
* fix
* fix
* move struct
* remove config property
* fix config
* remove comment
* add cmake option & modify
* add changelog
* fix
* format
* add pydantic module to the docker image
* fix
* add benchmark for cold and warm-up
* python format
* add asm cache control
* fix README
* remove pydantic module
* modify changelog
* fix config
* recover benchmark_gemm and fix
* format python
* refactor profiler
* fix csv bug
* fix codegen bug
* add kernel instance object
* add benchmark gemm executable
* fix jenkins & delete extra header
* disable warning output & enable default config
* Disable sparsity for invalid warp tile combinations
* fix gemm host template func
* refactor gemm profiler
* filter out some instances
* default config test & fix codegen bug
* add sparse flag to gen more instances
---------
Co-authored-by: illsilin <Illia.Silin@amd.com>
Co-authored-by: khuagarw <khuagarw@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-05-26 22:32:36 -07:00
Yanxing-Shi
7549e2b2e6
fix readme
2025-05-26 06:45:36 +00:00
Yanxing-Shi
ec1e45609b
merge support_engine_benchmark branch
2025-05-26 06:30:15 +00:00
Yanxing-Shi
00d10075d6
add sparse flag to gen more instances
2025-05-26 06:06:16 +00:00
Yanxing-Shi
9510d3df1f
default config test & fix codegen bug
2025-05-26 04:33:44 +00:00
khuagarw
2ce19b36ef
filter out some instances
2025-05-22 21:26:23 +00:00
Yanxing-Shi
43ac895597
fix query for insert
2025-05-22 16:28:19 +00:00
Yanxing-Shi
ecf403a430
initial commit, but result=0 bug
2025-05-22 10:11:05 +00:00
Yanxing-Shi
40cd09a93d
refactor gemm profiler
2025-05-22 07:58:10 +00:00
Yanxing-Shi
365e80638a
fix gemm host template func
2025-05-22 07:14:37 +00:00
khuagarw
baf923da13
Disable sparsity for invalid warp tile combinations
2025-05-21 22:11:28 +00:00
Yanxing-Shi
bb66c2af3e
disable warning output & enable default config
2025-05-21 09:47:57 +00:00
Yanxing-Shi
4dcbc7e3d8
fix jenkins & delete extra header
2025-05-20 16:08:30 +00:00
Yanxing-Shi
3506722e6a
add benchmark gemm executable
2025-05-20 15:41:19 +00:00
Yanxing-Shi
ee6b7f9246
add kernel instance object
2025-05-20 08:57:48 +00:00
Yanxing-Shi
5b83f76eb0
fix codegen bug
2025-05-19 14:03:16 +00:00
Yanxing-Shi
b3caa67694
fix csv bug
2025-05-19 11:44:26 +00:00
Yanxing-Shi
9897410acf
refactor profiler
2025-05-19 10:42:57 +00:00
Yanxing-Shi
c821b1253a
format python
2025-05-16 10:41:30 +00:00
Yanxing-Shi
012c77125a
recover benchmark_gemm and fix
2025-05-16 10:37:59 +00:00
Khushbu Agarwal
3d8d6e75e4
Adding validation for tile sizes in Tile Engine (#2189)
* Adding validation for tile sizes
* Add architecture in config, and shuffle lines of code in warp_gemm.hpp
* Enable MFMA for gfx950, and invalid tile handling
2025-05-15 10:28:31 -07:00
Yanxing-Shi
68a4aff0b1
fix config
2025-05-15 14:20:17 +00:00
Yanxing-Shi
d4107f55cf
remove pydantic module
2025-05-15 13:54:26 +00:00
Yanxing-Shi
fc092038f7
fix README
2025-05-15 12:37:00 +00:00
Yanxing-Shi
ccf18b90e6
add asm cache control
2025-05-15 12:20:28 +00:00
Yanxing-Shi
047f6e4480
python format
2025-05-15 11:16:13 +00:00
Yanxing-Shi
62d2a63f43
add benchmark for cold and warm-up
2025-05-15 11:11:18 +00:00
Yanxing-Shi
cfbbae9bd6
fix
2025-05-15 06:15:38 +00:00
Yanxing-Shi
53c4429f37
format
2025-05-14 15:46:50 +00:00
Yanxing-Shi
4bbe7eca09
add cmake option & modify
2025-05-14 09:17:37 +00:00
Yanxing-Shi
58ab4eb617
remove comment
2025-05-13 16:22:22 +00:00
Yanxing-Shi
6086e3641d
fix config
2025-05-13 15:56:37 +00:00
Yanxing-Shi
b4053e1ed3
remove config property
2025-05-13 15:35:42 +00:00
Yanxing-Shi
e5a7abd11b
move struct
2025-05-13 15:25:08 +00:00
Yanxing-Shi
3140659357
fix
2025-05-13 14:18:16 +00:00
Yanxing-Shi
0c3dc06e8c
test success
2025-05-13 13:14:44 +00:00
Yanxing-Shi
6c82b60de6
range config
2025-05-13 09:20:55 +00:00
Yanxing-Shi
a8a19be1b0
merge
2025-05-13 07:39:51 +00:00
Yanxing-Shi
2d3dc763f8
merge
2025-05-13 06:27:16 +00:00
Yanxing-Shi
54d3d9468d
fix bug
2025-05-13 05:57:41 +00:00
Khushbu Agarwal
f05e45ba59
Disable SMFMA gfx90a (#2184)
* sparsity fix for gfx90a
* reverting tile_engine changes
2025-05-12 09:56:23 -07:00
Yanxing-Shi
267eb410cc
tmp save for gen blobs
2025-05-12 07:06:15 +00:00
Khushbu Agarwal
ef72a4b9bc
Disable SMFMA for gfx90a (#2182)
2025-05-09 00:18:07 -07:00
Thomas Ning
c757046d49
Revert "Disable the SMFMA instruction for gfx90a. (#2174)" (#2175)
This reverts commit a32d907771.
2025-05-08 00:07:03 -07:00
Khushbu Agarwal
a32d907771
Disable the SMFMA instruction for gfx90a. (#2174)
* remove smfma for gfx90a
* clang formatted
2025-05-07 23:09:22 -07:00