Yanxing-Shi
|
ec1e45609b
|
merge support_engine_benchmark branch
|
2025-05-26 06:30:15 +00:00 |
|
Yanxing-Shi
|
00d10075d6
|
add sparse flag to gen more instances
|
2025-05-26 06:06:16 +00:00 |
|
Yanxing-Shi
|
9510d3df1f
|
default config test & fix codegen bug
|
2025-05-26 04:33:44 +00:00 |
|
khuagarw
|
2ce19b36ef
|
filter out some instances
|
2025-05-22 21:26:23 +00:00 |
|
Yanxing-Shi
|
43ac895597
|
fix query for insert
|
2025-05-22 16:28:19 +00:00 |
|
Yanxing-Shi
|
ecf403a430
|
initial commit, but result=0 bug
|
2025-05-22 10:11:05 +00:00 |
|
Yanxing-Shi
|
40cd09a93d
|
refactor gemm profiler
|
2025-05-22 07:58:10 +00:00 |
|
Yanxing-Shi
|
365e80638a
|
fix gemm host template func
|
2025-05-22 07:14:37 +00:00 |
|
khuagarw
|
baf923da13
|
Disable sparsity for invalid warp tile combinations
|
2025-05-21 22:11:28 +00:00 |
|
Yanxing-Shi
|
bb66c2af3e
|
disable warning output & enable default config
|
2025-05-21 09:47:57 +00:00 |
|
Yanxing-Shi
|
4dcbc7e3d8
|
fix jenkins & delete extra header
|
2025-05-20 16:08:30 +00:00 |
|
Yanxing-Shi
|
3506722e6a
|
add benchmark gemm executable
|
2025-05-20 15:41:19 +00:00 |
|
Yanxing-Shi
|
ee6b7f9246
|
add kernel instance object
|
2025-05-20 08:57:48 +00:00 |
|
Yanxing-Shi
|
5b83f76eb0
|
fix codegen bug
|
2025-05-19 14:03:16 +00:00 |
|
Yanxing-Shi
|
b3caa67694
|
fix csv bug
|
2025-05-19 11:44:26 +00:00 |
|
Yanxing-Shi
|
9897410acf
|
refactor profiler
|
2025-05-19 10:42:57 +00:00 |
|
Yanxing-Shi
|
c821b1253a
|
format python
|
2025-05-16 10:41:30 +00:00 |
|
Yanxing-Shi
|
012c77125a
|
recover benchmark_gemm and fix
|
2025-05-16 10:37:59 +00:00 |
|
Yanxing-Shi
|
68a4aff0b1
|
fix config
|
2025-05-15 14:20:17 +00:00 |
|
Yanxing-Shi
|
d4107f55cf
|
remove pydantic module
|
2025-05-15 13:54:26 +00:00 |
|
Yanxing-Shi
|
fc092038f7
|
fix README
|
2025-05-15 12:37:00 +00:00 |
|
Yanxing-Shi
|
ccf18b90e6
|
add asm cache control
|
2025-05-15 12:20:28 +00:00 |
|
Yanxing-Shi
|
047f6e4480
|
python format
|
2025-05-15 11:16:13 +00:00 |
|
Yanxing-Shi
|
62d2a63f43
|
add benchmark for cold and warm up
|
2025-05-15 11:11:18 +00:00 |
|
Yanxing-Shi
|
cfbbae9bd6
|
fix
|
2025-05-15 06:15:38 +00:00 |
|
Yanxing-Shi
|
53c4429f37
|
format
|
2025-05-14 15:46:50 +00:00 |
|
Yanxing-Shi
|
4bbe7eca09
|
add cmake option & modify
|
2025-05-14 09:17:37 +00:00 |
|
Yanxing-Shi
|
58ab4eb617
|
remove comment
|
2025-05-13 16:22:22 +00:00 |
|
Yanxing-Shi
|
6086e3641d
|
fix config
|
2025-05-13 15:56:37 +00:00 |
|
Yanxing-Shi
|
b4053e1ed3
|
remove config property
|
2025-05-13 15:35:42 +00:00 |
|
Yanxing-Shi
|
e5a7abd11b
|
move struct
|
2025-05-13 15:25:08 +00:00 |
|
Yanxing-Shi
|
3140659357
|
fix
|
2025-05-13 14:18:16 +00:00 |
|
Yanxing-Shi
|
0c3dc06e8c
|
test success
|
2025-05-13 13:14:44 +00:00 |
|
Yanxing-Shi
|
6c82b60de6
|
range config
|
2025-05-13 09:20:55 +00:00 |
|
Yanxing-Shi
|
a8a19be1b0
|
merge
|
2025-05-13 07:39:51 +00:00 |
|
Yanxing-Shi
|
2d3dc763f8
|
merge
|
2025-05-13 06:27:16 +00:00 |
|
Yanxing-Shi
|
54d3d9468d
|
fix bug
|
2025-05-13 05:57:41 +00:00 |
|
Khushbu Agarwal
|
f05e45ba59
|
Disable SMFMA gfx90a (#2184)
* sparsity fix for gfx90a
* reverting tile_engine changes
|
2025-05-12 09:56:23 -07:00 |
|
Yanxing-Shi
|
267eb410cc
|
tmp save for gen blobs
|
2025-05-12 07:06:15 +00:00 |
|
Khushbu Agarwal
|
ef72a4b9bc
|
Disable SMFMA for gfx90a (#2182)
|
2025-05-09 00:18:07 -07:00 |
|
Thomas Ning
|
c757046d49
|
Revert "Disable the SMFMA instruction for gfx90a. (#2174)" (#2175)
This reverts commit a32d907771.
|
2025-05-08 00:07:03 -07:00 |
|
Khushbu Agarwal
|
a32d907771
|
Disable the SMFMA instruction for gfx90a. (#2174)
* remove smfma for gfx90a
* clang formatted
|
2025-05-07 23:09:22 -07:00 |
|
Khushbu Agarwal
|
c7b8e86e34
|
[CK_Tile] Simplified Mem pipeline (#2159)
* simplify code
* compiled the code
* Simplified example and codegen for mem pipeline
* Reverting config and universal gemm example
* clang formatted
* remove comments
* clang formatted
* Add memory operation changes for default pipeline
* fix config file
---------
Co-authored-by: ThomasNing <thomas.ning@amd.com>
|
2025-05-07 18:37:31 -07:00 |
|
Yanxing-Shi
|
d3843d0ac0
|
fix cmake
|
2025-05-07 14:19:49 +00:00 |
|
Yanxing-Shi
|
1ccecf9a11
|
add default config
|
2025-05-07 10:59:36 +00:00 |
|
Yanxing-Shi
|
bc72ec4cfb
|
fix doc
|
2025-05-06 08:29:11 +00:00 |
|
Khushbu Agarwal
|
d58f2b8bd0
|
mfma_32x32x64_fp8/bf8 (#2148)
* support for mfma_32x32x64_fp8
* clang-formatted
* Fixing sparsity in codegen
|
2025-05-01 13:36:24 -07:00 |
|
Yanxing-Shi
|
45a74f2b24
|
fix
|
2025-05-01 12:06:15 +00:00 |
|
Yanxing-Shi
|
d3d32843b5
|
only support profile
|
2025-05-01 11:05:27 +00:00 |
|
Aviral Goel
|
1aea51d34e
|
[Tile Engine] Improved README.md (#2134)
* improved tile_engine readme
* changed ck tile explanation and json
* further improved readme
* fixed typo
|
2025-04-29 17:37:07 -07:00 |
|