Yanxing-Shi
bb66c2af3e
disable warning output & enable default config
2025-05-21 09:47:57 +00:00
Yanxing-Shi
4dcbc7e3d8
fix jenkins & delete extra header
2025-05-20 16:08:30 +00:00
Yanxing-Shi
3506722e6a
add benchmark gemm executable
2025-05-20 15:41:19 +00:00
Yanxing-Shi
ee6b7f9246
add kernel instance object
2025-05-20 08:57:48 +00:00
Yanxing-Shi
5b83f76eb0
fix codegen bug
2025-05-19 14:03:16 +00:00
Yanxing-Shi
b3caa67694
fix csv bug
2025-05-19 11:44:26 +00:00
Yanxing-Shi
9897410acf
refactor profiler
2025-05-19 10:42:57 +00:00
Yanxing-Shi
c821b1253a
format python
2025-05-16 10:41:30 +00:00
Yanxing-Shi
012c77125a
recover benchmark_gemm and fix
2025-05-16 10:37:59 +00:00
Yanxing-Shi
68a4aff0b1
fix config
2025-05-15 14:20:17 +00:00
Yanxing-Shi
d4107f55cf
remove pydantic module
2025-05-15 13:54:26 +00:00
Yanxing-Shi
fc092038f7
fix README
2025-05-15 12:37:00 +00:00
Yanxing-Shi
ccf18b90e6
add asm cache control
2025-05-15 12:20:28 +00:00
Yanxing-Shi
047f6e4480
python format
2025-05-15 11:16:13 +00:00
Yanxing-Shi
62d2a63f43
add benchmark for cold and warm up
2025-05-15 11:11:18 +00:00
Yanxing-Shi
cfbbae9bd6
fix
2025-05-15 06:15:38 +00:00
Yanxing-Shi
53c4429f37
format
2025-05-14 15:46:50 +00:00
Yanxing-Shi
4bbe7eca09
add cmake option & modify
2025-05-14 09:17:37 +00:00
Yanxing-Shi
58ab4eb617
remove comment
2025-05-13 16:22:22 +00:00
Yanxing-Shi
6086e3641d
fix config
2025-05-13 15:56:37 +00:00
Yanxing-Shi
b4053e1ed3
remove config property
2025-05-13 15:35:42 +00:00
Yanxing-Shi
e5a7abd11b
move struct
2025-05-13 15:25:08 +00:00
Yanxing-Shi
3140659357
fix
2025-05-13 14:18:16 +00:00
Yanxing-Shi
0c3dc06e8c
test success
2025-05-13 13:14:44 +00:00
Yanxing-Shi
6c82b60de6
range config
2025-05-13 09:20:55 +00:00
Yanxing-Shi
a8a19be1b0
merge
2025-05-13 07:39:51 +00:00
Yanxing-Shi
2d3dc763f8
merge
2025-05-13 06:27:16 +00:00
Yanxing-Shi
54d3d9468d
fix bug
2025-05-13 05:57:41 +00:00
Khushbu Agarwal
f05e45ba59
Disable SMFMA gfx90a (#2184)
* sparsity fix for gfx90a
* reverting tile_engine changes
2025-05-12 09:56:23 -07:00
Yanxing-Shi
267eb410cc
tmp save for gen blobs
2025-05-12 07:06:15 +00:00
Khushbu Agarwal
ef72a4b9bc
Disable SMFMA for gfx90a (#2182)
2025-05-09 00:18:07 -07:00
Thomas Ning
c757046d49
Revert "Disable the SMFMA instruction for gfx90a. (#2174)" (#2175)
This reverts commit a32d907771.
2025-05-08 00:07:03 -07:00
Khushbu Agarwal
a32d907771
Disable the SMFMA instruction for gfx90a. (#2174)
* remove smfma for gfx90a
* clang formatted
2025-05-07 23:09:22 -07:00
Khushbu Agarwal
c7b8e86e34
[CK_Tile] Simplified Mem pipeline (#2159)
* simplify code
* compiled the code
* Simplified example and codegen for mem pipeline
* Reverting config and universal gemm example
* clang formatted
* remove comments
* clang formatted
* Add memory operation changes for default pipeline
* fix config file
---------
Co-authored-by: ThomasNing <thomas.ning@amd.com>
2025-05-07 18:37:31 -07:00
Yanxing-Shi
d3843d0ac0
fix cmake
2025-05-07 14:19:49 +00:00
Yanxing-Shi
1ccecf9a11
add default config
2025-05-07 10:59:36 +00:00
Yanxing-Shi
bc72ec4cfb
fix doc
2025-05-06 08:29:11 +00:00
Khushbu Agarwal
d58f2b8bd0
mfma_32x32x64_fp8/bf8 (#2148)
* support for mfma_32x32x64_fp8
* clang-formatted
* Fixing sparsity in codegen
2025-05-01 13:36:24 -07:00
Yanxing-Shi
45a74f2b24
fix
2025-05-01 12:06:15 +00:00
Yanxing-Shi
d3d32843b5
only support profile
2025-05-01 11:05:27 +00:00
Aviral Goel
1aea51d34e
[Tile Engine] Improved README.md (#2134)
* improved tile_engine readme
* changed ck tile explanation and json
* further improved readme
* fixed typo
2025-04-29 17:37:07 -07:00
Khushbu Agarwal
d107f3c3a5
Support for MFMA_16x16x128 for fp8/bf8 (#2125)
* Adding 16x16x128 support for gfx950
* Support for fp8 and bf8
* fix input arguments for MFMA scale instruction
* clang-formatted
* Fixes for lwpck-3145 (#2138)
* Fix lds tile & cmake dep & default epilogue
* Fallback BTypeToUse to ADataType in WOQ cases
* reverting instance json file
* reverting instance json file
---------
Co-authored-by: Yi DING <yi.ding@amd.com>
2025-04-28 18:19:50 -07:00
Khushbu Agarwal
768c99eca9
[TileEngine] Support for sparsity in codegen (#2128)
* Added sparsity flag in codegen
* remove comments
* clang formatted
* added sparsity as runtime argument
* updated README
* updated stream config variable
* fix typo for tail_num in hot loop
2025-04-28 18:19:23 -07:00
Yanxing-Shi
82186ae503
initial commit -m benchmark
2025-04-28 06:22:04 +00:00
Khushbu Agarwal
94662b02d0
Adding include directory in tile_engine (#2116)
2025-04-22 15:55:19 -07:00
Khushbu Agarwal
7cadf187e2
multi instance generation for CkTileEngine (#2080)
* Add support for multi-instance verification, print detail for each instance, documentation fix
* clang formatted
* Added Readme file
* updated readme
* Addressing review comments
* clang formatted
* Updated ReadMe and GPU reference code
* simplified dispatch kernel code
* indentation
2025-04-21 08:39:45 -07:00
Khushbu Agarwal
3bda57c204
file clang formatted (#2053)
2025-04-03 16:55:49 -07:00
Khushbu Agarwal
b443056a26
Documentation for newly added struct (#2051)
2025-04-03 16:24:34 -07:00
Khushbu Agarwal
fed0709121
[New] Build up the feature of CK Tile GEMM CodeGen (#1994)
* New branch for codegen changes
* Fix verify function for int4
* pk_int4 codegen
* Update to review comments
* Remove codegen directory and rename filenames
* Remove extra files; clean up CMake file
* code changes for single instance
* config file rename, added few more combinations in json file
* Fix cmake file
* Addressing review comments
* Reverting files changed by merge to develop
---------
Co-authored-by: ThomasNing <thomas.ning@amd.com>
2025-04-03 11:54:12 -07:00