Khushbu Agarwal
3d8d6e75e4
Adding validation for tile sizes in Tile Engine ( #2189 )
...
* Adding validation for tile sizes
* Add architecture in config, and shuffle lines of code in warp_gemm.hpp
* Enable MFMA for gfx950, and invalid tile handling
2025-05-15 10:28:31 -07:00
Khushbu Agarwal
f05e45ba59
Disable SMFMA gfx90a ( #2184 )
...
* sparsity fix for gfx90a
* reverting tile_engine changes
2025-05-12 09:56:23 -07:00
Khushbu Agarwal
ef72a4b9bc
Disable SMFMA for gfx90a ( #2182 )
2025-05-09 00:18:07 -07:00
Thomas Ning
c757046d49
Revert "Disable the SMFMA instruction for gfx90a. ( #2174 )" ( #2175 )
...
This reverts commit a32d907771 .
2025-05-08 00:07:03 -07:00
Khushbu Agarwal
a32d907771
Disable the SMFMA instruction for gfx90a. ( #2174 )
...
* remove smfma for gfx90a
* clang formatted
2025-05-07 23:09:22 -07:00
Khushbu Agarwal
c7b8e86e34
[CK_Tile] Simplified Mem pipeline ( #2159 )
...
* simplify code
* compiled the code
* Simplified example and codegen for mem pipeline
* Reveting config and universal gemm example
* clang formatted
* remove comments
* clang formatted
* Add memory operation changes for defualt pipeline
* fix config file
---------
Co-authored-by: ThomasNing <thomas.ning@amd.com >
2025-05-07 18:37:31 -07:00
Khushbu Agarwal
d58f2b8bd0
mfma_32x32x64_fp8/bf8 ( #2148 )
...
* support for mfma_32x32x64_fp8
* clang-formatted
* Fixing sparsity in codegen
2025-05-01 13:36:24 -07:00
Aviral Goel
1aea51d34e
[Tile Engine] Improved README.md ( #2134 )
...
* improved tile_engine readme
* changed ck tile explanation and json
* further improved readme
* fixed typo
2025-04-29 17:37:07 -07:00
Khushbu Agarwal
d107f3c3a5
Support for MFMA_16x16x128 for fp8/bf8 ( #2125 )
...
* Adding 16x16x128 support for gfx950
* Support for fp8 and bf8
* fix input arguments for MFMA scale instruction
* clang-formatted
* Fixes for lwpck-3145 (#2138 )
* Fix lds tile & cmake dep & default epilogue
* Fallback BTypeToUse to ADataType in WOQ cases
* reverting instance json file
* reverting instance json file
---------
Co-authored-by: Yi DING <yi.ding@amd.com >
2025-04-28 18:19:50 -07:00
Khushbu Agarwal
768c99eca9
[TileEngine] Support for sparsity in codegen ( #2128 )
...
* Added sparsity flag in codegen
* remove comments
* clan formatted
* added sparsity as runtime argument
* updated README
* updated stream config variable
* fix typo for tail_num in hot loop
2025-04-28 18:19:23 -07:00
Khushbu Agarwal
94662b02d0
Adding include directory in tile_engine ( #2116 )
2025-04-22 15:55:19 -07:00
Khushbu Agarwal
7cadf187e2
multi instance generation for CkTileEngine ( #2080 )
...
* Add support for multi-instance verification, print detail for each instance, documentation fix
* clang formatted
* Added Readme file
* updated readme
* Addressing review comments
* clang formatted
* Updated ReadMe and GPU reference code
* simplified dispatch kernel code
* indentation
2025-04-21 08:39:45 -07:00
Khushbu Agarwal
3bda57c204
file clang formatted ( #2053 )
2025-04-03 16:55:49 -07:00
Khushbu Agarwal
b443056a26
Documentation for newly added struct ( #2051 )
2025-04-03 16:24:34 -07:00
Khushbu Agarwal
fed0709121
[New] Build up the feature of CK Tile GEMM CodeGen ( #1994 )
...
* New branch for codegen changes
* Fix verify function for int4
* pk_int4 codegen
* Update to review comments
* Remove codegen directory and rename filenames
* Remove extra files; clean up CMake file
* New branch for codegen changes
* Fix verify function for int4
* pk_int4 codegen
* Update to review comments
* Remove codegen directory and rename filenames
* Remove extra files; clean up CMake file
* code changes for single instance
* config file rename, added few more combinations in json file
* Fix cmake file
* Addressing review comments
* Reverting files changed by merge to develop
---------
Co-authored-by: ThomasNing <thomas.ning@amd.com >
2025-04-03 11:54:12 -07:00