Commit Graph

8 Commits

Author SHA1 Message Date
Aviral Goel
b537ae86b9 [Tile Engine] Improved README.md (#2134)
* improved tile_engine readme

* changed ck tile explanation and json

* further improved readme

* fixed typo

[ROCm/composable_kernel commit: 1aea51d34e]
2025-04-29 17:37:07 -07:00
Khushbu Agarwal
aeb46e6a49 Support for MFMA_16x16x128 for fp8/bf8 (#2125)
* Adding 16x16x128 support for gfx950

* Support for fp8 and bf8

* fix input arguments for MFMA scale instruction

* clang-formatted

* Fixes for lwpck-3145 (#2138)

* Fix lds tile & cmake dep & default epilogue

* Fallback BTypeToUse to ADataType in WOQ cases

* reverting instance json file

* reverting instance json file

---------

Co-authored-by: Yi DING <yi.ding@amd.com>

[ROCm/composable_kernel commit: d107f3c3a5]
2025-04-28 18:19:50 -07:00
Khushbu Agarwal
10188b5103 [TileEngine] Support for sparsity in codegen (#2128)
* Added sparsity flag in codegen

* remove comments

* clang formatted

* added sparsity as runtime argument

* updated README

* updated stream config variable

* fix typo for tail_num in hot loop

[ROCm/composable_kernel commit: 768c99eca9]
2025-04-28 18:19:23 -07:00
Khushbu Agarwal
c6b7f48326 Adding include directory in tile_engine (#2116)
[ROCm/composable_kernel commit: 94662b02d0]
2025-04-22 15:55:19 -07:00
Khushbu Agarwal
74210a9dfc multi instance generation for CkTileEngine (#2080)
* Add support for multi-instance verification, print detail for each instance, documentation fix

* clang formatted

* Added Readme file

* updated readme

* Addressing review comments

* clang formatted

* Updated ReadMe and GPU reference code

* simplified dispatch kernel code

* indentation

[ROCm/composable_kernel commit: 7cadf187e2]
2025-04-21 08:39:45 -07:00
Khushbu Agarwal
09792fa112 file clang formatted (#2053)
[ROCm/composable_kernel commit: 3bda57c204]
2025-04-03 16:55:49 -07:00
Khushbu Agarwal
844730776f Documentation for newly added struct (#2051)
[ROCm/composable_kernel commit: b443056a26]
2025-04-03 16:24:34 -07:00
Khushbu Agarwal
b85b103194 [New] Build up the feature of CK Tile GEMM CodeGen (#1994)
* New branch for codegen changes

* Fix verify function for int4

* pk_int4 codegen

* Update to review comments

* Remove codegen directory and rename filenames

* Remove extra files; clean up CMake file

* code changes for single instance

* config file rename, added a few more combinations to the json file

* Fix cmake file

* Addressing review comments

* Reverting files changed by merge to develop

---------

Co-authored-by: ThomasNing <thomas.ning@amd.com>

[ROCm/composable_kernel commit: fed0709121]
2025-04-03 11:54:12 -07:00