Aviral Goel
9d5b89ce8a
[Tile Engine] Improved README.md (#2134)
* improved tile_engine readme
* changed ck tile explanation and json
* further improved readme
* fixed typo
[ROCm/composable_kernel commit: 1aea51d34e]
2025-04-29 17:37:07 -07:00
Khushbu Agarwal
7795e976da
Support for MFMA_16x16x128 for fp8/bf8 (#2125)
* Adding 16x16x128 support for gfx950
* Support for fp8 and bf8
* fix input arguments for MFMA scale instruction
* clang-formatted
* Fixes for lwpck-3145 (#2138)
* Fix lds tile & cmake dep & default epilogue
* Fallback BTypeToUse to ADataType in WOQ cases
* reverting instance json file
---------
Co-authored-by: Yi DING <yi.ding@amd.com>
[ROCm/composable_kernel commit: d107f3c3a5]
2025-04-28 18:19:50 -07:00
Khushbu Agarwal
a75ab12f3a
[TileEngine] Support for sparsity in codegen (#2128)
* Added sparsity flag in codegen
* remove comments
* clang formatted
* added sparsity as runtime argument
* updated README
* updated stream config variable
* fix typo for tail_num in hot loop
[ROCm/composable_kernel commit: 768c99eca9]
2025-04-28 18:19:23 -07:00
Khushbu Agarwal
03cdc5602a
Adding include directory in tile_engine (#2116)
[ROCm/composable_kernel commit: 94662b02d0]
2025-04-22 15:55:19 -07:00
Khushbu Agarwal
790dfe9bcd
multi instance generation for CkTileEngine (#2080)
* Add support for multi-instance verification, print detail for each instance, documentation fix
* clang formatted
* Added Readme file
* updated readme
* Addressing review comments
* clang formatted
* Updated ReadMe and GPU reference code
* simplified dispatch kernel code
* indentation
[ROCm/composable_kernel commit: 7cadf187e2]
2025-04-21 08:39:45 -07:00
Khushbu Agarwal
50c53c7252
file clang formatted (#2053)
[ROCm/composable_kernel commit: 3bda57c204]
2025-04-03 16:55:49 -07:00
Khushbu Agarwal
9b9f33d37e
Documentation for newly added struct (#2051)
[ROCm/composable_kernel commit: b443056a26]
2025-04-03 16:24:34 -07:00
Khushbu Agarwal
eee09ecdb3
[New] Build up the feature of CK Tile GEMM CodeGen (#1994)
* New branch for codegen changes
* Fix verify function for int4
* pk_int4 codegen
* Update to review comments
* Remove codegen directory and rename filenames
* Remove extra files; clean up CMake file
* code changes for single instance
* config file rename, added few more combinations in json file
* Fix cmake file
* Addressing review comments
* Reverting files changed by merge to develop
---------
Co-authored-by: ThomasNing <thomas.ning@amd.com>
[ROCm/composable_kernel commit: fed0709121]
2025-04-03 11:54:12 -07:00