Sami Remes
c964eb1186
[CK_TILE] Tileloop persistent gemm - resubmit (#2299)
* Reapply "[CK_TILE] Tile loop persistent gemm kernel (#2191)" (#2293)
This reverts commit 1d9fd3b6a8f8e84a407b8e59b63b17c258f4fb78.
* Add missing header for kentry
---------
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
[ROCm/composable_kernel commit: 1c6f83df6c]
2025-06-06 14:18:49 -07:00
Illia Silin
4fba4073d3
Revert "[CK_TILE] Tile loop persistent gemm kernel (#2191)" (#2293)
This reverts commit 6b2a12ae04a22188acd1444e69d89b270525b79e.
[ROCm/composable_kernel commit: 233e274077]
2025-06-05 09:24:00 -07:00
Sami Remes
47d599c8e3
[CK_TILE] Tile loop persistent gemm kernel (#2191)
* Implement tile loop persistent gemm kernel
* Enable timing
* Add tests for persistent gemm
* Fix formatting
* Fix gemm_basic
* Rename True/False to Persistent/NonPersistent
* Use only one set of layouts for persistent tests
* Fix gemm example persistent template parameter
* Fix formatting
[ROCm/composable_kernel commit: ffb52783d0]
2025-06-04 11:46:28 +03:00
Khushbu Agarwal
42ace38c07
Rotating buffer PR CI fix (#2257)
* Revert "Revert "[CK_tile] Add rotating buffer feature for universal gemm (#2200)" (#2256)"
This reverts commit 2c31e1e716b20a268cc6ffca4af7cc5ecbe44e3f.
* fix regression
[ROCm/composable_kernel commit: 2e38eb4f1c]
2025-06-02 10:25:01 -07:00
Illia Silin
fa9625d940
Revert "[CK_tile] Add rotating buffer feature for universal gemm (#2200)" (#2256)
This reverts commit b021b5f1d3ae599305e0b455035a6e01ad81fe23.
[ROCm/composable_kernel commit: bbdaf79a52]
2025-05-28 09:46:52 -06:00
Khushbu Agarwal
2ca6f22fab
[CK_tile] Add rotating buffer feature for universal gemm (#2200)
* Add rotating buffer feature for universal gemm
* adding changes in tile_engine
* Updated code to merge kernel_launch
* removing comments
* Enable rotating buffer changes to flatmm
* Created diff launch_kernel function for rotating buffer
* Simplified calculation using macros
* merge code with new changes in tile_engine
* clang formatted
* Redefine macros
[ROCm/composable_kernel commit: 99857e10e6]
2025-05-27 23:00:58 -07:00
Gino Lu
983dac1699
[CK-Tile] warp-gemm support for using V_MFMA_F32_16x16x32_BF16 (#2073)
* draft v_mfma_f32_16x16x32_bf16
* fix error config and add debug code.
* Solve the CShuffle Problem
* draft v_mfma_f32_16x16x32_bf16
* fix error config and add debug code.
* Solve the CShuffle Problem
* fix error while testing new command
* Finished the feature of new mfma 16*16*32
* Addressed the comment
---------
Co-authored-by: ThomasNing <thomas.ning@amd.com>
[ROCm/composable_kernel commit: 504f563f78]
2025-04-22 15:52:36 -07:00
jakpiase
addcd203eb
[CK_TILE] Add 2:4 structured sparsity support for fp16 gemm (#1957)
* add structured sparsity fp16 support for gemm
* added reviewer suggestions
* update changelog
* update changelog
* add reviewers suggestions
* Minor fix
* clang fix
* fix doxygen
[ROCm/composable_kernel commit: 6c61f4d237]
2025-04-11 12:18:26 +02:00
kylasa
7fbcd06a62
Addressing (Post Merge) code review comments for PR 1845 (#1883)
* Addressing code review comments.
* Addressing code review comments.
* Reorganized code for better readability.
* add ck_tile gemms for new types in CI
* fix jenkins syntax
* fix script syntax
* Add the test cases back
* Address the review comments
* Address review comments
* clang format
* Solve the merging issues
* Addressed the comments
* clang format
---------
Co-authored-by: illsilin <Illia.Silin@amd.com>
Co-authored-by: ThomasNing <thomas.ning@amd.com>
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
[ROCm/composable_kernel commit: 66c5f5b0b6]
2025-03-06 11:40:30 -08:00
Bartłomiej Kocot
c5acb522de
[CK TILE] Gemm pk_int4_t permute B (#1907)
* [CK TILE] Gemm pk_int4_t permute B
* Fixes
[ROCm/composable_kernel commit: 0356ee069e]
2025-02-27 11:01:14 +01:00