Illia Silin
0f4d68633b
Revert "fix the flatmm (#2349)" (#2352)
This reverts commit fc65195605.
[ROCm/composable_kernel commit: 5523df4b2d]
2025-06-16 07:54:55 -07:00
Thomas Ning
fc65195605
fix the flatmm (#2349)
[ROCm/composable_kernel commit: d996bc78be]
2025-06-16 02:17:53 -07:00
Khushbu Agarwal
7afee6c536
fix flatmm kernel for bigger size for fp16 datatype (#2302)
[ROCm/composable_kernel commit: bd270fe4bc]
2025-06-10 11:13:40 -07:00
Khushbu Agarwal
2b6621fba8
Rotating buffer PR CI fix (#2257)
* Revert "Revert "[CK_tile] Add rotating buffer feature for universal gemm (#2200)" (#2256)"
This reverts commit 7baac527a1.
* fix regression
[ROCm/composable_kernel commit: 2e38eb4f1c]
2025-06-02 10:25:01 -07:00
Illia Silin
7baac527a1
Revert "[CK_tile] Add rotating buffer feature for universal gemm (#2200)" (#2256)
This reverts commit 0f77aa335d.
[ROCm/composable_kernel commit: bbdaf79a52]
2025-05-28 09:46:52 -06:00
Khushbu Agarwal
0f77aa335d
[CK_tile] Add rotating buffer feature for universal gemm (#2200)
* Add rotating buffer feature for universal gemm
* adding changes in tile_engine
* Updated code to merge kernel_launch
* removing comments
* Enable rotating buffer changes to flatmm
* Created a different launch_kernel function for rotating buffer
* Simplified calculation using macros
* merge code with new changes in tile_engine
* clang formatted
* Redefine macros
[ROCm/composable_kernel commit: 99857e10e6]
2025-05-27 23:00:58 -07:00
Aviral Goel
9c99bdede7
Add catch blocks in example GEMM apps to enable better error handling (Issue: 1928) (#2234)
* added catch statements to examples
* clang format
[ROCm/composable_kernel commit: c52649ad57]
2025-05-27 22:32:42 -07:00
BingYuan.Zhou
977b7d0928
Flatmm merge (#2168)
* sync with function interface of cshuffleepilogue, fix flatmm build failure
* move code from solin/flatmm, which adds mfma16*16*32fp8 and optimizes flatmm
---------
Co-authored-by: solin <bingzhou@amd.com>
[ROCm/composable_kernel commit: 6a3960c1e1]
2025-05-08 12:59:57 +08:00
BingYuan.Zhou
4ec293cb4b
[flatmm] implement basic fp16 flatmm (#2089)
* [flatmm] implement basic fp16 flatmm
* fix CI build failure
---------
Co-authored-by: root <root@hjbog-srdc-50.amd.com>
Co-authored-by: solin <bingzhou@amd.com>
[ROCm/composable_kernel commit: eaf1f0bf3b]
2025-04-16 16:51:17 +08:00