Commit Graph

8 Commits

Author SHA1 Message Date
Aviral Goel
6e774b512a fix(copyright header): add header to missing files (#2807)
[ROCm/composable_kernel commit: f3239395dc]
2025-09-11 12:27:08 -07:00
Mateusz Ozga
e4010d5ea1 [CK-TILE] Default2DEpilogue, example and adding nullptr_t type for D (#2752)
* Init commit

* Quick fix, CI fails

* Remove CDElementWise

* Add CDEELementWise

---------

Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>

[ROCm/composable_kernel commit: 0758883fa4]
2025-08-28 12:45:50 -07:00
John Afaganis
cef79d5f82 Revert "[CK-TILE] Default epilogue, adding support for D (#2629)" (#2746)
This reverts commit 92037686ae.

[ROCm/composable_kernel commit: 508e7912f9]
2025-08-26 09:48:49 -07:00
Mateusz Ozga
92037686ae [CK-TILE] Default epilogue, adding support for D (#2629)
* Extend 2d-epilogue, D support

* Added tests & update

* Remove unused attribute

* Extend tests

---------

Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>

[ROCm/composable_kernel commit: d43228fbca]
2025-08-25 19:29:35 -07:00
linqunAMD
615ca9842d Support Wave32 in CK_TILE - Part 1 (#2594)
* Support wave32/wave64 in CK_TILE - Part 1

* remove blocksize in kernel launch

* fix build error

* fix clang format

* fix clang format 2

* fix clang format 3

* fix fmha build error

* fix fmha build 2

* fix fmha build 3

* fix build error 4

* address review comment

* update change log

* replace KernelBlockSize with kBlockSize

* fix CI fail

* fix clang format

* address review comment and rebase code.

* fix universal test fail

---------

Co-authored-by: Lin, Qun <Quentin.Lin+amdeng@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>

[ROCm/composable_kernel commit: 9fcc1ee9fd]
2025-08-18 10:08:31 -07:00
Mateusz Ozga
0c0fd440ca [CK_TILE] Introduces a new GEMM API that splits the existing basic GEMM class into multiple specialized classes. (#2520)
* Init commit new API

* apply clang-format

* PreShuffle preapring

* Apply Preshuffle condition to universal_gemm

* Fix: convert size_t to index_t

* Review changes

* Mode 100755 -> 100644

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>

[ROCm/composable_kernel commit: b507d889c1]
2025-07-24 20:39:56 +02:00
Thomas Ning
5c2009c852 fix the mi350 error (#2378)
[ROCm/composable_kernel commit: df6023e305]
2025-06-20 12:50:13 -07:00
Mateusz Ozga
6b3ddd0e23 [CK_TILE] Multiple-D GEMM example (#2219)
* Multiple d, initial commit

* Check Ds Layout

* Readme and clang format

* Update branch & conflicts

* Multiple D - fix clang-formatter

* Rename elemetwise_op

* Fix CI

* Code review part1

* Remove printf

* Remove unnecessary comment

* Add new tests with Col layout

* Review part 2

* Added support for Multiple D GEMM

* Update comment

* Remove maybe_unused

* Clang-format

* Review part 3

* Add comment to function

* Add comment to function: another

* Take number of params for a refrence function

* Remove additional d param for 0 tensor

* Change name of function

* Fix CI fails

[ROCm/composable_kernel commit: bd96ac9742]
2025-06-13 19:39:11 +02:00