* [CK TILE] Add split K support in GEMM
* Updates
* Fixes
* rebase
* fix
* Fix
* fixes
* support for batched gemm
[ROCm/composable_kernel commit: af66494880]
* add IsSupportedArgument to gemm_kernel
* add ut and do some refactoring
* switched to ck_tile's integral_constant
[ROCm/composable_kernel commit: feb9a2bd9b]
* Ck-tile, impl. grouped gemm
* Workspace is allocated by user, and is passed to the function
* Prepare test to new api design
* Unify GemTransKernelArgs, removing N0 param
* Add 1 to dim3 in paritioner
* Typo: gem - > gemm
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
[ROCm/composable_kernel commit: 4cb3d7d7ea]
* add interwave scheduler for gemm mem pipeline
* Fix merge artifacts.
* Refactor unit tests.
* Switch to interwave scheduler for mem example
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
Co-authored-by: Adam Osewski <Adam.Osewski@amd.com>
[ROCm/composable_kernel commit: e7b6286441]
* Finished the feature
* Modified the test file
* Test case update
* addresss comment
* Addressed the review comment
* Fixed the CI error
[ROCm/composable_kernel commit: 2b6458ddf2]
* CK-Tile GEMM with memory bound pipeline.
* Memory bound gemm pipeline.
* Fix not closed namespace.
* Block gemm mem pipeline draft.
* Do not use ck_tile:: within ck_tile namespace.
* Refactoring & Move Layout info to pipeline problem.
* Get hot loop and TailNum information before lunching kernel.
* Fixes in pipeline.
* Add comment to load_tile_raw and change variable naming style.
* Few small changes & formatting.
* Do not use macro.
* Add gtests.
* Use AccDataType for Output of MFMA instruction.
* Formatting.
* Refactor gemm examples.
* Switch over to current block gemm.
* Use currently available pipeline policy.
* Refactoring and review comment.s
* Fixes after merge.
* Add missing include.
* Add load tile overload which accepts output tensor as parameter.
* This give 8% perf boost at the cost of using more registers.
* Rename example.
* Small changes.
* Fix compilation err and lower K.
* Support different layouts for A/B
* Fix vector size for different layouts.
* Rename Alignment into VectorSize
* Unblock tests.
[ROCm/composable_kernel commit: 24d996aae1]
* [CK_TILE] Image to Column kernel
* Fixes
* Vector loads and stores
* Fixes
* Fixes
* change test dir name
[ROCm/composable_kernel commit: de3e3b6424]