Commit Graph

6 Commits

Author SHA1 Message Date
jakpiase
0e91d32c61 [CK_TILE] Switch to universal gemm for batched and grouped gemms (#1919)
* switch to universal gemm for batched and grouped gemms

* added reviewer comments

* fixed grouped gemm tests
2025-03-20 11:17:04 +01:00
Bartłomiej Kocot
0356ee069e [CK TILE] Gemm pk_int4_t permute B (#1907)
* [CK TILE] Gemm pk_int4_t permute B

* Fixes
2025-02-27 11:01:14 +01:00
Adam Osewski
39dc25a9b8 [CK-Tile] Enable vectorized reads on all layouts & improve perf. (#1835)
* Refactor universal gemm policy.

* Adapt example to refactor changes.

* Introduce static encoding pattern

* Adding shuffled encoding patterns.

* Fix err in reverse tuple.

* Add transpose_tile2d

* Small refactoring + doc

* Enable reading on contiguous dimension in all layouts.

* Transpose A/B register tile if needed for comp v3 pipeline.

* Take contiguous dim size when calculating dram vector load size.

* A/B smem pack size taken from WarpGemm attributes

* Update B LDS layout and setup tile distribution pattern at class level.

* Fix static assert.

* Fix errors in examples.

* Formatting & fix IsTranspose

* Fix VectorSize & refactor.

* Add error logging messages.

* Fix VecLoadSize and TransposeC for mem pipeline.

* Update unit-tests & disable mem pipeline.

* Clang format

* Update include/ck_tile/core/tensor/tile_window.hpp

Co-authored-by: jakpiase <jakub.piasecki@amd.com>

* Fix compilation and reviewers comments.

* Refactor unit-test. Fallback to non-universal gemm.

Need to use GemmPipelineAGmemBGmemCRegV1 for now,
since GemmKernel now also supports non-K-major vector reads.

---------

Co-authored-by: jakpiase <jakub.piasecki@amd.com>
2025-01-27 16:37:19 +01:00
Mateusz Ozga
3c93d3c444 CK-Tile Grouped GEMM refactor and post PR fixes (#1756)
* Grouped gemm simple code refactor

* Offset invoker

* Invoke generic Run, and rename the partitioner variable

* Tests fix type

* Removed namespaces

* Add template param to avoid implicit cast

* Remove generic function

* Constant value

* underlying enum type to int16_t

* Generalize partitioner function

* Remove whitespaces

* Rename function

* Using support

* Clang-format

* Clang-format

* Fn-partitioner description fn

* Typo

* Typo 2

* Better description

* Better description

* Refactor after review

* Use ctr instead of set fn

* Invoke ctr and typo

* Comments

* Remove unnecessary comment

* Review, remove modulo
2025-01-21 21:06:10 +01:00
Mateusz Ozga
e758d006a5 Apply Ck-tile argument parser for vectors [I/O] (#1758)
* Parser for a vector was added. Additionally, we validate the correctness of numbers

* Remove unnecessary comments

* Review part 1

* Review part 2

* Add const to variadic lambda

* Rename C->K
2024-12-19 17:55:35 +01:00
Mateusz Ozga
4cb3d7d7ea Ck tile grouped GEMM example (#1713)
* Ck-tile, impl. grouped gemm

* Workspace is allocated by user, and is passed to the function

* Prepare test to new api design

* Unify GemTransKernelArgs, removing N0 param

* Add 1 to dim3 in partitioner

* Typo: gem -> gemm

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-12-04 21:40:01 +01:00