composable_kernel/include/ck_tile/ops/gemm
Emily Martins 071165919f [CK Tile] Stream K GEMM Kernel HostArgs and Kernel Classes (#2681)
* CK Tile Stream K Device Ops

Implementation of CK Tile StreamKHostArgs and StreamKKernel classes. The
StreamKKernel class injects Universal Gemm and includes functions to
facilitate kernel preparation for the GPU.

* Stream K Device Ops Fixes

- Update GetWorkSpaceSize to call TilePartitioner's GetWorkSpaceSize to
  ensure we get the size needed for accumulation buffers and semaphores.
- Pass in num_sk_blocks into TilePartitioner constructor
- Update documentation
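
The delegation described in the first fix can be sketched as follows. This is a minimal illustration, not the CK Tile implementation: `SketchTilePartitioner`, `SketchKernel`, the tile-size members, and the sizing formula (one accumulator tile plus one 32-bit flag per stream K block) are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>

using index_t = std::int32_t; // the commit states ck_tile::index_t is int32_t

// Hypothetical partitioner. The sizing formula below is an assumption for
// illustration only; the real TilePartitioner may compute it differently.
struct SketchTilePartitioner
{
    index_t num_sk_blocks; // number of stream K blocks
    index_t m_per_block;   // tile rows (assumed member)
    index_t n_per_block;   // tile columns (assumed member)

    std::size_t GetWorkSpaceSize(std::size_t acc_element_bytes) const
    {
        const std::size_t blocks = static_cast<std::size_t>(num_sk_blocks);
        const std::size_t tile_elems =
            static_cast<std::size_t>(m_per_block) * static_cast<std::size_t>(n_per_block);
        // One partial-accumulation tile per stream K block (assumed).
        const std::size_t acc_bytes = blocks * tile_elems * acc_element_bytes;
        // One 32-bit semaphore/flag per stream K block (assumed).
        const std::size_t flag_bytes = blocks * sizeof(std::uint32_t);
        return acc_bytes + flag_bytes;
    }
};

// Hypothetical kernel wrapper: it forwards to the partitioner instead of
// computing the workspace size itself, so both buffers are accounted for.
struct SketchKernel
{
    static std::size_t GetWorkSpaceSize(const SketchTilePartitioner& partitioner)
    {
        return partitioner.GetWorkSpaceSize(sizeof(float));
    }
};
```

Keeping the computation in one place means a host-side allocation made from the kernel's answer cannot drift from what the partitioner expects.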

* Add WarpTile dimensions to GetName function in StreamKKernel class

* Fix typos in StreamKHostArgs class description.

Co-authored-by: Christopher Millette <63608002+cgmillette@users.noreply.github.com>

* Apply clang format on updated comment for StreamKHostArgs

* Explicitly specify type for StreamKReductionStrategy enum
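
As a rough sketch of what specifying the type explicitly means, the enum below fixes its underlying type in the declaration. The name `SketchReductionStrategy`, its enumerators, and the choice of `std::uint32_t` are assumptions for illustration; the commit does not state which type or enumerators StreamKReductionStrategy uses.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for StreamKReductionStrategy. The enumerator names
// and the uint32_t underlying type are assumptions, not taken from the commit.
enum class SketchReductionStrategy : std::uint32_t
{
    Atomic    = 0u,
    Reduction = 1u
};

// With an explicit underlying type, size and signedness no longer depend on
// the compiler's default choice.
static_assert(std::is_same<std::underlying_type<SketchReductionStrategy>::type,
                           std::uint32_t>::value,
              "underlying type is fixed by the declaration");
static_assert(sizeof(SketchReductionStrategy) == sizeof(std::uint32_t),
              "enum occupies exactly four bytes");

// Convert an enumerator to its integer value, e.g. when packing kernel arguments.
constexpr std::uint32_t to_underlying(SketchReductionStrategy s)
{
    return static_cast<std::uint32_t>(s);
}
```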

* Remove unnecessary scopes

* Unify the commenting style to inline comments

* Add explicit casts for occupancy and num_cu in MakeKernelArgs function

Both static functions Occupancy and NumCU in the StreamKKernel class
call HIP API functions that return int, so occupancy and num_cu are
returned as int. The TilePartitioner interface for stream K will take
occupancy and num_cu as ck_tile::index_t, which is int32_t. Thus, to be
safe, this change explicitly casts both occupancy and num_cu to int32_t.
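
The casts can be pictured with the sketch below. It assumes a hypothetical `SketchPartitionerInputs` struct and a helper `make_partitioner_inputs`; neither name comes from the commit, and the real MakeKernelArgs function takes different arguments.

```cpp
#include <cstdint>

using index_t = std::int32_t; // the commit states ck_tile::index_t is int32_t

// Hypothetical bundle of the two values handed to the partitioner.
struct SketchPartitionerInputs
{
    index_t occupancy;
    index_t num_cu;
};

// The HIP queries report these values as plain int. The conversion is spelled
// out so the intended type is visible at the call site and does not rely on
// int and int32_t happening to be the same type on a given platform.
inline SketchPartitionerInputs make_partitioner_inputs(int occupancy, int num_cu)
{
    return SketchPartitionerInputs{static_cast<index_t>(occupancy),
                                   static_cast<index_t>(num_cu)};
}
```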

* Fix use of kentry due to interface update

PR #2594 updated the interface for the kentry function in
include/ck_tile/host/kernel_launch.hpp. As a result, the static function
Occupancy was updated to work correctly with the new interface.
PR #2594 also changed UniversalGemmKernel's KernelBlockSize static
variable to kBlockSize, so the StreamKKernel class was updated to
reflect this change.

* Switch type of num_sk_blocks from uint32_t to int32_t

This change switches num_sk_blocks to ck_tile::index_t, which is
int32_t. This was done because parallel work on the CK Tile StreamK
TilePartitioner's constructor will take num_sk_blocks as
ck_tile::index_t. Using the same type on both sides unifies the
interfaces and avoids type conversion errors.
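
To illustrate the kind of conversion problem a shared signed type avoids, here is a small sketch. `SketchHostArgs`, `SketchPartitioner`, and `make_partitioner` are hypothetical names, and the wrap-around case is a generic C++ example rather than a bug reported in the commit.

```cpp
#include <cstdint>

using index_t = std::int32_t; // the commit states ck_tile::index_t is int32_t

// Generic hazard of mixing signedness: a negative value converted to
// uint32_t silently wraps to a large positive number.
static_assert(static_cast<std::uint32_t>(-1) == 0xFFFFFFFFu,
              "-1 wraps when stored in an unsigned 32-bit integer");

// Hypothetical host-side arguments and partitioner that agree on index_t,
// so the value passes through without any implicit conversion.
struct SketchHostArgs
{
    index_t num_sk_blocks;
};

struct SketchPartitioner
{
    index_t num_sk_blocks;
    explicit SketchPartitioner(index_t n) : num_sk_blocks(n) {}
};

inline SketchPartitioner make_partitioner(const SketchHostArgs& args)
{
    return SketchPartitioner{args.num_sk_blocks};
}
```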

---------

Co-authored-by: Christopher Millette <63608002+cgmillette@users.noreply.github.com>
2025-08-19 15:08:52 -06:00