mirror of https://github.com/ROCm/composable_kernel.git synced 2026-06-29 19:28:33 +00:00

Files

Kiefer van Teutem 2089713f94 [rocm-libraries] ROCm/rocm-libraries#8227 (commit 75c30d5)

[CK TILE] Unification Work – Remove unification Flag structs in favor of new WarpGemmParams (#8227)

## Motivation

Recently, the way flags are passed down to the intrinsics changed in
CK Tile. At the point where the WarpGemm is invoked, an arbitrary number
of template parameters can be passed, and these are forwarded all the
way down to the lowest-level intrinsic wrappers. There,
`WarpGemmParamsParser<>` is used to extract the flags for the intrinsics.
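
The forwarding pattern described above can be illustrated with a small, self-contained sketch. This is not the CK Tile implementation: every name below (`swap_operands`, `scale_by`, `params_parser`, `intrinsic_wrapper`, `warp_level_op`) is hypothetical and exists only for this illustration. It assumes flags are plain tag types in a variadic template pack, and that a parser at the lowest layer picks out the ones it understands and falls back to defaults otherwise.

```cpp
#include <type_traits>

namespace sketch {

// Hypothetical flag tags. These are NOT CK Tile types.
template <bool Enabled>
struct swap_operands {};

template <int Factor>
struct scale_by {};

struct unrelated_tag {}; // a parameter the lowest layer simply ignores

// Search the pack for swap_operands<...>; default to false if absent.
template <typename... Params>
struct find_swap : std::integral_constant<bool, false> {};

template <bool Enabled, typename... Rest>
struct find_swap<swap_operands<Enabled>, Rest...> : std::integral_constant<bool, Enabled> {};

template <typename First, typename... Rest>
struct find_swap<First, Rest...> : find_swap<Rest...> {};

// Search the pack for scale_by<...>; default to 1 if absent.
template <typename... Params>
struct find_scale : std::integral_constant<int, 1> {};

template <int Factor, typename... Rest>
struct find_scale<scale_by<Factor>, Rest...> : std::integral_constant<int, Factor> {};

template <typename First, typename... Rest>
struct find_scale<First, Rest...> : find_scale<Rest...> {};

// Hypothetical parser: gathers every flag the lowest layer cares about.
template <typename... Params>
struct params_parser
{
    static constexpr bool swap = find_swap<Params...>::value;
    static constexpr int scale = find_scale<Params...>::value;
};

// Lowest layer: stands in for an intrinsic wrapper and reads the parsed flags.
template <typename... Params>
constexpr int intrinsic_wrapper(int a, int b)
{
    using parser = params_parser<Params...>;
    return (parser::swap ? (b - a) : (a - b)) * parser::scale;
}

// Upper layer: forwards the pack untouched, without knowing which flags exist.
template <typename... Params>
constexpr int warp_level_op(int a, int b)
{
    return intrinsic_wrapper<Params...>(a, b);
}

} // namespace sketch
```

In this sketch the upper layer never needs to change when a new flag is introduced; only the parser and the wrapper that consumes the flag do.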

In this MR we adapt the unification framework (the amdgcn_mma struct and
MmaPipelines) to work in the same way. With this change, our custom
intrinsic Flag structs no longer serve a purpose, so they are removed.

Unrelated, but I also tried removing the MmaPipeline flags because they
aren't used for anything except CTranspose, which is already available.
This also makes test_amdgcn_mma_pipeline completely redundant, so I removed
that as well.

2026-06-26 12:00:58 +00:00

algorithm

[rocm-libraries] ROCm/rocm-libraries#7760 (commit a61bc76)

2026-05-27 06:56:58 -07:00

arch

[rocm-libraries] ROCm/rocm-libraries#8227 (commit 75c30d5)

2026-06-26 12:00:58 +00:00

container

[rocm-libraries] ROCm/rocm-libraries#6768 (commit 43ca43f)

2026-06-05 12:27:41 +00:00

numeric

[rocm-libraries] ROCm/rocm-libraries#6768 (commit 43ca43f)

2026-06-05 12:27:41 +00:00

tensor

[rocm-libraries] ROCm/rocm-libraries#6565 (commit d41715e)

2026-06-19 06:57:14 +00:00

utility

[rocm-libraries] ROCm/rocm-libraries#7760 (commit a61bc76)

2026-05-27 06:56:58 -07:00

config.hpp

[rocm-libraries] ROCm/rocm-libraries#6768 (commit 43ca43f)

2026-06-05 12:27:41 +00:00

README.md

introducing ck_tile! (#1216)

2024-04-15 19:27:12 -05:00

README.md

ck_tile/core

ck_tile/core contains the basic functions and structures needed to create a GPU kernel using ck_tile. Users should include only the single header ck_tile/core.hpp to access all of this functionality. Everything is under the ck_tile namespace. The coding style in this folder should be similar to std (snake_case for structures and functions, CamelCase for template types, and so on).
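
As a rough illustration of the naming convention described above, the following sketch uses snake_case for the struct and function names and CamelCase for the template parameters. The names `fixed_vector` and `sum_elements` and the namespace `sketch` are hypothetical and are not taken from ck_tile.

```cpp
#include <cstddef>

namespace sketch {

// snake_case struct name, CamelCase template parameters.
template <typename DataType, std::size_t NumElements>
struct fixed_vector
{
    DataType data[NumElements];

    constexpr std::size_t size() const { return NumElements; }
    constexpr const DataType& at(std::size_t i) const { return data[i]; }
};

// snake_case free function, CamelCase template parameters.
template <typename DataType, std::size_t NumElements>
constexpr DataType sum_elements(const fixed_vector<DataType, NumElements>& vec)
{
    DataType total{};
    for(std::size_t i = 0; i < NumElements; ++i)
    {
        total += vec.at(i);
    }
    return total;
}

} // namespace sketch
```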

algorithm/
    coordinate transforms and other reusable algorithms
arch/
    basic device building blocks such as mma, buffer addressing, etc.
container/
    basic container data structures: array, sequence, tuple, ...
numeric/
    data types and data-type-related math
tensor/
    tensor descriptors and the tile-level API
utility/
    other utility functions for both host and device