[rocm-libraries] ROCm/rocm-libraries#5323 (commit 5454e9e)

CK Tile MX GEMM Packing Improvement

## Motivation

Reduce the scale loading size and also has better utilization of MFMA
scale selection.

## Technical Details

Add up the packing of mx scales.

## Test Plan

Use the existing test cases.

## Test Result

<!-- Briefly summarize test outcomes. -->

## Submission Checklist

- [ ] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
This commit is contained in:
Thomas Ning
2026-03-17 18:58:56 +00:00
committed by assistant-librarian[bot]
parent 859acb5ae7
commit 5f90f69795
9 changed files with 399 additions and 130 deletions

View File

@@ -134,7 +134,12 @@ struct ABQuantGemmPipelineAgBgCrEightWaves : public BaseGemmPipelineAgBgCrCompV3
CK_TILE_HOST_DEVICE static constexpr index_t GetSmemSize()
{
return Policy::template GetSmemSize<Problem>();
// We are not storing the original packed type in LDS, so we need to multiply the smem size
// by the packed size.
constexpr index_t smem_size_a = Policy::template GetSmemSizeA<Problem>() * APackedSize;
constexpr index_t smem_size_b = Policy::template GetSmemSizeB<Problem>() * BPackedSize;
return 2 * (smem_size_a + smem_size_b);
}
CK_TILE_HOST static std::string Print() { return "ABQuantGemmPipelineAgBgCrEightWaves\n"; }