[CK_TILE] Fix MMA layout test to match amdgcn_mma OpFamily
parameter (#5222)
## Summary
- PR #4837 added `MmaOpFamily OpFamily_` as a new template parameter to
`amdgcn_mma` and `MmaDefaultSelector`, but the MMA layout test (PR
#4495) was not updated to include it
- Add the missing `OpFamily_` parameter to all three `RegisterMapTraits`
partial specializations (gfx9, gfx11, gfx12) and all
`MmaDefaultSelector` usages
- Fixes build failure: `template argument for non-type template
parameter must be an expression`
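
The kind of breakage described above can be reproduced in isolation. The following is a minimal sketch, not CK Tile's actual declarations: `Family`, `Op`, and `Traits` are hypothetical stand-ins for `MmaOpFamily`, `amdgcn_mma`, and `RegisterMapTraits`. It shows a traits partial specialization that has to name every parameter of the template it matches, including a newly added non-type one.

```cpp
#include <cstddef>

// Hypothetical stand-ins; the real CK Tile declarations differ.
enum class Family { Dense, Sparse };

// Primary template after the change: a non-type Family parameter now sits
// between the element type and the tile size.
template <typename T, Family F, std::size_t M>
struct Op
{
    static constexpr Family family = F;
    static constexpr std::size_t m = M;
};

// Primary traits template: nothing is known about unrelated types.
template <typename MmaOp>
struct Traits
{
    static constexpr bool known = false;
};

// Partial specialization updated to spell out the new parameter.
// A stale specialization written against the old two-parameter form of Op
// no longer matches the primary template's parameter list and is rejected
// by the compiler.
template <typename T, Family F, std::size_t M>
struct Traits<Op<T, F, M>>
{
    static constexpr bool known     = true;
    static constexpr bool is_sparse = (F == Family::Sparse);
};

static_assert(Traits<Op<float, Family::Dense, 32>>::known, "specialization is picked");
static_assert(!Traits<int>::known, "primary template is used for other types");
```

Every specialization and every use site has to be updated together, which is why the fix touches all three architecture-specific specializations and all selector usages.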
## Test plan
- [x] Verified test compiles cleanly with ROCm 7.1.1 clang++ targeting
gfx90a
- [x] `test_amdgcn_mma_layout` gfx90a (MFMA): PASSED
- [x] `test_amdgcn_mma_layout` gfx1201 (WMMA): SKIPPED (no device)
- [x] `test_amdgcn_mma_layout` gfx1100 (WMMA): SKIPPED (no device)
- [x] CI validation on all GPU targets
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Adding layout test for amdgcn_mma structs
## Motivation
Currently, the test suite for `amdgcn_mma` focuses on the design (e.g.
choosing the correct specialization based on SFINAE) and a single live
test that checks whether the selected MmaOp runs. This PR adds a simplified
GEMM test kernel that checks the exact layout of the selected MmaOp.
## Technical Details
The test in `test_amdgcn_mma_layout.cpp` launches MxKxN test cases (one
per block), where each case:
1. Constructs A and B tensors on the device with a single 1 at A(m,k) and
B(k,n) (all other elements are 0).
2. Executes the MMA intrinsic.
3. Checks that C has the "1" at the expected position.
For the MMA intrinsic, the test pulls an MMA op from the `amdgcn_mma`
specialization selected for the given input (tile dimensions, data types).
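
The one-hot idea behind these steps can be sketched on the host. This is an illustration only: `reference_mma`, `one_hot_case_passes`, and `count_failures` are hypothetical names, a plain triple loop stands in for the MMA intrinsic, and row-major storage is an assumption. The real test runs the selected MmaOp on the device, one case per block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host stand-in for the MMA intrinsic: C = A * B with row-major storage.
// A is M x K, B is K x N, C is M x N.
std::vector<float> reference_mma(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 std::size_t M, std::size_t N, std::size_t K)
{
    std::vector<float> c(M * N, 0.0f);
    for(std::size_t m = 0; m < M; ++m)
        for(std::size_t n = 0; n < N; ++n)
            for(std::size_t k = 0; k < K; ++k)
                c[m * N + n] += a[m * K + k] * b[k * N + n];
    return c;
}

// One case: set A(m,k) = 1 and B(k,n) = 1, everything else 0, then check
// that C is 1 at (m,n) and 0 everywhere else.
bool one_hot_case_passes(std::size_t M, std::size_t N, std::size_t K,
                         std::size_t m, std::size_t n, std::size_t k)
{
    assert(m < M && n < N && k < K);
    std::vector<float> a(M * K, 0.0f);
    std::vector<float> b(K * N, 0.0f);
    a[m * K + k] = 1.0f;
    b[k * N + n] = 1.0f;

    const std::vector<float> c = reference_mma(a, b, M, N, K);
    for(std::size_t i = 0; i < M * N; ++i)
    {
        const float expected = (i == m * N + n) ? 1.0f : 0.0f;
        if(c[i] != expected)
            return false;
    }
    return true;
}

// Enumerate all M * K * N cases and count the ones that fail.
std::size_t count_failures(std::size_t M, std::size_t N, std::size_t K)
{
    std::size_t failures = 0;
    for(std::size_t m = 0; m < M; ++m)
        for(std::size_t k = 0; k < K; ++k)
            for(std::size_t n = 0; n < N; ++n)
                if(!one_hot_case_passes(M, N, K, m, n, k))
                    ++failures;
    return failures;
}
```

Because each case has exactly one non-zero product, any mistake in how A, B, or C elements are assigned to registers moves the "1" to a different position in C, which makes layout errors easy to localize.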
Note 1: As a helper, in `test_amdgcn_mma_layout_util.hpp` we add a
register map for a given `amdgcn_mma` specialization. Register mapping is
currently based on the `tile_distribution_encoding`.
Note 2: Everything is added to the test suite; there are no additions to
the actual `amdgcn_mma` structs. All the extra information that is needed,
but not yet provided by the `amdgcn_mma` structs, is added as boilerplate
in the header. TODO: Rebase this PR on top of the `amdgcn_mma` refactor
or clean it up after merge.
## Test Plan
This PR solely adds a new test to the existing code.
## Test Result
Tests pass.
## Submission Checklist
- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
[CK TILE] Unification of sparse MFMA/WMMA policy structs
(#4837)
## Motivation
The existing unification work supports DENSE intrinsics. In this PR we
enable support for SPARSE as well as SCALE intrinsics and add an example
SPARSE implementation.
## Technical Details
Mostly trivial changes. One framework change is that the desired
`MmaOpFamily` is passed to the `MmaDefaultSelector`. As my relevant
commit explains, we do not support a fallback family at the moment, but
it is something we can consider.
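
A rough sketch of what "no fallback family" means for a selector keyed on the op family follows. All names here (`Family`, `Selector`, `DenseOp`, `SparseOp`, `Unsupported`) are hypothetical and chosen for illustration; the actual `MmaDefaultSelector` takes more parameters and may report a missing specialization differently.

```cpp
#include <type_traits>

// Hypothetical stand-ins; not the CK Tile declarations.
enum class Family { Dense, Sparse, Scale };

struct Unsupported {};
struct DenseOp {};
struct SparseOp {};

// Primary template: a (type, family) pair without a specialization is
// reported as unsupported instead of silently falling back to Dense.
template <typename T, Family F>
struct Selector
{
    using type = Unsupported;
};

template <>
struct Selector<float, Family::Dense>
{
    using type = DenseOp;
};

template <>
struct Selector<float, Family::Sparse>
{
    using type = SparseOp;
};

template <typename T, Family F>
using selected_t = typename Selector<T, F>::type;

static_assert(std::is_same<selected_t<float, Family::Sparse>, SparseOp>::value,
              "the requested family picks its own op");
static_assert(std::is_same<selected_t<float, Family::Scale>, Unsupported>::value,
              "no fallback to another family");
```

Adding a fallback in this sketch would mean having the primary template forward to another family's specialization, which is the option the description leaves open.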
## Test Plan
Added a new test for the relevant sparse specializations.
## Test Result
Test should pass.
## Submission Checklist
- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
* First look at mfma / wmma unification
* Refactor
* Re-org file structure
* Restructure transform selection and WaveWiseMma class
* Update license files. Add missing gfx1151 support. Change wave size for HOST to 1. Update datatypes naming consistency
* Fixes default MmaSelector implementation
* Adds unit tests for amdgcn_mma and arch
* Consolidate common arch id checks to constexpr functions. Strongly type ids as amdgcn_target_arch_id object.
* Refactor is_any_value_of
* Fixes mma_selector logic
* Fix typo
* Add mma selector test for tile decomposition
* Fix compilation of mma.hpp
* Revert back to c++17 compatibility
* Fix compiler error by returning index_t from get_warp_size()
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Fixes compiler error for missing is_wave32() function
* Fixes compiler error for host wave_size() should be 64
* Fixes compiler errors where __cpp_concepts is not defined
* Fixes compiler errors where __cpp_concepts is not defined
* Fix test failure for host is wave64 by default
---------
Co-authored-by: Chris Millette <you@example.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>