## Motivation
Currently, the test suite for `amdgcn_mma` focuses on the design (e.g.
choosing the correct specialization based on SFINAE) and a single live
test that checks if selected MmaOp runs. This PR adds a simplified GEMM
test kernel that checks the exact layout of the selected MmaOp.
## Technical Details
The test in `test_amdgcn_mma_layout.cpp` launches MxKxN test cases (one
per block), where each case:
1. Constructs A and B tensors on a device with a single 1 at A(m,k) and
B(k,n) (rest is all 0s)
2. Executes the MMA intrinsic.
3. Checks if C has the "1" on the excpeted position.
For the MMA instrinsic, it pulls a Mma op from amdgcn_mma specialization
based on a given input (tile dimension, data types).
Note 1: As a helper, in `test_amdgcn_mma_layout_util.hpp` we add
register map for a given amdgcn_mma specialization. Register mapping is
currently based on the `tile_distribution_encoding`.
Note 2: Everything is added to the test suite, no additions to the
actual `amdgcn_mma` structs. All the extra information that is needed,
but not yet provided by `amdgcn_mma` structs, is added as a boilerplate
to the header. TODO: Rebase this PR on top of the `amdgcn_mma` refactor
or clean it up after merge.
## Test Plan
This PR solely adds a new test to the existing code.
## Test Result
Tests pass.
## Submission Checklist
- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
---------
Co-authored-by: Kiefer van Teutem <kiefer.vanteutem@amd.com>