## Motivation
This PR is part of the [WMMA/MFMA] unification work. It is the first in
a series of PRs that add all the necessary MMA builtins as
`amdgcn_mma` structs.
## Technical Details
This change adds new specializations for the dense WMMA builtins. In total,
we now have 9 RDNA4 builtins and 3 RDNA3 builtins.
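
For readers unfamiliar with the pattern, here is a minimal host-only sketch of exposing a builtin through a struct specialization. It is not the actual `amdgcn_mma` definition: the names `mma_sketch`, `rdna3_target`, `rdna4_target`, `supported`, `wave_size`, and `exec` are hypothetical, the template parameters are assumed, and the body of `exec` is a plain multiply-accumulate standing in for the compiler builtin so that the example builds without a GPU toolchain.

```cpp
#include <array>
#include <cstddef>

// Hypothetical target tags; the real code may identify architectures differently.
struct rdna3_target {};
struct rdna4_target {};

// Primary template: by default a (target, types, tile shape) combination
// has no builtin behind it.
template <typename Target, typename InT, typename AccT, int M, int N, int K>
struct mma_sketch {
    static constexpr bool supported = false;
};

// One specialization per builtin. This one is an assumed example for a
// 16x16x16 tile on the RDNA4 tag.
template <>
struct mma_sketch<rdna4_target, float, float, 16, 16, 16> {
    static constexpr bool supported = true;
    static constexpr int wave_size = 32; // assumed value, for illustration

    // Placeholder for the builtin call: accumulates the dot product of two
    // fragments into acc. Device code would forward to the intrinsic here.
    template <std::size_t Len>
    static float exec(const std::array<float, Len>& a,
                      const std::array<float, Len>& b,
                      float acc) {
        for (std::size_t i = 0; i < Len; ++i) {
            acc += a[i] * b[i];
        }
        return acc;
    }
};
```

With this shape, callers can query `supported` at compile time and fall back when a combination has no specialization, which is one plausible reason to wrap each builtin in its own struct.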
## Test Plan
All the new wrappers were added to the test suite in
`test_amdgcn_mma_layout.inc`.
## Test Result
Tests pass locally; waiting for CI.
## Submission Checklist
- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
---------
Co-authored-by: Yung-sheng Tu <yung-sheng@streamhpc.com>