Files
composable_kernel/include/ck_tile/ops/epilogue
Thrupti Raj Lakshmana Gowda cf847f90ed [rocm-libraries] ROCm/rocm-libraries#6102 (commit 827fd10)
[CK Tile] Fix architecture-dependent EightWave assignment in
 cshuffle_epilogue (#6102)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## Motivation
Tile engine CI build on the develop branch started failing after a
recent change(https://github.com/ROCm/rocm-libraries/pull/5218) in
`cshuffle_epilogue.hpp`. The `EightWave` constant was unconditionally
computed as `(MWave * NWave == 8)` for all architectures, but this logic
is only valid for gfx9*. On other architectures (e.g., gfx1201),
`EightWave` must always be `false`, otherwise it leads to incorrect
`BlockedXDLN_PerWarp` computation and build failures.

## Technical Details
In `cshuffle_epilogue.hpp`, the `EightWave` static constexpr was set as:
```cpp
static constexpr bool EightWave = (MWave * NWave == 8);
```
This was applied regardless of the target GPU architecture. The fix uses
a preprocessor guard to make this architecture-aware:
- **gfx9* (`__gfx9__`):** `EightWave` is evaluated as `(MWave * NWave ==
8)` — true or false depending on the wave configuration
- **All other architectures:** `EightWave` defaults to `false`

## Test Plan
- Tile engine CI build on develop branch

## Test Result
- *Pending CI*

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
2026-04-03 02:55:45 +00:00
..