Files
composable_kernel/tile_engine
Thrupti Raj Lakshmana Gowda 8c42eaf3f8 [CK Tile] Fix architecture-dependent EightWave assignment in cshuffle_epilogue (#6102)
## Motivation
Tile engine CI build on the develop branch started failing after a
recent change(https://github.com/ROCm/rocm-libraries/pull/5218) in
`cshuffle_epilogue.hpp`. The `EightWave` constant was unconditionally
computed as `(MWave * NWave == 8)` for all architectures, but this logic
is only valid for gfx9*. On other architectures (e.g., gfx1201),
`EightWave` must always be `false`, otherwise it leads to incorrect
`BlockedXDLN_PerWarp` computation and build failures.

## Technical Details
In `cshuffle_epilogue.hpp`, the `EightWave` static constexpr was set as:
```cpp
static constexpr bool EightWave = (MWave * NWave == 8);
```
This was applied regardless of the target GPU architecture. The fix uses
a preprocessor guard to make this architecture-aware:
- **gfx9* (`__gfx9__`):** `EightWave` is evaluated as `(MWave * NWave ==
8)` — true or false depending on the wave configuration
- **All other architectures:** `EightWave` defaults to `false`

## Test Plan
- Tile engine CI build on develop branch

## Test Result
- *Pending CI*

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

---
> **Note:** This PR supersedes ROCm/rocm-libraries#5436, which is
blocked pending a review approval from a reviewer currently on PTO. The
same changes have been applied to this branch
(`users/tlakshma/ck/develop-clone`) to allow merging.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Max Podkorytov <4273004+tenpercent@users.noreply.github.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2026-04-02 20:55:03 -06:00
..