[CK_TILE] Normalize gpu_target before LDS_SIZE_MAP lookup (#5438)
GPU targets passed with feature suffixes (e.g. `gfx950:xnack+`) were
falling through to `DEFAULT_LDS_SIZE` instead of matching their entry in
`LDS_SIZE_MAP`, potentially causing incorrect tile acceptance/rejection.
## Changes
- **`gemm_validation_utils.py`**: Strip everything after `:` from
`gpu_target` before the `LDS_SIZE_MAP` lookup; use the normalized base
arch name in the error message as well.
```python
# Before
hw_lds_size = LDS_SIZE_MAP.get(gpu_target, DEFAULT_LDS_SIZE)
# After
base_gpu_target = gpu_target.split(":")[0] if gpu_target else gpu_target
hw_lds_size = LDS_SIZE_MAP.get(base_gpu_target, DEFAULT_LDS_SIZE)
```
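A minimal, self-contained sketch of the normalized lookup, for illustration only. The map contents and default value below are placeholder assumptions, not the real table in `gemm_validation_utils.py`, and the helper names `normalize_gpu_target` and `lookup_lds_size` are hypothetical.

```python
# Placeholder values for illustration; the real LDS_SIZE_MAP and
# DEFAULT_LDS_SIZE are defined in gemm_validation_utils.py.
DEFAULT_LDS_SIZE = 65536
LDS_SIZE_MAP = {"gfx950": 163840}


def normalize_gpu_target(gpu_target):
    """Drop any feature suffix (everything after the first ':').

    None and the empty string are returned unchanged, mirroring the
    `if gpu_target else gpu_target` guard in the change above.
    """
    return gpu_target.split(":")[0] if gpu_target else gpu_target


def lookup_lds_size(gpu_target):
    """Look up the LDS size by base arch name, falling back to the default."""
    return LDS_SIZE_MAP.get(normalize_gpu_target(gpu_target), DEFAULT_LDS_SIZE)


if __name__ == "__main__":
    for target in ("gfx950", "gfx950:xnack+", "gfx950:sramecc+:xnack-", None):
        print(target, lookup_lds_size(target))
```

With this shape, a suffixed target such as `gfx950:xnack+` resolves to the same entry as `gfx950`, while unknown or missing targets still fall back to `DEFAULT_LDS_SIZE`.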
[CK_TILE] Add pooling in tile_engine (#4469)
## Motivation
Add pooling support to the CK_TILE tile_engine.
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>