composable_kernel/test/ck_tile/grouped_gemm_multi_d
Aviral Goel 01d37b171d Increase tolerance for FP16 GEMM tests to handle non-deterministic rounding (#4335)

Three tests were failing intermittently with small errors (0.01-1.5%)
due to non-deterministic FP16 accumulation order from GPU thread
scheduling:
- test_ck_tile_batched_gemm
- test_ck_tile_grouped_gemm_preshuffle
- test_ck_tile_grouped_gemm_multi_d

These tests use kbatch=1 (no split-K), so errors are from
order-dependent rounding, not atomics. Increased tolerances from 1e-3 to
2e-3 (0.2%) to account for FP16 precision limits while still catching
real bugs.
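
To illustrate the reasoning above, here is a minimal sketch, not code from this repository. It shows that floating-point addition is order-dependent and how a relative tolerance of 2e-3 admits an error that 1e-3 rejects. The helper names (within_tolerance, sum_left, sum_right) are hypothetical, float stands in for FP16 to keep the example portable, and the comparison formula is an assumed common form that may differ from the check the tests actually use.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (assumption, not the repository's checker):
// passes when |out - ref| <= atol + rtol * |ref|.
inline bool within_tolerance(float out, float ref, float rtol, float atol)
{
    assert(rtol >= 0.0f && atol >= 0.0f);
    return std::fabs(out - ref) <= atol + rtol * std::fabs(ref);
}

// Accumulate left to right: (a + b) + c.
inline float sum_left(float a, float b, float c)
{
    const float ab = a + b; // rounded to float before the next add
    return ab + c;
}

// Accumulate the same values right to left: a + (b + c).
inline float sum_right(float a, float b, float c)
{
    const float bc = b + c; // rounded to float before the next add
    return a + bc;
}
```

With the inputs 1e8f, -1e8f and 1.0f, sum_left returns 1.0f while sum_right returns 0.0f, because the 1.0f is lost when it is first added to -1e8f. The same effect, at a much smaller scale, appears when GPU threads accumulate partial products in a different order from run to run. A result of 1.0015f against a reference of 1.0f (a 0.15% error) fails within_tolerance at rtol = 1e-3f and passes at rtol = 2e-3f.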


- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2026-02-06 16:14:28 -08:00