mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 10:09:41 +00:00
Refactor function amd_buffer_load_invalid_element_return_zero to avoid
the inefficient ASM code generated by compiler.
Compiler generates suboptimal assembly for ternary operator, causing excessive VGPR usage
Tested compilers:
- Rocm 7.0.1
- Rocm 7.1.1
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
[ROCm/composable_kernel commit: d7497d2694]