mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 10:09:41 +00:00
* Refactor GPU verification kernel to gather erorr stats on GPU
* Check if result is all zero
* non-negative error count doesn't need custom Atomics
* Remove unnecessary AtomicMaxFloat function
* Simpler warp reduction, remove passed flag
* Move verification header to include
* Fix header path in test
* Fix block reduction loop
[ROCm/composable_kernel commit: f173642087]
27 KiB
27 KiB