From f6eb5f0a6abbd664d60f4c3382796cd8a9f350b0 Mon Sep 17 00:00:00 2001
From: Estevan Vedovelli
Date: Tue, 14 Apr 2026 16:15:17 +0000
Subject: [PATCH] [rocm-libraries] ROCm/rocm-libraries#6379 (commit b38b056)

[ck] Clamp negative kernel execution elapsed time to zero (#6379)

## Motivation

hipEventElapsedTime can return a small negative value on Windows when timing a very fast kernel launch on the null stream. This caused consumers of launch_and_time_kernel to receive a negative elapsed time, which they reasonably treat as an error, breaking otherwise-correct kernel executions.

## Technical Details

After calling hipEventElapsedTime, a clamp is applied in launch_and_time_kernel before the result is returned, avoiding the return of a physically impossible elapsed time.

The negative value from hipEventElapsedTime has been observed on Windows. For kernels that complete in well under a millisecond, the HIP event timestamps can alias such that the computed difference is a small negative number (observed: ~-1.78 ms). No HIP error is reported by any surrounding call (hipEventRecord, hipEventSynchronize, hipGetLastError), confirming the kernel itself executed successfully.

## Test Plan

- Recompile CK and validate no kernel execution reports a negative elapsed time during hipTensor tests.
- Pass the CI/CD pre-checking tests for CK.

## Test Result

- All tests passing

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
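
For reference, the clamp described under Technical Details can be sketched in isolation, without a HIP runtime. The helper name `clamp_elapsed_ms` is hypothetical and is not part of CK. The patch itself applies the check inline on `total_time` inside `launch_and_time_kernel`. This is a minimal sketch of the same rule, assuming the elapsed time is a `float` in milliseconds as reported by hipEventElapsedTime:

```cpp
#include <algorithm>

// Hypothetical helper (not a CK or HIP function): applies the same rule as
// the inline check added by this patch to a raw elapsed-time reading.
// elapsed_ms is assumed to be the value written by hipEventElapsedTime.
inline float clamp_elapsed_ms(float elapsed_ms)
{
    // A negative reading becomes zero; zero and positive readings pass
    // through unchanged.
    return std::max(elapsed_ms, 0.0f);
}
```

With this rule, the observed reading of roughly -1.78 ms would be reported as 0, while any non-negative measurement is returned as measured.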
---
 include/ck/host_utility/kernel_launch.hpp | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/ck/host_utility/kernel_launch.hpp b/include/ck/host_utility/kernel_launch.hpp
index 1da4f16ca3..72ec047ebc 100644
--- a/include/ck/host_utility/kernel_launch.hpp
+++ b/include/ck/host_utility/kernel_launch.hpp
@@ -70,6 +70,11 @@ float launch_and_time_kernel(const StreamConfig& stream_config,
 
         hip_check_error(hipEventElapsedTime(&total_time, start, stop));
 
+        // hipEventElapsedTime can return a small negative value on Windows for a
+        // very fast kernel. Clamp to zero, as negative elapsed time is never physical.
+        if(total_time < 0)
+            total_time = 0;
+
         hip_check_error(hipEventDestroy(start));
         hip_check_error(hipEventDestroy(stop));
 