mirror of
https://github.com/NVIDIA/nvbench.git
synced 2026-05-26 16:04:46 +00:00
Ensure that measure_cold::run_warmup instantiates blocking kernel
Because warm-up runs are executed without the blocking kernel, the blocking kernel was not jitted until actual measurements were collected. The module-loading cost incurred during the first run shows up as an elevated CPU time noise value for the first measurement, as noted in https://github.com/NVIDIA/nvbench/pull/339. This PR adds `this->block_stream(); this->unblock_stream();` before the warm-up loop, which still runs with the blocking kernel disabled. This ensures that the blocking kernel is instantiated during warm-up, but no other kernel is launched between its launch and the stream sync, which avoids deadlock.
@@ -249,6 +249,11 @@ private:
       return;
     }

+    // Ensure blocking kernel is loaded during the warmup
+    // Ref: https://github.com/NVIDIA/nvbench/issues/339
+    this->block_stream();
+    this->unblock_stream();
+
     // disable use of blocking kernel for warm-up run
     // see https://github.com/NVIDIA/nvbench/issues/240
     constexpr bool disable_blocking_kernel = true;
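The effect of this change can be illustrated with a small CPU-only model. This is a sketch under stated assumptions, not NVBench code: `lazy_kernel`, `cold_runner`, `first_trial_cost`, and the cost constants are hypothetical stand-ins, and the real `block_stream()` / `unblock_stream()` operate on a CUDA stream rather than a counter. The model only shows why priming the blocking kernel during warm-up moves the one-time loading cost out of the first measurement.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a lazily loaded kernel: the first launch pays a
// one-time loading cost, later launches pay only the launch cost.
struct lazy_kernel
{
  bool loaded = false;

  // Returns the simulated CPU cost of this launch in arbitrary units.
  int launch()
  {
    constexpr int load_cost   = 100; // assumed one-time module-loading cost
    constexpr int launch_cost = 1;   // assumed steady-state launch cost
    const int cost = loaded ? launch_cost : load_cost + launch_cost;
    loaded = true;
    return cost;
  }
};

// Hypothetical model of a cold-measurement driver.
struct cold_runner
{
  lazy_kernel blocking_kernel;

  // Model of block_stream(): launches the blocking kernel, returns its cost.
  int block_stream() { return blocking_kernel.launch(); }

  // Model of unblock_stream(): releases the kernel, no cost in this model.
  void unblock_stream() {}

  // Warm-up runs the benchmarked work without the blocking kernel.
  // When prime_blocking_kernel is true, the blocking kernel is launched and
  // released first, with nothing launched in between.
  void run_warmup(bool prime_blocking_kernel)
  {
    if (prime_blocking_kernel)
    {
      block_stream();
      unblock_stream();
    }
    // ... benchmarked kernel would be launched here, unblocked ...
  }

  // Every measured trial uses the blocking kernel; returns per-trial cost.
  std::vector<int> run_trials(int n)
  {
    assert(n >= 0);
    std::vector<int> costs;
    for (int i = 0; i < n; ++i)
    {
      costs.push_back(block_stream());
      unblock_stream();
    }
    return costs;
  }
};

// Cost observed by the first measured trial, with or without priming.
int first_trial_cost(bool prime_blocking_kernel)
{
  cold_runner runner;
  runner.run_warmup(prime_blocking_kernel);
  return runner.run_trials(3).front();
}
```

With the assumed costs, `first_trial_cost(false)` returns 101 because the first measured trial absorbs the load, while `first_trial_cost(true)` returns 1 because the load already happened during warm-up.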