* Create and use NVBENCH_CUDA_CALL_RESET_ERROR.
* Move the cudaGetLastError() call into the NVBENCH_CUDA_CALL macro.
---------
Co-authored-by: Sergey Pavlov <psvvsp89@gmail.com>
* Refactor main implementation to improve reusability and customization.
Move the implementation of `main` out of macros and into separate
functions. This allows for easier reuse and customization of the macros.
Existing macro usage should still work as expected, and new
customization points will simplify common tasks like argument parsing
going forward.
* Add tests that validate common main customizations.
Previously, convergence was tested by waiting for the relative stdev
of CUDA timings ("noise") to drop below a certain percentage
(`max_noise`).
This assumed that all benchmarks would eventually see their noise drop
to some threshold, but this is not the case. In practice, many benchmarks
never converge to the default 0.5% relative stdev and instead will always
run to the 15s timeout -- even if the means have converged in a second
or two.
Add a new check that detects when the noise itself has stabilized and
ends the benchmark at that point, even if noise > max_noise.
In testing, this patch alone reduces the runtime of the Thrust+CUB
benchmark suite from 30 hours to 5 hours while producing similar
timing results.
The parameters used to tune this feature are not exposed -- if this
approach works long-term and there's a strong motivation to let users
tweak them, then we can worry about names/APIs/CLI/docs later.
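
To illustrate the idea only, here is a rough sketch of a stopping rule that
ends a run either when noise drops below `max_noise` or when the most recent
noise readings stop changing. The helper names, the `window` size, and the
`stability_tol` threshold are all hypothetical; this is not NVBench's actual
implementation or its tuning parameters.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: relative standard deviation (stdev / mean) of a set
// of timings, i.e. the "noise" discussed above.
inline double relative_stdev(const std::vector<double> &samples)
{
  assert(samples.size() > 1);
  double sum = 0.0;
  for (double s : samples) { sum += s; }
  const double mean = sum / static_cast<double>(samples.size());

  double sq_diff = 0.0;
  for (double s : samples) { sq_diff += (s - mean) * (s - mean); }
  const double variance = sq_diff / static_cast<double>(samples.size() - 1);
  return std::sqrt(variance) / mean;
}

// Hypothetical stopping rule. `noise_history` holds one noise reading per
// batch of samples, oldest first. Stop when:
//   1. the latest noise is below max_noise (the original criterion), or
//   2. the last `window` readings span less than `stability_tol`, meaning
//      the noise has stabilized even though it is still above max_noise.
inline bool should_stop(const std::vector<double> &noise_history,
                        double max_noise,
                        std::size_t window,
                        double stability_tol)
{
  if (noise_history.empty()) { return false; }
  if (noise_history.back() < max_noise) { return true; }
  if (window < 2 || noise_history.size() < window) { return false; }

  const auto first = noise_history.end() - static_cast<std::ptrdiff_t>(window);
  const auto [lo, hi] = std::minmax_element(first, noise_history.end());
  return (*hi - *lo) < stability_tol;
}
```

With `max_noise = 0.005`, a history that settles around 3% noise would never
satisfy the first criterion, but the second one ends the run once the last
few readings agree to within the chosen tolerance.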
- Add export sets
- Add install rules
- Remove manual CPM import, port to rapids_cpm_*, etc.
- Organize CMake code into cmake/*.cmake files.
- NVBench is now a shared library.
This avoids an annoying case where the max value is dropped due to
rounding errors.
Adds a few other missing test cases for `nvbench::range`, too.
Fixes #3.
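
For context, a minimal sketch of how a floating-point range can lose its last
value, and one common way to avoid it. Both functions are hypothetical
stand-ins written for this note; they are not the `nvbench::range`
implementation, and the tolerance used is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Naive version: accumulate the stride. Rounding error builds up, so the
// final value can land slightly above `end` and get dropped.
inline std::vector<double> naive_range(double start, double end, double stride)
{
  assert(stride > 0.0);
  std::vector<double> out;
  for (double v = start; v <= end; v += stride) { out.push_back(v); }
  return out;
}

// Tolerant version: compute the element count once, allowing a small slack
// past `end`, then derive each value from its index.
inline std::vector<double> tolerant_range(double start, double end, double stride)
{
  assert(stride > 0.0);
  assert(end >= start);
  const double slack = stride * 1e-6; // hypothetical tolerance
  const auto count =
    static_cast<std::size_t>(std::floor((end - start + slack) / stride)) + 1;

  std::vector<double> out;
  out.reserve(count);
  for (std::size_t i = 0; i < count; ++i)
  {
    out.push_back(start + static_cast<double>(i) * stride);
  }
  return out;
}
```

For example, `naive_range(0.1, 0.3, 0.1)` stops after 0.2 because
0.2 + 0.1 evaluates to slightly more than 0.3 in double precision, while
`tolerant_range(0.1, 0.3, 0.1)` keeps all three values.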
- `enum_type_axis` simplifies using integral_constants with type axes.
- `examples/enums.cu` demonstrates various ways of implementing parameter
sweeps with enum types.
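
As a rough illustration of the underlying pattern (not the NVBench API
itself), the sketch below wraps enum values in `std::integral_constant` so
they can appear in a list of types. The names `rounding_mode`,
`enum_constant`, `type_list`, `mode_list`, and `round_sketch` are all
hypothetical and exist only for this example.

```cpp
#include <type_traits>

// Hypothetical enum to sweep over.
enum class rounding_mode
{
  nearest,
  toward_zero
};

// Wrap an enum value in a type so it can be used where a type is expected.
template <auto Value>
using enum_constant = std::integral_constant<decltype(Value), Value>;

// Minimal type list, standing in for whatever a type axis consumes.
template <typename... Ts>
struct type_list
{};

// Convenience alias: build the type list directly from enum values.
template <rounding_mode... Modes>
using mode_list = type_list<enum_constant<Modes>...>;

// A body templated on the wrapped enum recovers the value at compile time.
template <typename ModeConstant>
constexpr int round_sketch(double x)
{
  if constexpr (ModeConstant::value == rounding_mode::nearest)
  {
    return static_cast<int>(x + (x >= 0 ? 0.5 : -0.5));
  }
  else
  {
    return static_cast<int>(x); // truncates toward zero
  }
}
```

The alias spares callers from spelling out one `std::integral_constant` per
enum value when listing the types to sweep.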
Refactor nvbench::state to use this for axis parameters.
These will also be useful for summaries and measurements.
Also adds a new ASSERT_THROWS_ANY macro to test some of the new API.
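
A throws-any assertion can be sketched roughly as follows. The macro name
`SKETCH_ASSERT_THROWS_ANY`, its failure reporting, and the
`require_positive` helper are hypothetical; the real ASSERT_THROWS_ANY in the
test suite may format and report failures differently.

```cpp
#include <cstdio>
#include <cstdlib>
#include <stdexcept>

// Hypothetical stand-in: evaluate `expr` and fail the test unless it throws
// an exception of any type.
#define SKETCH_ASSERT_THROWS_ANY(expr)                                         \
  do                                                                           \
  {                                                                            \
    bool sketch_threw = false;                                                 \
    try                                                                        \
    {                                                                          \
      static_cast<void>(expr);                                                 \
    }                                                                          \
    catch (...)                                                                \
    {                                                                          \
      sketch_threw = true;                                                     \
    }                                                                          \
    if (!sketch_threw)                                                         \
    {                                                                          \
      std::fprintf(stderr,                                                     \
                   "%s:%d: expected '%s' to throw\n",                          \
                   __FILE__,                                                   \
                   __LINE__,                                                   \
                   #expr);                                                     \
      std::exit(EXIT_FAILURE);                                                 \
    }                                                                          \
  } while (false)

// Hypothetical function under test: rejects non-positive input.
inline int require_positive(int value)
{
  if (value <= 0)
  {
    throw std::invalid_argument("value must be positive");
  }
  return value;
}
```

A test would then write `SKETCH_ASSERT_THROWS_ANY(require_positive(-1));` to
check that invalid input is rejected without caring about the exception type.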