The string used when constructing a summary is no longer a human
readable name, but rather a tag string (e.g. "nv/cold/time/gpu/mean").
These will make lookup easier and more stable going forward.
name vs. short_name no longer exists. Now there is just "name", which
is used for column headings. The "description" string may still be
used for detailed information.
Updated the json tests and compare script to reflect these changes.
Previously, convergence was tested by waiting for the relative stdev
of cuda timings ("noise") to drop below a certain percentage
(`max_noise`).
This assumed that all benchmarks would eventually see their noise drop
to some threshold, but this is not the case. In practice, many benchmarks
never converge to the default 0.5% relative stdev and instead will always
run to the 15s timeout -- even if the means have converged in a second
or two.
Added a new check that tests when the noise itself stabilizes and ends
the benchmark, even if noise > max_noise.
After testing, this patch alone significantly reduces the runtime of the
Thrust+CUB benchmark suite (from 30 hours to 5 hours) and produces similar
timing results.
The parameters used to tune this feature are not exposed -- if this
approach works long-term and there's a strong motivation to let users
tweak them, then we can worry about names/APIs/CLI/docs later.
This will provide functionality such as clock locking (--lgm),
persistance mode (--pm), device querying (--list), version checking
(--version), and documentation (--help).
This is possible already with any nvbench executable, but having
one with a reliable name will be helpful for scripting and writing
documentation.
- /W4 on MSVC
- -Wall -Wextra + others on gcc/clang
- New NVBench_ENABLE_WERROR option to toggle "warnings as errors"
- Mark the nlohmann_json library as IMPORTED to switch to system includes
- Rename nvbench_main -> nvbench.main to follow target name conventions
- Explicitly suppress some cudafe warnings when compiling templates in
nlohmann_json headers.
- Explicitly suppress some warnings from Thrust headers.
- Various fixes for warnings exposed by new flags.
- Disable CUPTI on CTK < 11.3 (See #52).
- Add export sets
- Add install rules
- Remove manual CPM import, port to rapids_cpm_*, etc
- Organize CMake code into cmake/*.cmake files.
- NVBench is now a shared library.
Human-readable outputs (md) and CLI inputs still use percentages.
In-memory and machine-readable outputs (csv, json) use ratios.
This is the convention that spreadsheet apps expect. Fixes#2.
This avoids an annoying case where the max value is dropped due to
rounding errors.
Adds a few other missing test cases for `nvbench::range`, too.
Fixes#3.
- `enum_type_axis` simplifies using integral_constants with type axes.
- `examples/enums.cu` demonstrates various ways of implementing parameter
sweeps with enum types.
- New NVBench_ENABLE_EXAMPLES CMake option.
- examples/axis.cu provides examples of parameter sweeps.
- Moves testing/sleep_kernel.cuh -> nvbench/test_kernels.cuh
- Accessible to examples and provides some built-in kernels for users
to experiement with.
- Not included with `<nvbench/nvbench.cuh>`.
Also cleaned up the annoying quirk where `set_type_axes_names` *had*
to be called on all benchmarks with type axes.
Default names are {"T", "U", "V", "W"} for up-to four type axes. For
five or more, {"T0", "T1", ...} is used instead.
s/NVBENCH_CREATE/NVBENCH_BENCH/g
s/NVBENCH_BENCH_TEMPLATE/NVBENCH_BENCH_TYPES/g
This will fit nicer once the exec_tags version are added:
NVBENCH_BENCH
NVBENCH_BENCH_TYPES
NVBENCH_BENCH_FLAGS
NVBENCH_BENCH_TYPES_FLAGS
The old implementation was scattered and ad hoc. This one is slightly
less so.
More importantly, refactoring to this design will make it easier to
add device traversal.
- Add support for single values ("Axis=Value").
- Make other value specs shell friendly:
- Range: "Axis:(2:10:2)" -> "Axis=[2:10:2]"
- List: "Axis:{2,3,4,5}" -> "Axis=[2,3,4,5]"
- ":" -> "=" feels more natural
- "{}()" characters have special meaning in bash.
- "[]" character don't require escapes.
- Using the same braces for both ranges/list is easier to remember,
only the delimiter changes.