The option sets the `m_skip_batched` boolean member in the `benchmark_base` class.
The methods `bool get_skip_batched()` and `void set_skip_batched(bool)` are added.
`m_skip_batched` is also added to the `state` class, along with similarly
named methods.
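As a rough illustration of the accessor pair described above, the sketch below uses simplified stand-in classes rather than the real NVBench types. That `state` copies the flag from its benchmark at construction, and the `const` qualifier on the getters, are assumptions made for this example.

```cpp
// Simplified stand-ins for illustration only; not the real NVBench classes.
class benchmark_base
{
public:
  bool get_skip_batched() const { return m_skip_batched; }
  void set_skip_batched(bool v) { m_skip_batched = v; }

private:
  bool m_skip_batched{false};
};

class state
{
public:
  // Assumption: a state starts out with its benchmark's setting.
  explicit state(const benchmark_base &bench)
      : m_skip_batched{bench.get_skip_batched()}
  {}

  bool get_skip_batched() const { return m_skip_batched; }
  void set_skip_batched(bool v) { m_skip_batched = v; }

private:
  bool m_skip_batched;
};
```

In this sketch, changing the flag on a `state` affects only that state and leaves the benchmark-wide default untouched.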
The CLI help file documents the `--no-batched` option.
The text for `--profile` is modified to be self-contained, i.e., it no
longer refers to the removed `--run-once` and `--disable-blocking-kernel`
options to explain what it does.
Locking clocks is currently only implemented for Volta+ devices.
Example usage:
my_bench -d [0,1,3] --persistence-mode 1 --lock-gpu-clocks base
See the cli_help.md docs for more info.
Fixes #10.
Adds a mode that forces a benchmark to run only once, simplifying
profiling use cases. This can be enabled by any of the following methods:
* Passing `--run-once` on the command line
* Calling `NVBENCH_CREATE(...).set_run_once(true)` when declaring a benchmark
* Calling `state.set_run_once(true)` from within the benchmark implementation
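To make the effect of the flag concrete, here is a minimal sketch of a measurement loop that honors a run-once setting. The `measure` function, its signature, and the sample-count stopping rule are hypothetical and only illustrate the idea; they are not NVBench's actual measurement code.

```cpp
#include <functional>

// Hypothetical measurement loop; the name, signature, and stopping rule are
// assumptions for illustration. Returns how many times `launch` was invoked.
int measure(bool run_once, int min_samples, const std::function<void()> &launch)
{
  int launches = 0;
  do
  {
    launch();
    ++launches;
    if (run_once)
    {
      break; // A single launch keeps the profiler trace small.
    }
  } while (launches < min_samples);
  return launches;
}
```

With `run_once` set, the launcher is invoked exactly once regardless of `min_samples`; otherwise the loop keeps going until the requested sample count is reached.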
Human-readable outputs (md) and CLI inputs still use percentages.
In-memory and machine-readable outputs (csv, json) use ratios.
Ratios are the convention that spreadsheet apps expect. Fixes #2.
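The split between percentages and ratios can be sketched as below. The helper names and the chosen precision are hypothetical and exist only for this example; the sketch assumes the in-memory value is always a ratio and that conversion happens at the human-facing boundaries.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helpers; the names and precision are assumptions.

// CLI input: the user types a percentage, the value is stored as a ratio.
double parse_percent(const std::string &text)
{
  return std::stod(text) / 100.0;
}

// Markdown output: the stored ratio is shown as a percentage.
std::string format_percent(double ratio)
{
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%.2f%%", ratio * 100.0);
  return buf;
}

// CSV/JSON output: the stored ratio is written unchanged.
std::string format_ratio(double ratio)
{
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%g", ratio);
  return buf;
}
```

For instance, a stored ratio of `0.05` would appear as `5.00%` in a markdown table and as `0.05` in a CSV cell, which a spreadsheet can then display with its own percent formatting.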