Restrict stopping criterion parameter usage in command line (#174)

* restrict stopping criterion parameter usage in command line
* Update docs for stopping criterion.
* Add convenience benchmark_base API for criterion params.
* Add more test cases for stopping criterion parsing.

---------

Co-authored-by: Sergey Pavlov <psvvsp89@gmail.com>
Co-authored-by: Allison Piper <alliepiper16@gmail.com>
Author: Sergey Pavlov
Date: 2025-04-30 23:53:45 +04:00
Committed by: GitHub
Parent: ca0e795b46
Commit: 433376fd83
9 changed files with 482 additions and 88 deletions


@@ -83,36 +83,6 @@
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--min-samples <count>`
* Gather at least `<count>` samples per measurement.
* Default is 10 samples.
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--stopping-criterion <criterion>`
* After `--min-samples` is satisfied, use `<criterion>` to detect if enough
samples were collected.
* Only applies to Cold measurements.
* Default is stdrel (`--stopping-criterion stdrel`)
* `--min-time <seconds>`
* Accumulate at least `<seconds>` of execution time per measurement.
* Only applies to `stdrel` stopping criterion.
* Default is 0.5 seconds.
* If both GPU and CPU times are gathered, this applies to GPU time only.
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--max-noise <value>`
* Gather samples until the error in the measurement drops below `<value>`.
* Noise is specified as the percent relative standard deviation.
* Default is 0.5% (`--max-noise 0.5`)
* Only applies to `stdrel` stopping criterion.
* Only applies to Cold measurements.
* If both GPU and CPU times are gathered, this applies to GPU noise only.
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--skip-time <seconds>`
* Skip a measurement when a warmup run executes in less than `<seconds>`.
* Default is -1 seconds (disabled).
@@ -123,16 +93,6 @@
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--timeout <seconds>`
* Measurements will timeout after `<seconds>` have elapsed.
* Default is 15 seconds.
* `<seconds>` is walltime, not accumulated sample time.
* If a measurement times out, the default markdown log will print a warning to
report any outstanding termination criteria (min samples, min time, max
noise).
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--throttle-threshold <value>`
* Set the GPU throttle threshold as percentage of the device's default clock rate.
* Default is 75.
@@ -166,3 +126,68 @@
* Intended for use with external profiling tools.
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
## Stopping Criteria
* `--timeout <seconds>`
* Measurements will timeout after `<seconds>` have elapsed.
* Default is 15 seconds.
* `<seconds>` is walltime, not accumulated sample time.
* If a measurement times out, the default markdown log will print a warning to
report any outstanding termination criteria (min samples, min time, max
noise).
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--min-samples <count>`
* Gather at least `<count>` samples per measurement before checking any
other stopping criterion besides the timeout.
* Default is 10 samples.
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--stopping-criterion <criterion>`
* After `--min-samples` is satisfied, use `<criterion>` to detect if enough
samples were collected.
* Only applies to Cold and CPU-only measurements.
* If both GPU and CPU times are gathered, GPU time is used for stopping
analysis.
* Stopping criteria provided by NVBench are:
* "stdrel": (default) Converges to a minimal relative standard deviation,
stdev / mean
* "entropy": Converges based on the cumulative entropy of all samples.
* Each stopping criterion may provide additional parameters to customize
behavior, as detailed below:
### "stdrel" Stopping Criterion Parameters
* `--min-time <seconds>`
* Accumulate at least `<seconds>` of execution time per measurement.
* Only applies to `stdrel` stopping criterion.
* Default is 0.5 seconds.
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--max-noise <value>`
* Gather samples until the error in the measurement drops below `<value>`.
* Noise is specified as the percent relative standard deviation (stdev/mean).
* Default is 0.5% (`--max-noise 0.5`).
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
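The two `stdrel` parameters gate sampling on the same quantity: measurement continues until at least `--min-time` of execution has accumulated and the relative standard deviation of the samples has dropped below `--max-noise`. A minimal self-contained sketch of the noise check (illustrative only, not NVBench's implementation; `rel_stdev` and `stdrel_converged` are hypothetical names):

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Relative standard deviation (stdev / mean) of a set of timing samples.
double rel_stdev(const std::vector<double> &samples)
{
  const double mean = std::accumulate(samples.begin(), samples.end(), 0.0) /
                      static_cast<double>(samples.size());
  double sq_sum = 0.0;
  for (double s : samples)
  {
    sq_sum += (s - mean) * (s - mean);
  }
  const double stdev = std::sqrt(sq_sum / static_cast<double>(samples.size()));
  return stdev / mean;
}

// stdrel-style stopping check: stop once noise falls below the threshold.
// `max_noise` is a ratio (0.005 for the CLI default of 0.5%).
bool stdrel_converged(const std::vector<double> &samples, double max_noise)
{
  return rel_stdev(samples) < max_noise;
}
```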
### "entropy" Stopping Criterion Parameters
* `--max-angle <value>`
* Maximum linear regression angle of cumulative entropy.
* Smaller values give more accurate results.
* Default is 0.048.
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
* `--min-r2 <value>`
* Minimum coefficient of determination for linear regression of cumulative
entropy.
* Larger values give more accurate results.
* Default is 0.36.
* Applies to the most recent `--benchmark`, or all benchmarks if specified
before any `--benchmark` arguments.
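Taken together, `--max-angle` and `--min-r2` test whether a straight line fitted to the cumulative-entropy series is both nearly flat and a reasonable fit. A self-contained sketch of that kind of check (one plausible reading of the criterion; `linear_fit` and `entropy_converged` are hypothetical names, not NVBench's implementation):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct fit_result
{
  double slope;
  double r2; // coefficient of determination of the fit
};

// Least-squares fit of y over x = 0, 1, 2, ...; assumes at least two points.
fit_result linear_fit(const std::vector<double> &y)
{
  const double n = static_cast<double>(y.size());
  double sx = 0, sy = 0, sxx = 0, sxy = 0, syy = 0;
  for (std::size_t i = 0; i < y.size(); ++i)
  {
    const double x = static_cast<double>(i);
    sx += x;
    sy += y[i];
    sxx += x * x;
    sxy += x * y[i];
    syy += y[i] * y[i];
  }
  const double cov  = n * sxy - sx * sy;
  const double varx = n * sxx - sx * sx;
  const double vary = n * syy - sy * sy;
  // A perfectly constant series fits a flat line exactly.
  const double r2 = (vary == 0.0) ? 1.0 : (cov * cov) / (varx * vary);
  return {cov / varx, r2};
}

// Entropy-style stopping check: the cumulative-entropy trend line must be
// nearly flat (small regression angle) and well fit (large enough R^2).
bool entropy_converged(const std::vector<double> &cum_entropy,
                       double max_angle = 0.048,
                       double min_r2   = 0.36)
{
  const fit_result f = linear_fit(cum_entropy);
  return std::fabs(std::atan(f.slope)) < max_angle && f.r2 >= min_r2;
}
```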


@@ -16,7 +16,6 @@ set(example_srcs
add_custom_target(nvbench.example.all)
add_dependencies(nvbench.all nvbench.example.all)
function (nvbench_add_examples_target target_prefix cuda_std)
add_custom_target(${target_prefix}.all)
add_dependencies(nvbench.example.all ${target_prefix}.all)
@@ -29,9 +28,15 @@ function (nvbench_add_examples_target target_prefix cuda_std)
target_include_directories(${example_name} PRIVATE "${CMAKE_CURRENT_LIST_DIR}")
target_link_libraries(${example_name} PRIVATE nvbench::main)
set_target_properties(${example_name} PROPERTIES COMPILE_FEATURES cuda_std_${cuda_std})
set(example_args --timeout 0.1)
# The custom_criterion example doesn't support the --min-time argument:
if (NOT "${example_src}" STREQUAL "custom_criterion.cu")
list(APPEND example_args --min-time 1e-5)
endif()
add_test(NAME ${example_name}
COMMAND "$<TARGET_FILE:${example_name}>" --timeout 0.1 --min-time 1e-5
)
COMMAND "$<TARGET_FILE:${example_name}>" ${example_args})
# These should not deadlock. If they do, it may be that the CUDA context was created before
# setting CUDA_MODULE_LOAD=EAGER in main, see NVIDIA/nvbench#136.


@@ -266,22 +266,53 @@ struct benchmark_base
return *this;
}
/// Control the stopping criterion for the measurement loop.
/// @{
[[nodiscard]] const std::string &get_stopping_criterion() const { return m_stopping_criterion; }
benchmark_base &set_stopping_criterion(std::string criterion);
/// @}
[[nodiscard]] bool has_criterion_param(const std::string &name) const
{
return m_criterion_params.has_value(name);
}
[[nodiscard]] nvbench::int64_t get_criterion_param_int64(const std::string &name) const
{
return m_criterion_params.get_int64(name);
}
benchmark_base &set_criterion_param_int64(const std::string &name, nvbench::int64_t value)
{
m_criterion_params.set_int64(name, value);
return *this;
}
[[nodiscard]] nvbench::float64_t get_criterion_param_float64(const std::string &name) const
{
return m_criterion_params.get_float64(name);
}
benchmark_base &set_criterion_param_float64(const std::string &name, nvbench::float64_t value)
{
m_criterion_params.set_float64(name, value);
return *this;
}
[[nodiscard]] std::string get_criterion_param_string(const std::string &name) const
{
return m_criterion_params.get_string(name);
}
benchmark_base &set_criterion_param_string(const std::string &name, std::string value)
{
m_criterion_params.set_string(name, std::move(value));
return *this;
}
[[nodiscard]] nvbench::criterion_params &get_criterion_params() { return m_criterion_params; }
[[nodiscard]] const nvbench::criterion_params &get_criterion_params() const
{
return m_criterion_params;
}
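The typed `get_/set_criterion_param_*` convenience accessors above forward to a name-keyed, type-tagged value store. A minimal self-contained sketch of that pattern (`param_store` is a hypothetical stand-in; the real store is `nvbench::criterion_params`):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>

// Each named parameter holds exactly one of the supported value types.
class param_store
{
  using value_t = std::variant<long long, double, std::string>;
  std::unordered_map<std::string, value_t> m_values;

public:
  bool has_value(const std::string &name) const
  {
    return m_values.count(name) != 0;
  }

  void set_int64(const std::string &name, long long v) { m_values[name] = v; }
  void set_float64(const std::string &name, double v) { m_values[name] = v; }
  void set_string(const std::string &name, std::string v)
  {
    m_values[name] = std::move(v);
  }

  double get_float64(const std::string &name) const
  {
    return std::get<double>(this->at(name));
  }

private:
  // Lookup with a descriptive error instead of operator[]'s silent insert.
  const value_t &at(const std::string &name) const
  {
    auto it = m_values.find(name);
    if (it == m_values.end())
    {
      throw std::invalid_argument("No parameter named '" + name + "'");
    }
    return it->second;
  }
};
```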
protected:
friend struct nvbench::runner_base;


@@ -17,6 +17,7 @@
*/
#include <nvbench/benchmark_base.cuh>
#include <nvbench/criterion_manager.cuh>
#include <nvbench/detail/transform_reduce.cuh>
namespace nvbench
@@ -88,4 +89,11 @@ std::size_t benchmark_base::get_config_count() const
return per_device_count * m_devices.size();
}
benchmark_base &benchmark_base::set_stopping_criterion(std::string criterion)
{
m_stopping_criterion = std::move(criterion);
m_criterion_params = criterion_manager::get().get_criterion(m_stopping_criterion).get_params();
return *this;
}
} // namespace nvbench


@@ -50,6 +50,9 @@ public:
using params_description = std::vector<std::pair<std::string, nvbench::named_values::type>>;
params_description get_params_description() const;
using params_map = std::unordered_map<std::string, params_description>;
params_map get_params_description_map() const;
};
/**


@@ -104,4 +104,23 @@ nvbench::criterion_manager::params_description criterion_manager::get_params_des
return desc;
}
criterion_manager::params_map criterion_manager::get_params_description_map() const
{
params_map result;
for (auto &[criterion_name, criterion] : m_map)
{
params_description &desc = result[criterion_name];
nvbench::criterion_params params = criterion->get_params();
for (auto param : params.get_names())
{
nvbench::named_values::type type = params.get_type(param);
desc.emplace_back(param, type);
}
}
return result;
}
} // namespace nvbench


@@ -30,6 +30,7 @@
#include <algorithm>
#include <chrono>
#include <limits>
#include <optional>
#include <thread>
namespace nvbench::detail
@@ -387,11 +388,22 @@ void measure_cold_base::generate_summaries()
if (m_max_time_exceeded)
{
const auto timeout = m_walltime_timer.get_duration();
auto get_param = [this](std::optional<nvbench::float64_t> &param, const std::string &name) {
if (m_criterion_params.has_value(name))
{
param = m_criterion_params.get_float64(name);
}
};
std::optional<nvbench::float64_t> max_noise;
get_param(max_noise, "max-noise");
std::optional<nvbench::float64_t> min_time;
get_param(min_time, "min-time");
if (max_noise && cuda_noise > *max_noise)
{
printer.log(nvbench::log_level::warn,
fmt::format("Current measurement timed out ({:0.2f}s) "
@@ -399,7 +411,7 @@ void measure_cold_base::generate_summaries()
"{:0.2f}%)",
timeout,
cuda_noise * 100,
*max_noise * 100));
}
if (m_total_samples < m_min_samples)
{
@@ -410,7 +422,7 @@ void measure_cold_base::generate_summaries()
m_total_samples,
m_min_samples));
}
if (min_time && m_total_cuda_time < *min_time)
{
printer.log(nvbench::log_level::warn,
fmt::format("Current measurement timed out ({:0.2f}s) "
@@ -418,7 +430,7 @@ void measure_cold_base::generate_summaries()
"{:0.2f}s)",
timeout,
m_total_cuda_time,
*min_time));
}
}


@@ -39,6 +39,7 @@
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <exception>
#include <fstream>
#include <iostream>
#include <iterator>
@@ -82,11 +83,35 @@ std::string_view submatch_to_sv(const sv_submatch &in)
//
// So we're stuck with materializing a std::string and calling std::stoX(). Ah
// well. At least it's not istream.
void parse(std::string_view input, nvbench::int32_t &val)
try
{
val = std::stoi(std::string(input));
}
catch (const std::exception &)
{ // The default exception messages are not very useful on gcc; it's just "stoi".
NVBENCH_THROW(std::invalid_argument, "Failed to parse int32 value from string '{}'", input);
}
void parse(std::string_view input, nvbench::int64_t &val)
try
{
val = std::stoll(std::string(input));
}
catch (const std::exception &)
{
NVBENCH_THROW(std::invalid_argument, "Failed to parse int64 value from string '{}'", input);
}
void parse(std::string_view input, nvbench::float64_t &val)
try
{
val = std::stod(std::string(input));
}
catch (const std::exception &)
{
NVBENCH_THROW(std::invalid_argument, "Failed to parse float64 value from string '{}'", input);
}
void parse(std::string_view input, std::string &val) { val = input; }
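Each numeric `parse` overload above uses a function-try-block so a bad token surfaces as a descriptive `invalid_argument` rather than gcc's bare "stoi" message. The pattern in isolation (a hypothetical free-function variant, not the overloads themselves):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <string_view>

// A function-try-block wraps the entire body, so both std::invalid_argument
// and std::out_of_range from std::stoi are caught and rethrown with a
// message that names the offending token.
int parse_int32(std::string_view input)
try
{
  return std::stoi(std::string(input));
}
catch (const std::exception &)
{
  throw std::invalid_argument("Failed to parse int32 value from string '" +
                              std::string(input) + "'");
}
```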
@@ -727,6 +752,7 @@ void option_parser::enable_run_once()
}
void option_parser::set_stopping_criterion(const std::string &criterion)
try
{
// If no active benchmark, save args as global.
if (m_benchmarks.empty())
@@ -739,6 +765,13 @@ void option_parser::set_stopping_criterion(const std::string &criterion)
benchmark_base &bench = *m_benchmarks.back();
bench.set_stopping_criterion(criterion);
}
catch (std::exception &e)
{
NVBENCH_THROW(std::runtime_error,
"Error handling option `--stopping-criterion {}`:\n{}",
criterion,
e.what());
}
void option_parser::disable_blocking_kernel()
{
@@ -983,17 +1016,39 @@ void option_parser::update_criterion_prop(const std::string &prop_arg,
const nvbench::named_values::type type)
try
{
const std::string name(prop_arg.begin() + 2, prop_arg.end());
// If no active benchmark, save args as global.
if (m_benchmarks.empty())
{
// Any global params must either belong to the default criterion or follow a
// `--stopping-criterion` arg:
nvbench::criterion_params params;
if (!params.has_value(name) &&
std::find(m_global_benchmark_args.cbegin(),
m_global_benchmark_args.cend(),
"--stopping-criterion") == m_global_benchmark_args.cend())
{
NVBENCH_THROW(std::runtime_error,
"Unrecognized stopping criterion parameter: `{}` for default criterion.",
name);
}
m_global_benchmark_args.push_back(prop_arg);
m_global_benchmark_args.push_back(prop_val);
return;
}
benchmark_base &bench = *m_benchmarks.back();
if (!bench.has_criterion_param(name))
{
NVBENCH_THROW(std::runtime_error,
"Unrecognized stopping criterion parameter: `{}` for `{}`.",
name,
bench.get_stopping_criterion());
}
if (type == nvbench::named_values::type::float64)
{
nvbench::float64_t value{};
@@ -1003,21 +1058,21 @@ try
{ // Specified as percentage, stored as ratio:
value /= 100.0;
}
bench.set_criterion_param_float64(name, value);
}
else if (type == nvbench::named_values::type::int64)
{
nvbench::int64_t value{};
::parse(prop_val, value);
bench.set_criterion_param_int64(name, value);
}
else if (type == nvbench::named_values::type::string)
{
bench.set_criterion_param_string(name, prop_val);
}
else
{
NVBENCH_THROW(std::runtime_error, "Unrecognized type for property: `{}`", name);
}
}
catch (std::exception &e)

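One subtle step in `update_criterion_prop` above: float parameters given as percentages on the command line are divided by 100 before storage, so the stored value is a ratio directly comparable with stdev / mean. In isolation (hypothetical helper name):

```cpp
#include <cassert>
#include <string>

// "--max-noise 0.5" means 0.5 percent on the command line, but the stored
// criterion parameter is a ratio, so the parsed value is divided by 100.
double noise_percent_to_ratio(const std::string &cli_value)
{
  return std::stod(cli_value) / 100.0;
}
```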

@@ -1197,26 +1197,262 @@ void test_timeout()
void test_stopping_criterion()
{
{ // Per benchmark criterion
nvbench::option_parser parser;
parser.parse({
"--benchmark",
"DummyBench",
"--stopping-criterion",
"entropy",
"--max-angle",
"0.42",
"--min-r2",
"0.6",
});
const auto &states = parser_to_states(parser);
ASSERT(states.size() == 1);
ASSERT(states[0].get_stopping_criterion() == "entropy");
const nvbench::criterion_params &criterion_params = states[0].get_criterion_params();
ASSERT(criterion_params.has_value("max-angle"));
ASSERT(criterion_params.has_value("min-r2"));
ASSERT(criterion_params.get_float64("max-angle") == 0.42);
ASSERT(criterion_params.get_float64("min-r2") == 0.6);
ASSERT(criterion_params.get_float64("max-angle") == 0.42);
ASSERT(criterion_params.get_float64("min-r2") == 0.6);
}
{ // Global criterion
nvbench::option_parser parser;
parser.parse({
"--stopping-criterion",
"entropy",
"--max-angle",
"0.42",
"--min-r2",
"0.6",
"--benchmark",
"DummyBench",
});
const auto &states = parser_to_states(parser);
ASSERT(states.size() == 1);
ASSERT(states[0].get_stopping_criterion() == "entropy");
const nvbench::criterion_params &criterion_params = states[0].get_criterion_params();
ASSERT(criterion_params.has_value("max-angle"));
ASSERT(criterion_params.has_value("min-r2"));
ASSERT(criterion_params.get_float64("max-angle") == 0.42);
ASSERT(criterion_params.get_float64("min-r2") == 0.6);
}
{ // Global criterion, per-benchmark params
nvbench::option_parser parser;
parser.parse({
"--stopping-criterion",
"entropy",
"--benchmark",
"DummyBench",
"--max-angle",
"0.42",
"--min-r2",
"0.6",
});
const auto &states = parser_to_states(parser);
ASSERT(states.size() == 1);
ASSERT(states[0].get_stopping_criterion() == "entropy");
const nvbench::criterion_params &criterion_params = states[0].get_criterion_params();
ASSERT(criterion_params.has_value("max-angle"));
ASSERT(criterion_params.has_value("min-r2"));
ASSERT(criterion_params.get_float64("max-angle") == 0.42);
ASSERT(criterion_params.get_float64("min-r2") == 0.6);
}
{ // Global params to default criterion should work:
nvbench::option_parser parser;
parser.parse({
"--max-noise",
"0.5",
"--min-time",
"0.1",
"--benchmark",
"DummyBench",
"--stopping-criterion",
"entropy",
"--max-angle",
"0.42",
"--min-r2",
"0.6",
});
const auto &states = parser_to_states(parser);
ASSERT(states.size() == 1);
ASSERT(states[0].get_stopping_criterion() == "entropy");
const nvbench::criterion_params &criterion_params = states[0].get_criterion_params();
ASSERT(criterion_params.has_value("max-angle"));
ASSERT(criterion_params.has_value("min-r2"));
ASSERT(criterion_params.get_float64("max-angle") == 0.42);
ASSERT(criterion_params.get_float64("min-r2") == 0.6);
}
{ // Unknown stopping criterion should throw
bool exception_thrown = false;
try
{
nvbench::option_parser parser;
parser.parse({
"--benchmark",
"DummyBench",
"--stopping-criterion",
"I_do_not_exist",
});
}
catch (const std::runtime_error &)
{
exception_thrown = true;
}
ASSERT(exception_thrown);
}
{ // Global criterion to non-default params without global --stopping-criterion should throw
bool exception_thrown = false;
try
{
nvbench::option_parser parser;
parser.parse({
"--max-angle",
"0.42",
"--min-r2",
"0.6",
"--benchmark",
"DummyBench",
"--stopping-criterion",
"entropy",
});
}
catch (const std::runtime_error &)
{
exception_thrown = true;
}
ASSERT(exception_thrown);
}
{ // Invalid global param throws exception:
bool exception_thrown = false;
try
{
nvbench::option_parser parser;
parser.parse({
"--max-angle",
"0.42",
"--benchmark",
"DummyBench",
"--stopping-criterion",
"entropy",
"--min-r2",
"0.6",
"--max-angle",
"0.42",
"--benchmark",
"TestBench",
"--stopping-criterion",
"stdrel",
});
}
catch (const std::runtime_error & /*ex*/)
{
exception_thrown = true;
}
ASSERT(exception_thrown);
}
{ // Invalid per-bench param throws exception:
bool exception_thrown = false;
try
{
nvbench::option_parser parser;
parser.parse({
"--benchmark",
"DummyBench",
"--stopping-criterion",
"entropy",
"--min-r2",
"0.6",
"--max-angle",
"0.42",
"--benchmark",
"TestBench",
"--stopping-criterion",
"stdrel",
"--max-angle",
"0.42",
});
}
catch (const std::runtime_error & /*ex*/)
{
exception_thrown = true;
}
ASSERT(exception_thrown);
}
{ // global param-before-criterion throws exception:
bool exception_thrown = false;
try
{
nvbench::option_parser parser;
parser.parse({
"--min-r2", //
"0.6",
"--stopping-criterion",
"entropy",
"--benchmark",
"DummyBench",
});
}
catch (const std::runtime_error & /*ex*/)
{
exception_thrown = true;
}
ASSERT(exception_thrown);
}
{ // per-benchmark param-before-criterion throws exception:
bool exception_thrown = false;
try
{
nvbench::option_parser parser;
parser.parse({
"--benchmark", //
"DummyBench",
"--min-r2",
"0.6",
"--stopping-criterion",
"entropy",
});
}
catch (const std::runtime_error & /*ex*/)
{
exception_thrown = true;
}
ASSERT(exception_thrown);
}
{ // Invalid param type throws exception:
bool exception_thrown = false;
try
{
nvbench::option_parser parser;
parser.parse({
"--benchmark", //
"DummyBench",
"--stopping-criterion",
"entropy",
"--min-r2",
"\"foo\"",
});
}
catch (const std::runtime_error &)
{
exception_thrown = true;
}
ASSERT(exception_thrown);
}
}
int main()
@@ -1261,6 +1497,6 @@ try
}
catch (std::exception &err)
{
fmt::print(stderr, "{}", err.what());
fmt::print(stderr, "Unexpected exception:\n{}\n", err.what());
return 1;
}