Sergey Pavlov
433376fd83
Restrict stopping criterion parameter usage in command line (#174)
...
* restrict stopping criterion parameter usage in command line
* Update docs for stopping criterion.
* Add convenience benchmark_base API for criterion params.
* Add more test cases for stopping criterion parsing.
---------
Co-authored-by: Sergey Pavlov <psvvsp89@gmail.com>
Co-authored-by: Allison Piper <alliepiper16@gmail.com>
2025-04-30 15:53:45 -04:00
Elias Stehle
ca0e795b46
Merge pull request #113 from elstehle/fix/per-device-stream
...
Fixes cudaErrorInvalidValue when running on nvbench-created cuda stream
2025-04-30 15:40:33 -04:00
Allison Piper
4879607c70
Merge pull request #216 from alliepiper/disable_throttle_for_sync
...
Disable throttling when `sync` exec tag is used.
2025-04-24 19:02:39 -04:00
Allison Piper
e4057575c7
Disable throttling when sync exec tag is used.
2025-04-24 22:48:35 +00:00
Allison Piper
0573ffa9bd
Merge pull request #214 from PointKernel/fix-throttle-setters
...
Fix throttle setter return values and update customization example
2025-04-24 13:53:20 -04:00
Yunsong Wang
dbd12f61b8
Revert example change
2025-04-24 10:12:46 -07:00
Allison Piper
2938a94d49
Merge pull request #215 from alliepiper/dynamic_throttle_delay
...
Dynamically increase recovery delay for consecutive discards.
2025-04-24 10:32:45 -04:00
Allison Piper
d12614b5cb
Dynamically increase recovery delay for consecutive discards.
2025-04-24 14:11:31 +00:00
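The commit above does not say how the recovery delay grows. As a minimal sketch, assuming a capped exponential backoff, the idea could look like the following. The function name `recovery_delay`, the 5 ms base, the doubling factor, and the 500 ms cap are all hypothetical and are not taken from NVBench.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical sketch, NOT NVBench's implementation. The base delay, the
// doubling factor, and the cap are assumptions chosen for illustration.
inline std::chrono::milliseconds
recovery_delay(int consecutive_discards,
               std::chrono::milliseconds base = std::chrono::milliseconds{5},
               std::chrono::milliseconds cap  = std::chrono::milliseconds{500})
{
  // The first discard waits `base`; each further consecutive discard doubles
  // the wait. Clamp the exponent so the shift below cannot overflow.
  const int exponent = std::clamp(consecutive_discards - 1, 0, 20);
  const std::chrono::milliseconds scaled{base.count() * (1LL << exponent)};

  // Never wait longer than the cap, however long the streak gets.
  return std::min(scaled, cap);
}
```

With these assumed defaults, one discard waits 5 ms, two wait 10 ms, three wait 20 ms, and long streaks saturate at 500 ms. The counter would be reset to zero once a measurement is accepted.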
Yunsong Wang
797f91bc7e
Update example to show how to customize throttle threshold
2025-04-23 14:10:16 -07:00
Yunsong Wang
31efce1ec8
Fix throttle setters
2025-04-23 14:01:56 -07:00
Allison Piper
89bec09b82
Merge pull request #207 from alliepiper/throttle_followup
...
Throttling followup
2025-04-18 08:48:41 -04:00
Allison Piper
46ab283d02
Merge pull request #213 from alliepiper/version_prefix_fix
...
Use the new(ish) PREFIX option of rapids-cmake version, git revision header utils.
2025-04-15 17:17:42 -04:00
Allison Piper
109449438b
Use the new(ish) PREFIX option of rapids-cmake version, git revision header utils.
...
Generate macros prefixed with NVBENCH instead of redefining them from NVBench.
2025-04-15 20:33:06 +00:00
Allison Piper
eadb913322
Merge pull request #211 from alliepiper/clock_api
...
Fetch clock rates using cudaDeviceGetAttribute.
2025-04-14 17:12:42 -04:00
Allison Piper
0c56311174
Fetch clock rates using cudaDeviceGetAttribute.
2025-04-14 16:59:54 -04:00
Allison Piper
9bf5e987cf
Merge branch 'main' into throttle_followup
2025-04-14 15:29:44 -04:00
Allison Piper
33fc77aabc
Merge pull request #210 from alliepiper/vdc_update
...
Update verify-devcontainers workflow to match CCCL.
2025-04-14 14:50:20 -04:00
Allison Piper
457b9f1064
Update verify-devcontainers workflow to match CCCL.
...
This prevents us from spawning a ton of jobs unless the devcontainers actually change.
2025-04-14 14:37:40 -04:00
Allison Piper
965a80f730
Formatting.
2025-04-14 18:07:27 +00:00
Allison Piper
931888116c
Merge branch 'main' into throttle_followup
2025-04-14 14:06:39 -04:00
Allison Piper
2c2f40a659
Merge pull request #209 from alliepiper/pre-commit-ci
...
Add pre-commit.ci configs, format.
2025-04-14 14:05:48 -04:00
Allison Piper
47bd2838da
Remove stale devcontainer.
2025-04-14 17:50:18 +00:00
Allison Piper
4c38b2d5f7
Clang-format doesn't like the 1'000'000 separators.
2025-04-14 17:44:31 +00:00
Allison Piper
a3a2337e04
Merge pull request #208 from alliepiper/drop-support-for-11.8
...
Remove coverage for 11.8.
2025-04-14 13:41:54 -04:00
Allison Piper
8cefac8463
Update blame-ignore file.
2025-04-14 17:31:13 +00:00
Allison Piper
3440855dbd
Formatting updates.
2025-04-14 17:26:12 +00:00
Allison Piper
de36f1a248
Add pre-commit.ci configs.
2025-04-14 12:23:44 -04:00
Allison Piper
b89c36a5c2
Remove coverage for 11.8.
...
We're going to be dropping these devcontainers soon in CCCL, and they're causing issues with our pre-commit hooks.
2025-04-14 16:03:49 +00:00
Allison Piper
7d5f04ec02
Show SM clock info in summaries example.
2025-04-14 11:37:48 -04:00
Allison Piper
f2011f2281
Add new hidden summary with percent SM clock scaling.
2025-04-14 11:37:20 -04:00
Allison Piper
e0a486b03b
Reduce memory usage of clock rate logging.
2025-04-14 11:35:27 -04:00
Allison Piper
18926ced87
Replace references to peak_sm_clock with default_sm_clock.
...
The actual measured clock speed can exceed this value, so default is less confusing than peak.
2025-04-14 11:33:04 -04:00
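The relationship between the measured and default SM clocks described above can be sketched as a simple ratio. The helper names and the 0.75 threshold below are hypothetical and are not part of NVBench's API. The sketch only shows why a percentage above 100 is possible and how a threshold on that ratio could flag throttling.

```cpp
// Hypothetical helper, not part of NVBench's API: expresses the measured SM
// clock as a percentage of the device's default SM clock.
inline double percent_sm_clock(double measured_hz, double default_hz)
{
  return default_hz > 0.0 ? 100.0 * measured_hz / default_hz : 0.0;
}

// Assumed policy for illustration: a measurement counts as throttled when the
// measured clock falls below `threshold`, given as a fraction of the default.
inline bool is_throttled(double measured_hz,
                         double default_hz,
                         double threshold = 0.75)
{
  return measured_hz < threshold * default_hz;
}
```

For example, a device with a 1.5 GHz default clock that boosts to 1.8 GHz reports more than 100%, which is why "default" reads better than "peak". The same device measured at 1.0 GHz would fall under the assumed 75% threshold.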
Allison Piper
87dd03254f
Merge pull request #206 from gevtushenko/throttle
...
Discard measurements while GPU is throttling
2025-04-14 10:57:33 -04:00
Georgy Evtushenko
254ac2517f
Remove discard on throttle option
2025-04-12 21:13:13 -07:00
Georgy Evtushenko
b926daf09f
Better throttle recovery delay
2025-04-12 21:04:12 -07:00
Georgy Evtushenko
5c0d674757
Fix overflow in default clock rate
2025-04-11 15:44:11 -07:00
Georgy Evtushenko
2ba2d1131d
Report mean SM clock rate
2025-04-11 15:33:57 -07:00
Georgy Evtushenko
f29f7ac2fb
Detect throttle
...
Signed-off-by: Georgy Evtushenko <evtushenko.georgy@gmail.com>
2025-04-11 14:35:40 -07:00
Allison Piper
36adf3a210
Merge pull request #204 from alliepiper/summaries
...
Add min/max timings, new "summaries" example.
2025-04-08 17:51:36 -04:00
Allison Piper
2ba8acd4ea
Add example that demonstrates how to add/remove columns from the markdown table.
2025-04-08 21:14:21 +00:00
Allison Piper
94fde7777c
Clean up summary code, add min/max times summaries.
2025-04-08 19:15:25 +00:00
Allison Piper
beca2c0038
Merge pull request #203 from alliepiper/exec_tag_cleanup
...
Clean up unnecessary exec_tags.
2025-04-08 13:35:34 -04:00
Allison Piper
35360614ed
Remove run_once exec_tag.
...
Similar to `no_block`, this is a runtime variable that doesn't need to be encoded statically.
It was not exposed publicly and existed solely as an implementation detail of `state::exec`, introducing unnecessary complexity there.
2025-04-08 17:15:58 +00:00
Allison Piper
851d7aadd0
Make blocking kernel use a runtime option.
...
It's not worth instantiating multiple instances of the measurement class to handle this.
Since there's already a runtime option to disable the blocking kernel, the current implementation by default will instantiate both the blocking and non-blocking versions of the algorithm for dynamic dispatch.
2025-04-08 17:15:58 +00:00
Allison Piper
52028be94f
Merge pull request #201 from alliepiper/cpu_only
...
Add cpu-only benchmarking support.
2025-04-08 11:39:30 -04:00
Allison Piper
a6df59a9b5
Add support for CPU-only benchmarking.
...
Fixes #95.
CPU-only mode is enabled by setting the `is_cpu_only` property while
defining a benchmark, e.g. `NVBENCH_BENCH(foo).set_is_cpu_only(true)`.
An optional `nvbench::exec_tag::no_gpu` hint can also be passed to
`state.exec` to avoid instantiating GPU benchmarking backends. Note that
a CUDA compiler and CUDA runtime are always required, even if all benchmarks
in a translation unit are CPU-only.
Similarly, a new `nvbench::exec_tag::gpu` hint can be used to avoid
compiling CPU-only backends for GPU benchmarks.
2025-04-08 11:17:23 -04:00
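To illustrate how an exec tag hint like those in the CPU-only commit above can keep a backend from being instantiated, here is a self-contained sketch of compile-time tag dispatch. It is not NVBench code. The `sketch` namespace, the tag types, and the `run_*_backend` functions are hypothetical stand-ins for the real `nvbench::exec_tag` values and measurement classes.

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace sketch
{
// Hypothetical stand-ins for exec tag hints (the real ones live in
// nvbench::exec_tag and are not reproduced here).
struct no_gpu_t {};  // hint: this benchmark never needs the GPU backend
struct gpu_t {};     // hint: this benchmark never needs the CPU-only backend
struct default_t {}; // no hint: decide at runtime

// Hypothetical backends. Each is a template, so it is only instantiated for a
// given callable if some non-discarded branch actually calls it.
template <typename Fn>
std::string run_cpu_backend(Fn &&fn) { std::forward<Fn>(fn)(); return "cpu"; }

template <typename Fn>
std::string run_gpu_backend(Fn &&fn) { std::forward<Fn>(fn)(); return "gpu"; }

template <typename Tag, typename Fn>
std::string exec(Tag, [[maybe_unused]] bool is_cpu_only, Fn &&fn)
{
  if constexpr (std::is_same_v<Tag, no_gpu_t>)
  {
    // The GPU branch is discarded at compile time for this tag.
    return run_cpu_backend(std::forward<Fn>(fn));
  }
  else if constexpr (std::is_same_v<Tag, gpu_t>)
  {
    // The CPU-only branch is discarded at compile time for this tag.
    return run_gpu_backend(std::forward<Fn>(fn));
  }
  else
  {
    // Without a hint, both backends are instantiated and one is chosen at
    // runtime from the benchmark's is_cpu_only property.
    return is_cpu_only ? run_cpu_backend(std::forward<Fn>(fn))
                       : run_gpu_backend(std::forward<Fn>(fn));
  }
}
} // namespace sketch
```

In this sketch the hint changes only what gets compiled, not which benchmarks are treated as CPU-only: the `is_cpu_only` flag still drives the runtime choice when no hint is given.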
Allison Piper
1efed5f8e1
Merge pull request #200 from alliepiper/update_deps
...
Update dependencies, drop support for old compilers and MSVC.
2025-04-04 18:45:59 -04:00
Allison Piper
93ea533fd3
Drop support for MSVC.
2025-04-04 22:17:03 +00:00
Allison Piper
1d0daa52ae
Add skip-vdc option to CI.
2025-04-04 17:44:33 -04:00
Allison Piper
7d210614f5
Attempt to suppress system include warnings on MSVC.
2025-04-04 17:44:33 -04:00