545 Commits

Author SHA1 Message Date
Oleksandr Pavlyk
d160a2bafa Replace --run-once in testing/CMakeLists.txt with --profile 2025-07-28 12:03:42 -05:00
Oleksandr Pavlyk
8416342af0 Remove mentions of --run-once and --disable-blocking-kernel from help
The help text for --profile is now self-consistent: it no longer refers
to the removed --run-once and --disable-blocking-kernel options to explain
what it does.
2025-07-28 07:55:25 -05:00
Oleksandr Pavlyk
3bb34b1b1f Remove suggestion to use --disable-blocking-kernel
The text printed when the blocking kernel times out already suggests
using the --profile option.
2025-07-28 07:54:16 -05:00
Oleksandr Pavlyk
281a08a57e Remove CLI --run-once and --disable-blocking-kernel options
Removed option_parser::disable_blocking_kernel and option_parser::set_run_once
methods. Added option_parser::enable_profile method instead, which calls

```
bench.set_run_once(true);
bench.disable_blocking_kernel(true);
```
2025-07-28 07:51:50 -05:00
Bernhard Manfred Gruber
0c24f0250b Avoid cuda/std types in host compiler headers (#246)
Fixes: #245
2025-07-17 03:27:39 -07:00
pre-commit-ci[bot]
38ac5d7339 [pre-commit.ci] pre-commit autoupdate (#243) 2025-07-11 07:33:25 -04:00
Oleksandr Pavlyk
b8c664d22e Do not use blocking kernel in warmup run of measure_cold (#241)
See https://github.com/NVIDIA/nvbench/issues/240
2025-07-03 21:22:12 -07:00
Allard Hendriksen
53bf11a27d Fix axes metadata assert (#239) 2025-07-03 09:32:44 -04:00
Oleksandr Pavlyk
c463a783bb Allow kernel_generator to be stateful (#234)
In Python, the kernel generator is a user-defined callable.
We need to capture the Python object of that callable in the
kernel generator provided for each benchmark.

To this end, nvbench::benchmark has been modified to have a member of
kernel_generator type (which must be copy-constructible). The constructor takes
an optional parameter of type `kernel_generator`, whose default value
is a default-constructed instance.

nvbench::runner was modified to store a kernel_generator instance as well.
Its run method creates a fresh copy of the stored instance for each invocation,
just as it did before.

nvbench tests/examples pass with this change.
2025-06-28 19:17:12 -07:00
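
The copy-per-invocation behavior described in the commit above can be illustrated with a minimal sketch. This is not nvbench's code: `counting_generator` and `toy_runner` are hypothetical stand-ins, and the integer `captured` merely plays the role of a captured Python callable. The only point is that a stored, copy-constructible generator is copied before each run, so per-run state does not carry over.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stateful generator (not an nvbench type). `captured` stands in
// for state bound at benchmark construction; `calls` is per-run scratch state.
struct counting_generator
{
  explicit counting_generator(int c = 0) : captured(c), calls(0) {}

  int operator()()
  {
    ++calls;
    return captured + calls;
  }

  int captured;
  int calls;
};

// Simplified stand-in for a runner: stores one generator, copies it per run.
template <typename KernelGenerator>
class toy_runner
{
public:
  explicit toy_runner(KernelGenerator gen = KernelGenerator{})
      : m_gen(std::move(gen))
  {}

  int run()
  {
    KernelGenerator fresh = m_gen; // fresh copy: scratch state never leaks
    return fresh();
  }

private:
  KernelGenerator m_gen;
};

// Usage: every run starts from the same stored state.
inline void usage_example()
{
  toy_runner<counting_generator> runner{counting_generator{41}};
  assert(runner.run() == 42);
  assert(runner.run() == 42); // the stored instance was not mutated
}
```

Copying before each run keeps invocations independent while still letting the generator carry state set up once at construction, which is why the type must be copy-constructible.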
Oleksandr Pavlyk
c2a30cf0d2 Set underlying type for enum class exec_tag to uint16_t (#233)
This change reduces the size of an exec_tag instance from 4 bytes to 2 bytes. It also
makes explicit which underlying type exec_tag uses.
2025-06-28 18:03:25 -07:00
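
The size effect of fixing an enum's underlying type can be checked at compile time. The sketch below uses two hypothetical enums, `tag_default` and `tag_u16`, with illustrative enumerators rather than nvbench's actual `exec_tag` values. It relies only on the language rule that a scoped enum without an explicit underlying type uses `int`, which is 4 bytes on common platforms.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins; enumerator names are illustrative only.
enum class tag_default { none = 0, first = 1, second = 2 };
enum class tag_u16 : std::uint16_t { none = 0, first = 1, second = 2 };

// A scoped enum with no explicit underlying type uses int.
static_assert(std::is_same<std::underlying_type<tag_default>::type, int>::value,
              "default underlying type of a scoped enum is int");

// An explicit underlying type is used exactly as written.
static_assert(std::is_same<std::underlying_type<tag_u16>::type, std::uint16_t>::value,
              "explicit underlying type is honored");

// The enum's size matches the size of its underlying type.
static_assert(sizeof(tag_default) == sizeof(int), "same size as int");
static_assert(sizeof(tag_u16) == sizeof(std::uint16_t), "same size as uint16_t");
```

If any of these assumptions failed on a given toolchain, the translation unit would not compile, so the checks cost nothing at run time.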
Allison Piper
8e3e0ad117 Include RAPIDS.cmake to WAR network issues on CI. (#236)
See also https://github.com/rapidsai/rmm/pull/1886
2025-06-24 17:03:30 -04:00
Oleksandr Pavlyk
bc8319d5d9 Fix obvious typo in getter for device_manager singleton docstring (#232) 2025-06-13 10:03:54 -04:00
Oleksandr Pavlyk
b1551d2eb7 Update json and fmt projects to latest versions (#229) 2025-05-27 12:49:35 -04:00
Allison Piper
26f52a7175 Add cupti paths to INSTALL_RPATH. (#230) 2025-05-22 12:56:22 -04:00
Allison Piper
b62c0d9d78 Update youtube link URL. (#226) 2025-05-10 10:11:58 -04:00
Allison Piper
c5b8b3b494 Link GPU Mode talk from README. (#224) 2025-05-09 16:19:02 -04:00
Allison Piper
f44f5cc22c Remove min-time/max-noise API. (#223)
These are now owned by the stdrel stopping criterion, and should not be exposed directly in the benchmark/state/etc APIs.

This will affect users that are calling
`NVBENCH_BENCH(...).set_min_time(...)` or
`NVBENCH_BENCH(...).set_max_noise(...)`.

These can be updated to
`NVBENCH_BENCH(...).set_criterion_param_float64(["min-time"|"max-noise"], ...)`.
2025-05-08 10:02:54 -04:00
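
The migration described in the commit above might look roughly like the following. Since a real example needs nvbench itself, this sketch uses a hypothetical `bench_stub` class in place of the object returned by `NVBENCH_BENCH(...)`. Only the setter name and the `"min-time"` / `"max-noise"` keys come from the commit message. The storage, the `param` accessor, the `double` value type, and the numeric values are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the benchmark object; not nvbench's class.
class bench_stub
{
public:
  // Mirrors the setter named in the commit message. Returning *this keeps
  // the chained-call style of the removed setters.
  bench_stub &set_criterion_param_float64(const std::string &name, double value)
  {
    m_params[name] = value;
    return *this;
  }

  // Invented accessor, used only to inspect the stub in this sketch.
  double param(const std::string &name) const { return m_params.at(name); }

private:
  std::map<std::string, double> m_params;
};

// Before (removed API):  bench.set_min_time(1.0).set_max_noise(0.5);
// After: the same values, passed by parameter name.
inline bench_stub &configure(bench_stub &bench)
{
  return bench.set_criterion_param_float64("min-time", 1.0)
              .set_criterion_param_float64("max-noise", 0.5);
}

inline void usage_example()
{
  bench_stub bench;
  configure(bench);
  assert(bench.param("min-time") == 1.0);
  assert(bench.param("max-noise") == 0.5);
}
```

Keying the parameters by name lets the stopping criterion own their definitions, so the benchmark API does not need one dedicated setter per parameter.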
Allison Piper
a36e15f6ca Fix issues with default stopping params. (#221) 2025-05-07 11:01:36 -04:00
Allison Piper
249a74f73b Bump CI to CTK 12.9, regen devcontainers. (#219) 2025-05-02 12:05:50 -04:00
Allison Piper
9d189280de Fix get_config_count for CPU-only benchmarks. (#218) 2025-05-01 12:34:35 -04:00
Sergey Pavlov
433376fd83 Restrict stopping criterion parameter usage in command line (#174)
* restrict stopping criterion parameter usage in command line
* Update docs for stopping criterion.
* Add convenience benchmark_base API for criterion params.
* Add more test cases for stopping criterion parsing.

---------

Co-authored-by: Sergey Pavlov <psvvsp89@gmail.com>
Co-authored-by: Allison Piper <alliepiper16@gmail.com>
2025-04-30 15:53:45 -04:00
Elias Stehle
ca0e795b46 Merge pull request #113 from elstehle/fix/per-device-stream
Fixes cudaErrorInvalidValue when running on nvbench-created cuda stream
2025-04-30 15:40:33 -04:00
Allison Piper
4879607c70 Merge pull request #216 from alliepiper/disable_throttle_for_sync
Disable throttling when `sync` exec tag is used.
2025-04-24 19:02:39 -04:00
Allison Piper
e4057575c7 Disable throttling when sync exec tag is used. 2025-04-24 22:48:35 +00:00
Allison Piper
0573ffa9bd Merge pull request #214 from PointKernel/fix-throttle-setters
Fix throttle setter return values and update customization example
2025-04-24 13:53:20 -04:00
Yunsong Wang
dbd12f61b8 Revert example change 2025-04-24 10:12:46 -07:00
Allison Piper
2938a94d49 Merge pull request #215 from alliepiper/dynamic_throttle_delay
Dynamically increase recovery delay for consecutive discards.
2025-04-24 10:32:45 -04:00
Allison Piper
d12614b5cb Dynamically increase recovery delay for consecutive discards. 2025-04-24 14:11:31 +00:00
Yunsong Wang
797f91bc7e Update example to show how to customize throttle threshold 2025-04-23 14:10:16 -07:00
Yunsong Wang
31efce1ec8 Fix throttle setters 2025-04-23 14:01:56 -07:00
Allison Piper
89bec09b82 Merge pull request #207 from alliepiper/throttle_followup
Throttling followup
2025-04-18 08:48:41 -04:00
Allison Piper
46ab283d02 Merge pull request #213 from alliepiper/version_prefix_fix
Use the new(ish) PREFIX option of rapids-cmake version, git revision header utils.
2025-04-15 17:17:42 -04:00
Allison Piper
109449438b Use the new(ish) PREFIX option of rapids-cmake version, git revision header utils.
Generate macros prefixed with NVBENCH instead of redefining them from NVBench.
2025-04-15 20:33:06 +00:00
Allison Piper
eadb913322 Merge pull request #211 from alliepiper/clock_api
Fetch clock rates using cudaDeviceGetAttribute.
2025-04-14 17:12:42 -04:00
Allison Piper
0c56311174 Fetch clock rates using cudaDeviceGetAttribute. 2025-04-14 16:59:54 -04:00
Allison Piper
9bf5e987cf Merge branch 'main' into throttle_followup 2025-04-14 15:29:44 -04:00
Allison Piper
33fc77aabc Merge pull request #210 from alliepiper/vdc_update
Update verify-devcontainers workflow to match CCCL.
2025-04-14 14:50:20 -04:00
Allison Piper
457b9f1064 Update verify-devcontainers workflow to match CCCL.
This prevents us from spawning a ton of jobs unless the devcontainers actually change.
2025-04-14 14:37:40 -04:00
Allison Piper
965a80f730 Formatting. 2025-04-14 18:07:27 +00:00
Allison Piper
931888116c Merge branch 'main' into throttle_followup 2025-04-14 14:06:39 -04:00
Allison Piper
2c2f40a659 Merge pull request #209 from alliepiper/pre-commit-ci
Add pre-commit.ci configs, format.
2025-04-14 14:05:48 -04:00
Allison Piper
47bd2838da Remove stale devcontainer. 2025-04-14 17:50:18 +00:00
Allison Piper
4c38b2d5f7 Clang-format doesn't like the 1'000'000 separators. 2025-04-14 17:44:31 +00:00
Allison Piper
a3a2337e04 Merge pull request #208 from alliepiper/drop-support-for-11.8
Remove coverage for 11.8.
2025-04-14 13:41:54 -04:00
Allison Piper
8cefac8463 Update blame-ignore file. 2025-04-14 17:31:13 +00:00
Allison Piper
3440855dbd Formatting updates. 2025-04-14 17:26:12 +00:00
Allison Piper
de36f1a248 Add pre-commit.ci configs. 2025-04-14 12:23:44 -04:00
Allison Piper
b89c36a5c2 Remove coverage for 11.8.
We're going to be dropping these devcontainers soon in CCCL, and they're causing issues with our pre-commit hooks.
2025-04-14 16:03:49 +00:00
Allison Piper
7d5f04ec02 Show SM clock info in summaries example. 2025-04-14 11:37:48 -04:00
Allison Piper
f2011f2281 Add new hidden summary with percent SM clock scaling. 2025-04-14 11:37:20 -04:00