Commit Graph

654 Commits

Author SHA1 Message Date
Yunsong Wang
3a9c80d33b Updates 2025-09-29 12:38:12 -07:00
Yunsong Wang
8af0aa38d5 Updates 2025-09-29 12:22:07 -07:00
Yunsong Wang
df7abef849 Update best_practices.md 2025-09-19 13:03:48 -07:00
Yunsong Wang
b95acb80d4 Update best_practices.md 2025-09-19 12:59:20 -07:00
Yunsong Wang
54cbcd0c42 Update best_practices.md 2025-09-19 12:58:37 -07:00
Yunsong Wang
df27dcb5df Update best_practices.md 2025-09-19 12:55:40 -07:00
Yunsong Wang
0b25b91223 Update best_practices.md 2025-09-19 12:53:30 -07:00
Yunsong Wang
63f884d666 Update best_practices.md 2025-09-19 12:46:30 -07:00
Yunsong Wang
87aa856b4d Update best_practices.md 2025-09-19 12:41:37 -07:00
Yunsong Wang
be0cda88a3 Update best_practices.md 2025-09-19 12:38:44 -07:00
Yunsong Wang
af9773df09 Update best_practices.md 2025-09-19 12:37:04 -07:00
Yunsong Wang
2da0b8d6b3 Create best_practices.md 2025-09-19 12:18:35 -07:00
Oleksandr Pavlyk
b88a45f417 Merge pull request #269 from jayavenkatesh19/main
remove pynvjitlink references in examples
2025-09-17 13:54:36 -05:00
Jaya Venkatesh
0f997271f7 added numba-cuda to requirements
Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
2025-09-16 14:54:08 -07:00
Jaya Venkatesh
bfa6a6c7c6 remove pynvjitlink references in examples
Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
2025-09-08 16:00:19 -07:00
Allison Piper
4642df7006 Fix sccache checks when running locally. (#268) 2025-09-05 15:50:09 -04:00
Allison Piper
33a659ecd3 Add CTK 13.0 + Clang20 to CI. (#266) 2025-09-03 11:24:07 -04:00
Allison Piper
ebc1bd1795 Avoid unreachable code warning (#265) 2025-09-02 22:03:39 -04:00
Oleksandr Pavlyk
935bb0b633 Merge pull request #237 from oleksandr-pavlyk/add-pynvbench
Python package pynvbench introduced that exposes `cuda.bench` namespace. Repository provides a set of examples.
2025-08-06 12:22:55 -05:00
Oleksandr Pavlyk
b5e4b4ba31 cuda.nvbench -> cuda.bench
Per PR review suggestion:
   - `cuda.parallel`    - device-wide algorithms/Thrust
   - `cuda.cooperative` - Cooperative algorithms/CUB
   - `cuda.bench`       - Benchmarking/NVBench
2025-08-04 13:42:43 -05:00
Oleksandr Pavlyk
c2a2acc9b6 Change float64_t arg-type for set_throttle_threshold to float32_t
This matches the C++ method signatures of set_throttle_threshold/set_throttle_recovery_delay,
which use nvbench::float32_t
2025-08-04 12:14:52 -05:00
Oleksandr Pavlyk
584f48ac97 Remove warm-up invocations outside of launcher in examples/throughput and auto_throughput 2025-08-04 12:14:44 -05:00
Oleksandr Pavlyk
d8b0acc8d4 Export exception to nvbench namespace 2025-08-04 12:00:42 -05:00
Oleksandr Pavlyk
9dfdd8af89 Minimal test file 2025-08-04 11:59:17 -05:00
Oleksandr Pavlyk
6aff4712f8 Change permissions of test/run_1.py 2025-08-04 10:13:08 -05:00
Oleksandr Pavlyk
73e18419b2 Stub of __cuda_stream__ special method declares tuple[int, int] as return type
This is to indicate that the special method always returns a pair of integers
2025-08-04 10:11:33 -05:00
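The stub change above can be illustrated with a small typing sketch. It is not taken from the repository: `SupportsCudaStream`, `FakeStream`, and `stream_handle` are hypothetical names, and the reading of the two integers as a protocol version followed by a stream handle is an assumption, since the commit only states that a pair of integers is returned.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsCudaStream(Protocol):
    """Hypothetical protocol: any object exposing __cuda_stream__."""

    def __cuda_stream__(self) -> tuple[int, int]: ...


class FakeStream:
    """Stand-in object for illustration; it does not wrap a real CUDA stream."""

    def __init__(self, handle: int) -> None:
        self._handle = handle

    def __cuda_stream__(self) -> tuple[int, int]:
        # Assumed layout: (protocol version, stream handle as an integer).
        return (0, self._handle)


def stream_handle(obj: SupportsCudaStream) -> int:
    """Return the second element of the pair reported by __cuda_stream__."""
    _version, handle = obj.__cuda_stream__()
    return handle


stream = FakeStream(handle=0)
```

Declaring the return type as `tuple[int, int]` rather than a bare `tuple` lets a static checker flag callers that unpack the wrong number of elements.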
Oleksandr Pavlyk
a5e0a48f80 Add test functions for cpp/python exceptions 2025-08-04 10:09:10 -05:00
Oleksandr Pavlyk
40a2337a6b Review fix: make nvbench_run_error constructible
Allow `throw nvbench_run_error("Msg");` to compile.

Add comment around definition of nvbench_run_error
2025-08-04 10:09:04 -05:00
Oleksandr Pavlyk
4fc628c4d7 Python native extension to use CXX/CUDA standard of NVBench library
This fixes cryptic build failure with GNU compiler 14
2025-08-01 15:33:39 -05:00
Oleksandr Pavlyk
3fea652d16 Fix type in stub declaration for Benchmark.add_string_axis 2025-08-01 15:03:06 -05:00
Oleksandr Pavlyk
fa8dd48186 json_printer.cu changed to use write-out buffer of 4KB (#259)
* json_printer.cu changed to use write-out buffer of 4KB

The json_printer::do_process_bulk_data_float64 used to write
out one float32 value at a time. This PR introduces a buffer of 4KB
that is being filled with values until full, and then written out.

The 4KB value aligns with system memory page size and seems
appropriate for relatively small datasizes of duration measurements.

* Add explicit static cast from std::size_t to std::streamsize

The explicit cast avoids a narrowing error.

* Factor out writing array out to binary file into standalone function

This function is templated based on buffer size. The function can be
reused to also write out frequency samples in the future.
2025-08-01 12:48:25 -07:00
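The buffering idea described in the commit above can be sketched in Python. This is an illustration only, not the C++ code in json_printer.cu: `write_float32_buffered` and `BUFFER_BYTES` are hypothetical names, and the little-endian float32 layout is an assumption made so the example is self-contained.

```python
import io
import struct

BUFFER_BYTES = 4096  # 4 KB, the buffer size named in the commit message


def write_float32_buffered(values, stream, buffer_bytes=BUFFER_BYTES):
    """Write values as little-endian float32, flushing a fixed-size buffer when full.

    Returns the number of write calls issued on the stream.
    """
    item_size = struct.calcsize("<f")       # 4 bytes per float32
    capacity = buffer_bytes // item_size    # values that fit in one buffer
    buffer = bytearray()
    writes = 0
    for value in values:
        buffer += struct.pack("<f", value)
        if len(buffer) == capacity * item_size:
            stream.write(buffer)            # buffer is full: one write call
            writes += 1
            buffer.clear()
    if buffer:                              # flush the partially filled tail
        stream.write(buffer)
        writes += 1
    return writes


out = io.BytesIO()
n_writes = write_float32_buffered([0.5] * 2500, out)
```

With 2500 values of 4 bytes each, the sketch issues three write calls (two full 4096-byte buffers and one 1808-byte tail) instead of 2500 single-value writes.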
Oleksandr Pavlyk
080052a564 nvbench::state::set_stopping_criterion now also sets criterion params (#257)
This change closes gh-255 by aligning the implementation of
state::set_stopping_criterion with that of
benchmark_base::set_stopping_criterion.
2025-08-01 11:46:36 -07:00
Oleksandr Pavlyk
f1fbfd85b4 Renamed src/README.md to src/.BUILD_LOCALLY.md
Provided more context for the command stated in the readme, and
changed it so as not to hard-code the installation path of NVBench
or the checkout path of pybind11.
2025-07-31 16:27:54 -05:00
Oleksandr Pavlyk
453a1648aa Improvements to readability of examples per PR review 2025-07-31 16:20:52 -05:00
Oleksandr Pavlyk
c91204f259 Improved docstrings per PR review suggestions 2025-07-31 15:48:49 -05:00
Oleksandr Pavlyk
fb23591aef Fixed missing space in README 2025-07-31 15:42:30 -05:00
Oleksandr Pavlyk
f12edf722b Merge pull request #256 from oleksandr-pavlyk/fix-typo-in-comment
Fix typo: updaring->updating
2025-07-31 09:27:31 -05:00
Oleksandr Pavlyk
add539a0c1 Replaced argument type annotation: int -> typing.SupportsInt
Same for float->typing.SupportsFloat. Result types remain int/float
2025-07-30 16:54:56 -05:00
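A minimal sketch of what the annotation change above means for callers, using only the standard `typing` module. `Extent` and `normalize_count` are hypothetical names introduced for illustration and are not part of the package.

```python
import typing


class Extent:
    """Hypothetical integer-like type: it is not an int but defines __int__."""

    def __init__(self, n: int) -> None:
        self._n = n

    def __int__(self) -> int:
        return self._n


def normalize_count(count: typing.SupportsInt) -> int:
    """Accept anything convertible via int(); always return a plain int."""
    return int(count)


result = normalize_count(Extent(8))
```

The argument annotation is permissive (any object with `__int__` passes type checking), while the result annotation stays the concrete `int`, mirroring the split the commit message describes.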
Oleksandr Pavlyk
a341c30d60 Fix typo: updaring->updating 2025-07-30 13:56:24 -05:00
Oleksandr Pavlyk
88a3ad0138 Add test/stub.py
The following static analysis run should run green

```
mypy --ignore-missing-imports test/stub.py
```
2025-07-30 13:54:37 -05:00
Oleksandr Pavlyk
9c01f229a6 Add Benchmark set methods, such as set_stopping_criterion, set_timeout, etc
Add
   - State.get_stopping_criterion() -> str
   - Benchmark.set_stopping_criterion(criterion: str) -> Self
   - Benchmark.set_criterion_param_int64(name: str, value: int) -> Self
   - Benchmark.set_criterion_param_float64(name: str, value: float) -> Self
   - Benchmark.set_criterion_param_string(name: str, value: str) -> Self
   - Benchmark.set_timeout(duration: float) -> Self
   - Benchmark.set_skip_time(skip_time: float) -> Self
   - Benchmark.set_throttle_threshold(frac: float) -> Self
   - Benchmark.set_throttle_recovery_delay(duration: float) -> Self
   - Benchmark.set_min_samples(count: int) -> Self
2025-07-30 13:37:17 -05:00
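Because the setters listed above return `Self`, calls can be chained. The sketch below shows that pattern with a stand-in class; `BenchmarkSketch` is hypothetical, stores its settings in a plain dictionary, and does not reproduce the behavior of the real `Benchmark` class. The criterion name and numeric values are illustrative.

```python
class BenchmarkSketch:
    """Hypothetical stand-in showing setters that return the instance itself."""

    def __init__(self) -> None:
        self.settings: dict = {}

    def set_stopping_criterion(self, criterion: str) -> "BenchmarkSketch":
        self.settings["stopping_criterion"] = criterion
        return self  # returning self is what enables chaining

    def set_timeout(self, duration: float) -> "BenchmarkSketch":
        self.settings["timeout"] = float(duration)
        return self

    def set_min_samples(self, count: int) -> "BenchmarkSketch":
        self.settings["min_samples"] = int(count)
        return self


bench = (
    BenchmarkSketch()
    .set_stopping_criterion("entropy")  # illustrative criterion name
    .set_timeout(15.0)
    .set_min_samples(10)
)
```

Each call mutates and returns the same object, so the chained form is equivalent to three separate statements on `bench`.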
Oleksandr Pavlyk
6b9050e404 Add example of benchmarking pytorch code 2025-07-28 15:57:11 -05:00
Oleksandr Pavlyk
413c4a114b Support nvbench.State.set_throttle_threshold 2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
afb9951ed8 Enable building of NVBench as part of building the extension
1. Download and include CPM.cmake, version 0.42.0
2. Use CPM.cmake to get Pybind11
3. Update to use pybind11=3.0.0
4. Also use CPM to configure/build nvbench
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
b6821b7624 Rename NVBenchRuntimeException to NVBenchRuntimeError
Added exception to __init__.pyi
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
b97e27cbf2 Add use of add_axis_values and add_axis_values_as_string to test/run_1.py 2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
eb614ac52f Add State.get_axis_values and State.get_axis_values_as_string
Add nvbench.State methods to get a Python dictionary representing
the axis values of the benchmark configuration that the state represents.

get_axis_values_as_string gives a string of space-separated
name=value pairs.
2025-07-28 15:37:05 -05:00
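The relationship between the two accessors above can be sketched as a dictionary and its string rendering. `axis_values_as_string` is a hypothetical helper, the axis names are made up, and the exact formatting produced by the library is not confirmed by the commit message beyond "space-separated name=value pairs".

```python
def axis_values_as_string(axis_values: dict) -> str:
    """Render a mapping of axis names to values as space-separated name=value pairs."""
    return " ".join(f"{name}={value}" for name, value in axis_values.items())


# Illustrative axis values; the names are not taken from the repository.
axis_values = {"Stride": 4, "Dtype": "float32"}
rendered = axis_values_as_string(axis_values)
```

Here `rendered` is `"Stride=4 Dtype=float32"`, with pairs in dictionary insertion order.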
Oleksandr Pavlyk
5613281c2e nvbench.State.exec validates arg to be a callable
Add names to method arguments to make it more self-descriptive.
2025-07-28 15:37:05 -05:00
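The argument validation described above can be sketched with the built-in `callable()` check. `exec_sketch` and `launcher_fn` are hypothetical names, and raising `TypeError` is an assumption: the commit message does not say which exception the extension raises.

```python
def exec_sketch(launcher_fn):
    """Hypothetical stand-in: reject non-callable arguments before invoking them."""
    if not callable(launcher_fn):
        # Assumed exception type; the real extension may raise something else.
        raise TypeError(
            f"launcher_fn must be callable, got {type(launcher_fn).__name__}"
        )
    return launcher_fn()


calls = []
exec_sketch(lambda: calls.append("ran"))

try:
    exec_sketch(42)
    rejected = False
except TypeError:
    rejected = True
```

Validating up front means a bad argument fails with a message naming the parameter, rather than with an error raised deep inside the measurement loop.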
Oleksandr Pavlyk
c747a19b98 Remove code setting up CUDA_MODULE_LOADING=EAGER in Python extension 2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
985db4f144 Add examples/cccl_cooperative_block_reduce.py 2025-07-28 15:37:05 -05:00