Oleksandr Pavlyk
e589518376
Change test and examples from camelCase to snake_case, following the implementation change
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
81fff085b9
Change method naming from camelCase to snake_case
...
This ensures names of Python API methods are consistent with those of C++
counterparts.
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
11ae98389d
Replace use of py::object copy constructor with use of move constructor
...
Change explicit constructor of benchmark_wrapper_t to use move-constructor
of py::object instead of copy constructor by replacing `py::object(o)` with
`py::object(std::move(o))`.
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
d3071fb038
Addressed PR feedback re: definition of benchmark_wrapper_t
...
See https://github.com/NVIDIA/nvbench/pull/237#discussion_r2183749750
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
c960ef75cc
Add examples/cpu_only.py based on code from PR feedback
...
https://github.com/NVIDIA/nvbench/pull/237#issuecomment-3058594793
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
6b4da8c5cb
Add comments to body of launcher_fn lambda in State.exec method
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
aa2b4d9960
Add Benchmark.setIsCPUOnly API
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
7f9d672cec
Raise Python exception if error is encountered while executing benchmarks
...
Introduce new exception type to raise on errors that occurred while
NVBench runs benchmarks.
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
8c112d529f
Include Pybind11 headers before anything else
...
See https://github.com/NVIDIA/nvbench/pull/237#discussion_r2183703828
for the rationale
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
6b1b2f3c30
Updated readme
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
203ef2046e
Add warm-up call to auto_throughput.py
...
Add throughput.py example, which is based on the same kernel as
auto_throughput.py but records the amounts of global memory reads and
writes in order to output the BWUtil metric, which measures %SOL in
bandwidth utilization.
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
02ad6e5490
Implement Benchmark.setName
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
8589511f61
Corrected broken cccl_parallel_segmented_reduce.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
394324023f
Add example for benchmarking CuPy function
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
707b24ffb5
Add examples/cccl_parallel_segmented_reduce.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
883e5819b6
Use cuda.Stream.from_handle to create core.Stream from nvbench.CudaStream
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
b357af0092
Add examples/skip.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
964ec2e1bc
Add examples/exec_tag_sync.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
4f15840832
Use state.add_summary to supplement integral TypeID with meaningful type name
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
9dba866426
Add State.add_summary method
...
state.add_summary(column_name: str, value: Union[int, float, str])
This is used in examples/axes.py to map an integral value from Int64Axis
to a string description.
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
df426a0bad
Add examples/axes.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
576c473481
Add implementation of and signature for State.getDevice
...
Make batch/sync arguments of State.exec keyword-only.
Provide default column_name value for State.addElementCount method,
so that it can be called as state.addElementCount(count) or as
state.addElementCount(count, column_name="Descriptive Name")
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
2507bc2263
Add Python example based on C++ example/auto_throughput.cpp
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
4950a50961
Add empty py.typed to signal mypy that package has type annotations
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
c9f0785aed
Replace uses of deprecated typing.Tuple, typing.Callable, etc.
...
Also use typing.Self to encode that `Benchmark.addInt64Axis` returns
self.
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
6f8bcdc774
Fixed correctness of nvbench.State.getStream() method
...
Fix run-time exception:
```
Fail: Unexpected error: RuntimeError: return_value_policy = copy, but type is non-copyable! (#define PYBIND11_DETAILED_ERROR_MESSAGES or compile in debug mode for details)
```
caused by an attempt to return a move-only `nvbench::cuda_stream` class
instance using the default `pybind11::return_value_policy::copy`.
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
e768ce28b6
Add Python stub file for cuda.nvbench API
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
c49d718f65
Corrected nvbench.State.getBlockingKernel -> getBlockingKernelTimeout
...
Similar change for setBlockingKernelTimeout.
Corrected statement in a comment.
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
c184549cda
Import and reexport symbols from _nvbench one-by-one
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
b88cc78aeb
Add license header to py_nvbench.cpp
...
Also updated comment as to why calling
`nvbench::benchmark_manager::get().initialize()` is necessary
for running all tests.
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
6552ef503c
Draft of Python API for NVBench
...
The prototype is based on pybind11 to minimize boilerplate
code needed to deal with move-only semantics of many nvbench
classes.
2025-07-28 15:37:04 -05:00