Oleksandr Pavlyk
e589518376
Change tests and examples from camelCase to snake_case to match the changed implementation
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
c960ef75cc
Add examples/cpu_only.py based on code from PR feedback
https://github.com/NVIDIA/nvbench/pull/237#issuecomment-3058594793
2025-07-28 15:37:05 -05:00
Oleksandr Pavlyk
203ef2046e
Add warm-up call to auto_throughput.py
Add throughput.py example, which is based on the same kernel as
auto_throughput.py but records the amounts of global memory read and
written, so that the BWUtil metric (bandwidth utilization as %SOL) is output.
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
8589511f61
Corrected broken cccl_parallel_segmented_reduce.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
394324023f
Add example for benchmarking CuPy function
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
707b24ffb5
Add examples/cccl_parallel_segmented_reduce.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
883e5819b6
Use cuda.Stream.from_handle to create core.Stream from nvbench.CudaStream
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
b357af0092
Add examples/skip.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
964ec2e1bc
Add examples/exec_tag_sync.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
4f15840832
Use state.add_summary to supplement the integral TypeID with a meaningful type name
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
df426a0bad
Add examples/axes.py
2025-07-28 15:37:04 -05:00
Oleksandr Pavlyk
2507bc2263
Add Python example based on C++ example/auto_throughput.cpp
2025-07-28 15:37:04 -05:00