mirror of
https://github.com/NVIDIA/nvbench.git
synced 2026-03-14 20:27:24 +00:00
Add - State.get_stopping_criterion() -> str - Benchmark.set_stopping_criterion(criterion: str) -> Self - Benchmark.set_criterion_param_int64(name: str, value: int) -> Self - Benchmark.set_criterion_param_float64(name: str, value: float) -> Self - Benchmark.set_criterion_param_string(name: str, value: str) -> Self - Benchmark.set_timeout(duration: float) -> Self - Benchmark.set_skip_time(skip_time: float) -> Self - Benchmark.set_throttle_threshold(frac: float) -> Self - Benchmark.set_throttle_recovery_delay(duration: float) -> Self - Benchmark.set_min_samples(count: int) -> Self
CUDA Kernel Benchmarking Package
This package provides Python API to CUDA Kernel Benchmarking Library NVBench.
Building
Ensure recent version of CMake
Since nvbench requires a rather new version of CMake (>=3.30.4), either build CMake from sources, or create a conda environment with a recent version of CMake, using
conda create -n build_env --yes cmake ninja
conda activate build_env
Ensure CUDA compiler
Since building NVBench library requires CUDA compiler, ensure that appropriate environment variables
are set. For example, assuming CUDA toolkit is installedsystem-wide, and assuming Ampere GPU architecture:
export CUDACXX=/usr/local/cuda/bin/nvcc
export CUDAARCHS=86
``
### Build Python project
Now switch to python folder, configure and install NVBench library, and install the package in editable mode:
```bash
cd nvbench/python
pip install -e .
Verify that package works
python test/run_1.py
Run examples
# Example benchmarking numba.cuda kernel
python examples/throughput.py
# Example benchmarking kernels authored using cuda.core
python examples/axes.py
# Example benchmarking algorithms from cuda.cccl.parallel
python examples/cccl_parallel_segmented_reduce.py
# Example benchmarking CuPy function
python examples/cupy_extract.py