CUDA Kernel Benchmarking Package
This package provides a Python API to the CUDA Kernel Benchmarking
Library NVBench.
Installation
Install from PyPI
pip install cuda-bench[cu13] # For CUDA 13.x
pip install cuda-bench[cu12] # For CUDA 12.x
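The choice between the two extras above can be scripted when installs are automated. The following is a minimal sketch: the helper name `pip_requirement_for_cuda` is hypothetical and not part of the package, and it assumes that only the `cu12` and `cu13` extras listed above exist.

```python
def pip_requirement_for_cuda(cuda_version: str) -> str:
    """Map a CUDA version string such as "12.4" to a pip requirement.

    Hypothetical helper: it only knows the two extras listed in this README.
    """
    major = int(cuda_version.split(".")[0])
    if major not in (12, 13):
        raise ValueError(f"no cuda-bench extra listed for CUDA {cuda_version}")
    return f"cuda-bench[cu{major}]"


if __name__ == "__main__":
    # Print the requirement string to pass to `pip install`.
    print(pip_requirement_for_cuda("12.4"))
```

The returned string can be passed to `pip install` as is. Quote it in shells that treat square brackets specially.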
Building from source
Ensure a recent version of CMake
NVBench requires a recent version of CMake (>=3.30.4). Either build CMake from source, or create a conda environment that provides a recent CMake:
conda create -n build_env --yes cmake ninja
conda activate build_env
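To confirm that the active environment provides a new enough CMake, the version reported by `cmake --version` can be compared against the 3.30.4 minimum. The following is a minimal sketch. The function names are illustrative, and it assumes the first line of the output has the usual `cmake version X.Y.Z` form.

```python
import re
import shutil
import subprocess

# Minimum CMake version stated in this README.
MIN_CMAKE = (3, 30, 4)


def parse_cmake_version(output: str) -> tuple:
    """Extract (major, minor, patch) from `cmake --version` output."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("could not find a CMake version in the output")
    return tuple(int(part) for part in match.groups())


def cmake_is_recent_enough(output: str, minimum: tuple = MIN_CMAKE) -> bool:
    """Return True if the reported version is at least `minimum`."""
    return parse_cmake_version(output) >= minimum


if __name__ == "__main__":
    cmake = shutil.which("cmake")
    if cmake is None:
        print("cmake not found on PATH")
    else:
        result = subprocess.run([cmake, "--version"], capture_output=True, text=True)
        ok = cmake_is_recent_enough(result.stdout)
        print("CMake is recent enough" if ok else "CMake is too old")
```

Tuples compare element by element, so `(3, 31, 0) >= (3, 30, 4)` holds without any string handling.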
Ensure a CUDA compiler is available
Building the NVBench library requires a CUDA compiler, so make sure the appropriate environment variables are set. For example, assuming the CUDA toolkit is installed system-wide and the target is an Ampere GPU (compute capability 8.6):
export CUDACXX=/usr/local/cuda/bin/nvcc
export CUDAARCHS=86
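Before starting the build, it can help to check that both variables are set and that `CUDACXX` points to an executable. The following is a minimal sketch. The function name `check_cuda_build_env` is hypothetical, and the checks are limited to the two variables shown above.

```python
import os
import shutil


def check_cuda_build_env(env) -> list:
    """Return a list of problems found in the CUDA build environment.

    `env` is any mapping of environment variables, such as os.environ.
    """
    problems = []

    cudacxx = env.get("CUDACXX")
    if not cudacxx:
        problems.append("CUDACXX is not set")
    elif shutil.which(cudacxx) is None:
        # shutil.which also accepts a full path and checks that it is executable.
        problems.append(f"CUDACXX does not point to an executable: {cudacxx}")

    if not env.get("CUDAARCHS"):
        problems.append("CUDAARCHS is not set")

    return problems


if __name__ == "__main__":
    issues = check_cuda_build_env(os.environ)
    for issue in issues:
        print(issue)
    if not issues:
        print("CUDA build environment looks usable")
```

An empty list means both variables are present and the compiler path resolves. It does not guarantee that the compiler version is compatible with NVBench.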
Build Python project
Now switch to the python folder, then configure and install the NVBench library and install the package in editable mode:
cd nvbench/python
pip install -e .
Verify that the package works
python test/run_1.py
Run examples
# Example benchmarking a numba.cuda kernel
python examples/throughput.py
# Example benchmarking kernels authored using cuda.core
python examples/axes.py
# Example benchmarking algorithms from cuda.cccl.parallel
python examples/cccl_parallel_segmented_reduce.py
# Example benchmarking a CuPy function
python examples/cupy_extract.py