nvbench/python
Oleksandr Pavlyk ca1d60610c Use robust summaries in nvbench_compare classification
Teach nvbench_compare to parse GPU timing summaries into structured values and
prefer the robust median/IQR summaries when both compared measurements provide
them. Fall back to the existing mean/stdev summaries when robust summaries are
not available.
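
The preference order described above can be sketched roughly as follows. This is a minimal illustration, not the nvbench_compare code: the function name `pick_summary` and the dictionary keys (`median`, `iqr`, `mean`, `stdev`) are hypothetical stand-ins for whatever summary tags the tool actually parses.

```python
from typing import Optional, Tuple

# Hypothetical summary keys; the real tags used by nvbench_compare may differ.
ROBUST = ("median", "iqr")
CLASSIC = ("mean", "stdev")


def pick_summary(ref: dict, cmp: dict) -> Optional[Tuple[str, str]]:
    """Return the (center, spread) key pair that BOTH measurements provide.

    The robust pair is tried first; mean/stdev is the fallback.
    """
    for keys in (ROBUST, CLASSIC):
        if all(k in ref and k in cmp for k in keys):
            return keys
    return None  # neither pair is available in both measurements
```

The robust pair is only chosen when both sides carry it, so a comparison never mixes a median from one run with a mean from the other.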

Classify comparisons with the larger available relative noise estimate instead
of the smaller one, keep unavailable noise distinct from encoded infinite noise,
and report improvements separately from regressions. Keep the process exit code
as success for completed comparisons; regression counts are reported in the
summary instead of being used as the process status.
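
One way to read the classification rule above is the following sketch. It is an assumption-laden illustration rather than the actual implementation: the function name `classify`, the result labels, and the convention that `None` means "noise unavailable" while `math.inf` means "noise recorded as infinite" are all hypothetical.

```python
import math
from typing import Optional


def classify(ref_center: float, cmp_center: float,
             ref_noise: Optional[float], cmp_noise: Optional[float]) -> str:
    """Classify a comparison; noise values are relative (fractions of the center).

    None means the noise estimate is unavailable; math.inf means it was
    recorded as infinite. The two cases are kept distinct.
    """
    if not (math.isfinite(ref_center) and math.isfinite(cmp_center)) or ref_center == 0:
        return "unknown"  # non-finite or zero centers cannot be compared
    available = [n for n in (ref_noise, cmp_noise) if n is not None]
    if not available:
        return "unknown"  # no noise estimate at all
    noise = max(available)  # larger estimate gives the more conservative threshold
    rel_diff = (cmp_center - ref_center) / ref_center
    if abs(rel_diff) <= noise:
        return "same"
    return "improvement" if rel_diff < 0 else "regression"
```

With this convention, an infinite noise estimate always yields "same", whereas missing noise on both sides yields "unknown", which keeps the two situations distinguishable in a report.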

Make plotting tolerate unavailable noise by leaving gaps in confidence bands,
sort plotted series by the plotted axis, and avoid reusing pyplot state across
plot calls.

Add focused Python tests for robust-summary preference, unavailable-noise
classification, non-finite timing centers, plot-along handling when the selected
axis is absent, and the exit-code contract.
2026-06-02 11:47:47 -05:00

CUDA Kernel Benchmarking Package

This package provides a Python API to the CUDA Kernel Benchmarking Library NVBench.

Installation

Install from PyPI

pip install cuda-bench[cu13]  # For CUDA 13.x
pip install cuda-bench[cu12]  # For CUDA 12.x

Building from source

Ensure a recent version of CMake

NVBench requires a recent version of CMake (>=3.30.4). Either build CMake from source, or create a conda environment that provides a recent CMake:

conda create -n build_env --yes  cmake ninja
conda activate build_env
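
To confirm that the active environment meets the version requirement, a small check along these lines may help. It is a sketch only: the helper names `parse_cmake_version` and `cmake_is_recent_enough` are made up for this example, and it assumes `cmake --version` prints a line of the form `cmake version X.Y.Z`.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

MIN_CMAKE = (3, 30, 4)  # minimum version stated in this README


def parse_cmake_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def cmake_is_recent_enough(text: str) -> bool:
    """True if the reported version is at least MIN_CMAKE."""
    version = parse_cmake_version(text)
    return version is not None and version >= MIN_CMAKE


if __name__ == "__main__":
    cmake = shutil.which("cmake")
    if cmake is None:
        print("cmake not found on PATH")
    else:
        out = subprocess.run([cmake, "--version"],
                             capture_output=True, text=True).stdout
        print("OK" if cmake_is_recent_enough(out)
              else "CMake is too old or its version was not recognized")
```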

Ensure a CUDA compiler is available

Building the NVBench library requires a CUDA compiler, so make sure the appropriate environment variables are set. For example, assuming the CUDA toolkit is installed system-wide and the target GPU has the Ampere architecture (compute capability 8.6):

export CUDACXX=/usr/local/cuda/bin/nvcc
export CUDAARCHS=86
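
Before starting the build, the two variables above can be sanity-checked with a short script such as the one below. This is a hedged sketch: `check_cuda_env` is a hypothetical helper, and it only checks that the variables are set and that `CUDACXX` resolves to an executable, not that the compiler or the architecture value is actually suitable.

```python
import os
import shutil
from typing import List, Mapping


def check_cuda_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems with the CUDA build environment (empty if none)."""
    problems = []
    cudacxx = env.get("CUDACXX")
    if not cudacxx:
        problems.append("CUDACXX is not set")
    elif shutil.which(cudacxx) is None:
        # shutil.which accepts both a full path and a bare command name
        problems.append(f"CUDACXX does not point to an executable: {cudacxx}")
    if not env.get("CUDAARCHS"):
        problems.append("CUDAARCHS is not set")
    return problems


if __name__ == "__main__":
    for problem in check_cuda_env(os.environ):
        print(problem)
```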

Build Python project

Now switch to the python folder, then configure and install the NVBench library and install the package in editable mode:

cd nvbench/python
pip install -e .

Verify that the package works

python test/run_1.py
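
As an additional quick check that does not need a GPU, the installed distribution can be looked up through the standard `importlib.metadata` module. The sketch below assumes the distribution is registered under the PyPI name `cuda-bench` used above; an editable install may report a different name, and `installed_version` is a hypothetical helper.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "cuda-bench") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"cuda-bench {version}" if version else "cuda-bench is not installed")
```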

Run examples

# Example benchmarking a numba.cuda kernel
python examples/throughput.py
# Example benchmarking kernels authored using cuda.core
python examples/axes.py
# Example benchmarking algorithms from cuda.cccl.parallel
python examples/cccl_parallel_segmented_reduce.py
# Example benchmarking a CuPy function
python examples/cupy_extract.py