mirror of https://github.com/NVIDIA/nvbench.git synced 2026-07-01 11:47:33 +00:00

Files

Oleksandr Pavlyk b34dfbb348 Explicitly handle unavailable timings in nvbench-compare

Treat matched states with unusable timing data as UNKNOWN instead of
dropping them from the comparison. This includes missing, non-finite, or
non-positive timing centers, skipped states, and states with missing GPU
timing summaries.

Add explicit reason codes for these cases so the summary points users at
the underlying data issue. Preserve available timing data from the other
side when only one side is missing, and render unavailable durations as
n/a in all display modes.

Also sort values returned by np.unique_counts before nearest-neighbor
coverage checks so the distance algorithm receives ordered inputs.

Add regression coverage for UNKNOWN counting, skipped states, missing
summaries, unavailable center formatting, and the updated coverage helper.

2026-06-30 06:40:44 -05:00
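
The handling described in the commit message above can be illustrated with a minimal sketch. This is not the code from nvbench-compare: the function names and the reason-code strings (`SKIPPED`, `MISSING`, `NON_FINITE`, `NON_POSITIVE`) are hypothetical, chosen only to mirror the cases the message lists.

```python
import math
from typing import Optional


def timing_status(center: Optional[float], skipped: bool = False) -> str:
    """Return "OK" for a usable timing center, else a (hypothetical) reason code."""
    if skipped:
        return "SKIPPED"
    if center is None:
        return "MISSING"
    if not math.isfinite(center):
        return "NON_FINITE"
    if center <= 0.0:
        return "NON_POSITIVE"
    return "OK"


def compare_status(ref: Optional[float], cmp: Optional[float]) -> str:
    """A matched state is UNKNOWN unless both sides have usable timings."""
    if timing_status(ref) != "OK" or timing_status(cmp) != "OK":
        return "UNKNOWN"
    return "COMPARABLE"


def format_duration(center: Optional[float]) -> str:
    """Render a duration in seconds, or n/a when it is unavailable."""
    if timing_status(center) != "OK":
        return "n/a"
    return f"{center:.3e} s"
```

With this shape, a state where only one side is missing is still reported (as UNKNOWN) instead of being dropped, and the side that has data can still be formatted normally.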

cuda/bench

Fix docutil error when building docs (#365)

2026-05-18 10:57:19 -05:00

examples

Implement Timer, and support State.exec(fn, timer=True) (#364)

2026-05-15 10:19:40 -05:00

scripts

Explicitly handle unavailable timings in nvbench-compare

2026-06-30 06:40:44 -05:00

src

Add python api for cold warmup parameters (#363)

2026-05-18 10:56:44 -05:00

test

Explicitly handle unavailable timings in nvbench-compare

2026-06-30 06:40:44 -05:00

.gitignore

Draft of Python API for NVBench

2025-07-28 15:37:04 -05:00

CMakeLists.txt

Disable CUPTI in cmake file

2026-02-02 16:03:15 -06:00

pyproject.toml

Provide BenchmarkResult class for parsing JSON output of NVBench-instrumented benchmarks (#356)

2026-05-13 13:23:58 -05:00

README.md

Add installation instructions

2026-01-30 09:32:44 -06:00

README.md

CUDA Kernel Benchmarking Package

This package provides a Python API to the CUDA Kernel Benchmarking Library NVBench.

Installation

Install from PyPI

pip install cuda-bench[cu13]  # For CUDA 13.x
pip install cuda-bench[cu12]  # For CUDA 12.x
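
To confirm that the install succeeded, one option is to query the installed distribution metadata from Python. The sketch below uses only the standard library and assumes the distribution name is `cuda-bench`, as in the pip commands above; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "cuda-bench") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_version()` returns a version string when the package is present and `None` otherwise, which is convenient in setup scripts that should fail early.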

Building from source

Ensure a recent version of CMake

NVBench requires CMake 3.30.4 or newer. Either build CMake from source or create a conda environment that provides a recent version of CMake:

conda create -n build_env --yes cmake ninja
conda activate build_env
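
Before building, it can help to check that the CMake found on PATH meets the minimum version. The following is a small sketch, not part of NVBench: it parses the first line of `cmake --version` and compares it with 3.30.4. The helper names are hypothetical.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

MIN_CMAKE = (3, 30, 4)  # minimum version stated in this README


def parse_cmake_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def cmake_is_recent_enough(minimum: Tuple[int, ...] = MIN_CMAKE) -> bool:
    """Return True if a cmake on PATH reports a version >= minimum."""
    exe = shutil.which("cmake")
    if exe is None:
        return False  # cmake is not installed or not on PATH
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    version = parse_cmake_version(result.stdout)
    return version is not None and version >= minimum
```

Tuple comparison is used instead of string comparison so that, for example, 3.9.0 is correctly treated as older than 3.30.4.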

Ensure a CUDA compiler is available

Building the NVBench library requires a CUDA compiler, so make sure the appropriate environment variables are set. For example, assuming the CUDA toolkit is installed system-wide and the target is an Ampere GPU:

export CUDACXX=/usr/local/cuda/bin/nvcc
export CUDAARCHS=86
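
A quick sanity check of these variables can catch a wrong path before the build starts. The sketch below is illustrative only: `check_cuda_env` is a hypothetical helper that inspects `CUDACXX` and `CUDAARCHS` in a given environment mapping and returns a list of problems it found.

```python
import os
from typing import List, Mapping


def check_cuda_env(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems with CUDACXX / CUDAARCHS (empty if none)."""
    problems: List[str] = []

    cudacxx = env.get("CUDACXX")
    if not cudacxx:
        problems.append("CUDACXX is not set")
    elif not (os.path.isfile(cudacxx) and os.access(cudacxx, os.X_OK)):
        problems.append(f"CUDACXX does not point to an executable: {cudacxx}")

    if not env.get("CUDAARCHS"):
        problems.append("CUDAARCHS is not set")

    return problems
```

Passing `os.environ` checks the current shell environment; an empty result means both variables are set and `CUDACXX` points to an executable file.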

Build Python project

Switch to the python folder and install the package in editable mode. This step also configures and builds the NVBench library:

cd nvbench/python
pip install -e .

Verify that the package works

python test/run_1.py

Run examples

# Example benchmarking a numba.cuda kernel
python examples/throughput.py

# Example benchmarking kernels authored using cuda.core
python examples/axes.py

# Example benchmarking algorithms from cuda.cccl.parallel
python examples/cccl_parallel_segmented_reduce.py

# Example benchmarking a CuPy function
python examples/cupy_extract.py