mirror of
https://github.com/NVIDIA/nvbench.git
synced 2026-05-13 09:45:39 +00:00
Add arbitrary BenchResult metadata and explicit parse control, replacing
the previous code/elapsed fields. Make BenchResult subscriptable by
subbenchmark name and make SubBenchResult list-like over its states.
Extend SubBenchState parsing to expose summaries by tag, read paired
sample frequency data, return None for unavailable sample/frequency
files, and validate matching sample/frequency lengths.
Harden parsing for NVBench JSON output with no-axis benchmarks, null
axis_values, skipped states with null summaries, float axis input_string
lookups, and recorded sidecar binary paths.
Expand BenchResult tests to cover metadata, parse=False, sequence-style
access, frequency-aware centers, missing binary data, skipped states,
and mismatched sample/frequency counts.
Example usage:
```
import array, numpy as np, cuda.bench
r = cuda.bench.BenchResult("perf_data/axes_run1.json")
r["copy_sweep_grid_shape"].centers_with_frequencies(
lambda t, f: np.median(np.asarray(t)*np.asarray(f)))
```
CUDA Kernel Benchmarking Package
This package provides a Python API to the CUDA Kernel Benchmarking
Library NVBench.
Installation
Install from PyPi
pip install cuda-bench[cu13] # For CUDA 13.x
pip install cuda-bench[cu12] # For CUDA 12.x
Building from source
Ensure recent version of CMake
Since nvbench requires a rather new version of CMake (>=3.30.4), either build CMake from sources, or create a conda environment with a recent version of CMake, using
conda create -n build_env --yes cmake ninja
conda activate build_env
Ensure CUDA compiler
Since building NVBench library requires CUDA compiler, ensure that appropriate environment variables
are set. For example, assuming CUDA toolkit is installed system-wide, and assuming Ampere GPU architecture:
export CUDACXX=/usr/local/cuda/bin/nvcc
export CUDAARCHS=86
Build Python project
Now switch to python folder, configure and install NVBench library, and install the package in editable mode:
cd nvbench/python
pip install -e .
Verify that package works
python test/run_1.py
Run examples
# Example benchmarking numba.cuda kernel
python examples/throughput.py
# Example benchmarking kernels authored using cuda.core
python examples/axes.py
# Example benchmarking algorithms from cuda.cccl.parallel
python examples/cccl_parallel_segmented_reduce.py
# Example benchmarking CuPy function
python examples/cupy_extract.py