mirror of
https://github.com/NVIDIA/nvbench.git
synced 2026-03-14 20:27:24 +00:00
72 lines
1.5 KiB
Markdown
72 lines
1.5 KiB
Markdown
# CUDA Kernel Benchmarking Package
|
|
|
|
This package provides a Python API to the CUDA Kernel Benchmarking
|
|
Library `NVBench`.
|
|
|
|
## Installation
|
|
|
|
Install from PyPi
|
|
|
|
```bash
|
|
pip install cuda-bench[cu13] # For CUDA 13.x
|
|
pip install cuda-bench[cu12] # For CUDA 12.x
|
|
```
|
|
|
|
## Building from source
|
|
|
|
### Ensure recent version of CMake
|
|
|
|
Since `nvbench` requires a rather new version of CMake (>=3.30.4), either build CMake from sources, or create a conda environment with a recent version of CMake, using
|
|
|
|
```
|
|
conda create -n build_env --yes cmake ninja
|
|
conda activate build_env
|
|
```
|
|
|
|
### Ensure CUDA compiler
|
|
|
|
Since building `NVBench` library requires CUDA compiler, ensure that appropriate environment variables
|
|
are set. For example, assuming CUDA toolkit is installed system-wide, and assuming Ampere GPU architecture:
|
|
|
|
```bash
|
|
export CUDACXX=/usr/local/cuda/bin/nvcc
|
|
export CUDAARCHS=86
|
|
```
|
|
|
|
### Build Python project
|
|
|
|
Now switch to python folder, configure and install NVBench library, and install the package in editable mode:
|
|
|
|
```bash
|
|
cd nvbench/python
|
|
pip install -e .
|
|
```
|
|
|
|
### Verify that package works
|
|
|
|
```bash
|
|
python test/run_1.py
|
|
```
|
|
|
|
### Run examples
|
|
|
|
```bash
|
|
# Example benchmarking numba.cuda kernel
|
|
python examples/throughput.py
|
|
```
|
|
|
|
```bash
|
|
# Example benchmarking kernels authored using cuda.core
|
|
python examples/axes.py
|
|
```
|
|
|
|
```bash
|
|
# Example benchmarking algorithms from cuda.cccl.parallel
|
|
python examples/cccl_parallel_segmented_reduce.py
|
|
```
|
|
|
|
```bash
|
|
# Example benchmarking CuPy function
|
|
python examples/cupy_extract.py
|
|
```
|