Merge pull request #34 from jrhemstad/add_contributor_guide

Update README with info on examples/tests
2026-04-20 14:58:54 +00:00 · 2021-10-08 14:47:13 -05:00
parent 0f926f6a3d 83a021181d
commit 2b8ef7442b
2 changed files with 83 additions and 2 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+build/
--- a/README.md
+++ b/README.md
@@ -63,12 +63,92 @@ various NVBench features and usecases:
 - [Benchmarks that sync CUDA devices: `nvbench::exec_tag::sync`](examples/exec_tag_sync.cu)
 - [Manual timing: `nvbench::exec_tag::timer`](examples/exec_tag_timer.cu)

+### Building Examples
+
+To build the examples:
+```
+mkdir -p build
+cd build
+cmake -DNVBench_ENABLE_EXAMPLES=ON -DCMAKE_CUDA_ARCHITECTURES=70 .. && make
+```
+Be sure to set `CMAKE_CUDA_ARCHITECTURES` to match the compute capability of the GPU you are running on (for example, `70` for compute capability 7.0).
+
+Examples are built by default into `build/bin` and are prefixed with `nvbench.example`.
+
+<details>
+  <summary>Example output from <code>nvbench.example.throughput</code></summary>
+
+```
+# Devices
+
+## [0] `Quadro GV100`
+* SM Version: 700 (PTX Version: 700)
+* Number of SMs: 80
+* SM Default Clock Rate: 1627 MHz
+* Global Memory: 32163 MiB Free / 32508 MiB Total
+* Global Memory Bus Peak: 870 GiB/sec (4096-bit DDR @850MHz)
+* Max Shared Memory: 96 KiB/SM, 48 KiB/Block
+* L2 Cache Size: 6144 KiB
+* Maximum Active Blocks: 32/SM
+* Maximum Active Threads: 2048/SM, 1024/Block
+* Available Registers: 65536/SM, 65536/Block
+* ECC Enabled: No
+
+# Log
+
+Run:  throughput_bench [Device=0]
+Warn: Current measurement timed out (15.00s) while over noise threshold (1.26% > 0.50%)
+Pass: Cold: 0.262392ms GPU, 0.267860ms CPU, 7.19s total GPU, 27393x
+Pass: Batch: 0.261963ms GPU, 7.18s total GPU, 27394x
+
+# Benchmark Results
+
+## throughput_bench
+
+### [0] Quadro GV100
+
+| NumElements |  DataSize  | Samples |  CPU Time  | Noise |  GPU Time  | Noise | Elem/s  | GlobalMem BW  | BWPeak | Batch GPU  | Batch  |
+|-------------|------------|---------|------------|-------|------------|-------|---------|---------------|--------|------------|--------|
+|    16777216 | 64.000 MiB |  27393x | 267.860 us | 1.25% | 262.392 us | 1.26% | 63.940G | 476.387 GiB/s | 58.77% | 261.963 us | 27394x |
+```
+
+</details>
+
+
+## Demo Project
+
 To get started using NVBench with your own kernels, consider trying out
-the [NVBench Demo Project](https://github.com/allisonvacanti/nvbench_demo)
-. `nvbench_demo` provides a simple CMake project that uses NVBench to build an
+the [NVBench Demo Project](https://github.com/allisonvacanti/nvbench_demo). 
+
+`nvbench_demo` provides a simple CMake project that uses NVBench to build an
 example benchmark. It's a great way to experiment with the library without a lot
 of investment.

+# Contributing
+
+Contributions are welcome!
+
+For current issues, see the [issue board](https://github.com/NVIDIA/nvbench/issues). Issues labeled with [![](https://img.shields.io/github/labels/NVIDIA/nvbench/good%20first%20issue)](https://github.com/NVIDIA/nvbench/labels/good%20first%20issue) are a good starting point for first-time contributors.
+
+## Tests
+
+To build `nvbench` tests:
+```
+mkdir -p build
+cd build
+cmake -DNVBench_ENABLE_TESTING=ON .. && make
+```
+
+Tests are built by default into `build/bin` and are prefixed with `nvbench.test`.
+
+To run all tests:
+```
+make test
+```
+or
+```
+ctest
+```
 # License

 NVBench is released under the Apache 2.0 License with LLVM exceptions.