295 Commits

Author SHA1 Message Date
Allison Vacanti
288b1564e0 Suppress warnings on MSVC Debug builds.
Also moved the config.cuh.in template into the source directory where
it'll be easier to find.
2021-12-21 19:35:23 -05:00
Allison Vacanti
edf2018fd7 Merge pull request #58 from allisonvacanti/nvbench_executable
Add an `nvbench-ctl` executable.
2021-12-21 12:08:39 -05:00
Allison Vacanti
20522c807d Add an nvbench-ctl executable.
This will provide functionality such as clock locking (--lgm),
persistence mode (--pm), device querying (--list), version checking
(--version), and documentation (--help).

This is possible already with any nvbench executable, but having
one with a reliable name will be helpful for scripting and writing
documentation.
2021-12-21 12:02:07 -05:00
Allison Vacanti
986736aa09 Merge pull request #60 from allisonvacanti/59_ubuntu_cupti
Add cupti path for ubuntu packages.
2021-12-20 14:35:27 -05:00
Allison Vacanti
61d094abf1 Add cupti path for ubuntu packages.
Fixes #59
2021-12-20 14:34:12 -05:00
Allison Vacanti
ff1ad78cfa Merge pull request #48 from robertmaynard/improve_compare_script_features
nvbench_compare handles directories and can filter out non-interesting results
2021-12-20 13:46:24 -05:00
Robert Maynard
6c1f372c45 Allow nvbench [-flags] (files|dirs) 2021-12-20 13:31:32 -05:00
Robert Maynard
35dd8de2ce Remove unneeded scripts/requirements.txt 2021-12-20 13:24:24 -05:00
Allison Vacanti
a8422197a9 Merge pull request #57 from senior-zero/fix_option_parser
Fix UB in option parser
2021-12-20 11:58:51 -05:00
Allison Vacanti
113b2f3f7f Merge pull request #56 from allisonvacanti/pow2_axis_compact_md
Reduce the width of pow2 axes in markdown tables.
2021-12-20 11:45:44 -05:00
Allison Vacanti
610b7767b5 Merge pull request #54 from allisonvacanti/progress_display
Print progress in markdown log.
2021-12-20 11:44:50 -05:00
Allison Vacanti
51efc7d1a8 Merge pull request #53 from allisonvacanti/50_warning_flags
Enable extra warning flags
2021-12-20 11:44:17 -05:00
Georgy Evtushenko
3bd37d0e75 Fix UB in option parser 2021-12-20 15:25:39 +03:00
Allison Vacanti
84f930809f Reduce the width of pow2 axes in markdown tables.
Before:

```
| BlockSize | (BlockSize) | NumBlocks | (NumBlocks) |
|-----------|-------------|-----------|-------------|
|       2^6 |          64 |       2^6 |          64 |
|       2^8 |         256 |       2^6 |          64 |
|      2^10 |        1024 |       2^6 |          64 |
|       2^6 |          64 |       2^8 |         256 |
|       2^8 |         256 |       2^8 |         256 |
|      2^10 |        1024 |       2^8 |         256 |
|       2^6 |          64 |      2^10 |        1024 |
|       2^8 |         256 |      2^10 |        1024 |
|      2^10 |        1024 |      2^10 |        1024 |
```

After:

```
|  BlockSize  |  NumBlocks  |
|-------------|-------------|
|    2^6 = 64 |    2^6 = 64 |
|   2^8 = 256 |    2^6 = 64 |
| 2^10 = 1024 |    2^6 = 64 |
|    2^6 = 64 |   2^8 = 256 |
|   2^8 = 256 |   2^8 = 256 |
| 2^10 = 1024 |   2^8 = 256 |
|    2^6 = 64 | 2^10 = 1024 |
|   2^8 = 256 | 2^10 = 1024 |
| 2^10 = 1024 | 2^10 = 1024 |
```
2021-12-19 10:38:14 -05:00
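The compact layout from 84f930809f merges the exponent and its value into one cell. A minimal sketch of how such a cell could be produced is below. The helper names `pow2_cell` and `pad_left` are hypothetical and are not taken from NVBench; only the `2^N = V` cell shape and the right alignment come from the "After" table.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not NVBench's API): renders an exponent as "2^N = V".
// Assumes exponent < 64 so the shift stays within a 64-bit value.
std::string pow2_cell(unsigned exponent)
{
  const std::uint64_t value = std::uint64_t{1} << exponent;
  return "2^" + std::to_string(exponent) + " = " + std::to_string(value);
}

// Hypothetical helper: right-aligns a cell to `width` characters,
// matching the alignment used in the table above.
std::string pad_left(const std::string &cell, std::size_t width)
{
  if (cell.size() >= width)
  {
    return cell;
  }
  return std::string(width - cell.size(), ' ') + cell;
}
```

For the table above, the column width would be the length of the widest cell (`2^10 = 1024`, 11 characters), so `pad_left(pow2_cell(6), 11)` yields the `2^6 = 64` entry with three leading spaces.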
Allison Vacanti
37dd61b275 Clean up some virtual interfaces.
- nvbench::benchmark doesn't add state, no need to override the destructor.
- nvbench::printer_base's virtual API should support decoration, not just
  overriding. Making the virtual API protected instead of private allows
  derived classes to extend base class behavior.
- nvbench::printer_base needs a virtual destructor.
- Fix a bug in nvbench::printer_multiplex that caused the new
  `get_[total|completed]_state_count()` methods to always return 0.
2021-12-19 10:26:40 -05:00
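The distinction in 37dd61b275 between overriding and decorating a virtual API can be illustrated with a small sketch. The classes `printer` and `tagged_printer` below are hypothetical stand-ins, not NVBench's real `printer_base` hierarchy. A private virtual function can be overridden but not called by a derived class; making it protected lets the override extend the base behavior.

```cpp
#include <string>

// Hypothetical classes for illustration; these are not NVBench's real types.
class printer
{
public:
  // Virtual destructor: allows safe deletion through a base-class pointer.
  virtual ~printer() = default;

  // Public non-virtual entry point that forwards to the virtual hook.
  std::string log(const std::string &msg) { return do_log(msg); }

protected:
  // Protected rather than private, so derived classes may call it.
  virtual std::string do_log(const std::string &msg) { return msg; }
};

class tagged_printer : public printer
{
protected:
  // Decorates the base behavior instead of replacing it outright.
  std::string do_log(const std::string &msg) override
  {
    return "[tag] " + printer::do_log(msg);
  }
};
```

If `do_log` were private in `printer`, the call to `printer::do_log(msg)` inside `tagged_printer` would fail to compile, and the derived class could only replace the behavior wholesale.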
Allison Vacanti
3508775d71 Print progress in markdown log.
e.g.

```
Run:  [1/63] copy_type_sweep [Device=0 T=U8]
Pass: Cold: 10.659315ms GPU, 10.670530ms CPU, 0.11s total GPU, 10x
Pass: Batch: 10.298826ms GPU, 0.51s total GPU, 50x
Run:  [2/63] copy_type_sweep [Device=0 T=U16]
Pass: Cold: 6.185874ms GPU, 6.194119ms CPU, 0.10s total GPU, 16x
Pass: Batch: 6.174837ms GPU, 0.53s total GPU, 86x
Run:  [3/63] copy_type_sweep [Device=0 T=U32]
...
Run:  [63/63] copy_sweep_grid_shape [Device=0 BlockSize=2^10 NumBlocks=2^10]
Pass: Cold: 4.921733ms GPU, 4.929724ms CPU, 0.10s total GPU, 21x
Pass: Batch: 4.917333ms GPU, 0.53s total GPU, 107x
```
2021-12-19 03:07:17 -05:00
Allison Vacanti
5d70492714 Enable more warning flags.
- /W4 on MSVC
- -Wall -Wextra + others on gcc/clang
- New NVBench_ENABLE_WERROR option to toggle "warnings as errors"
- Mark the nlohmann_json library as IMPORTED to switch to system includes
- Rename nvbench_main -> nvbench.main to follow target name conventions
- Explicitly suppress some cudafe warnings when compiling templates in
  nlohmann_json headers.
- Explicitly suppress some warnings from Thrust headers.
- Various fixes for warnings exposed by new flags.
- Disable CUPTI on CTK < 11.3 (See #52).
2021-12-18 20:13:25 -05:00
Allison Vacanti
15edfe2eee Refactor to use NVBENCH_THROW where possible. 2021-12-18 17:52:39 -05:00
Allison Vacanti
9ff857ee29 Merge pull request #49 from senior-zero/fix_markdown_table
Fix markdown table
2021-12-18 10:33:11 -05:00
Georgy Evtushenko
eb29ab27ff Fix markdown table 2021-12-18 18:08:29 +03:00
Georgy Evtushenko
21ea12cd10 Merge pull request #29 from senior-zero/main-feature/github/cupti
CUPTI support
2021-12-18 12:09:25 +03:00
Georgy Evtushenko
1bc715267c CUPTI support 2021-12-18 12:03:52 +03:00
Allison Vacanti
3d6c16f8ba Maintain iterator state in markdown table printer. 2021-12-18 01:27:38 -05:00
Allison Vacanti
07e1c56608 Merge pull request #46 from allisonvacanti/nvml
Add NVML support for persistence mode, locking clocks.
2021-12-17 16:07:44 -05:00
Allison Vacanti
b948e79cab Add NVML support for persistence mode, locking clocks.
Locking clocks is currently only implemented for Volta+ devices.

Example usage:

my_bench -d [0,1,3] --persistence-mode 1 --lock-gpu-clocks base

See the cli_help.md docs for more info.
2021-12-17 13:59:43 -05:00
Robert Maynard
f9b44378bf nvbench_compare now supports comparing directories of results 2021-12-16 16:26:13 -05:00
Robert Maynard
905f84272e Add --threshold-diff command option to nvbench_compare
Allows us to filter output to only see the significantly different
benchmarks
2021-12-16 15:52:30 -05:00
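The filtering idea behind `--threshold-diff` in 905f84272e can be sketched as follows. This is an assumption-laden illustration in C++, not the script's actual code: the `comparison` struct, the `filter_significant` function, and the choice of relative difference `|cmp - ref| / ref` as the metric are all hypothetical.

```cpp
#include <cmath>
#include <string>
#include <vector>

// Hypothetical record of one benchmark measured in two runs.
struct comparison
{
  std::string name;
  double ref_time; // reference run, in seconds (assumed > 0)
  double cmp_time; // compared run, in seconds
};

// Keeps only entries whose relative difference exceeds `threshold`.
// The metric is an assumption; the real script may define it differently.
std::vector<comparison> filter_significant(const std::vector<comparison> &all,
                                           double threshold)
{
  std::vector<comparison> kept;
  for (const auto &c : all)
  {
    const double rel = std::abs(c.cmp_time - c.ref_time) / c.ref_time;
    if (rel > threshold)
    {
      kept.push_back(c);
    }
  }
  return kept;
}
```

With a threshold of 0.05, a benchmark that moved from 1.0 s to 1.01 s would be dropped, while one that moved from 1.0 s to 1.5 s would be kept.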
Robert Maynard
52d9aed8da Refactor to have a proper main entry point 2021-12-16 15:27:51 -05:00
Robert Maynard
3f6d496824 Add a requirements.txt for the nv_bench script 2021-12-16 13:44:40 -05:00
Allison Vacanti
d0c90ff920 Build static fmtlib with -fPIC. 2021-12-15 12:54:53 -05:00
Allison Vacanti
af03585543 Add coloring to markdown tables. 2021-12-14 23:03:14 -05:00
Allison Vacanti
8d77dc2b6c Merge pull request #47 from allisonvacanti/base-two-bandwidth
Use base2 format for displaying bandwidth.
2021-12-14 21:22:50 -05:00
Allison Vacanti
54fda533e1 Use base2 format for displaying bandwidth.
Fixes #4.
2021-12-14 21:19:10 -05:00
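One plausible reading of "base2 format" in 54fda533e1 is that bandwidth is scaled by powers of 1024 and labeled with binary prefixes. The sketch below assumes that reading. The function name `format_bandwidth`, the unit labels, and the three-decimal precision are hypothetical and are not taken from NVBench.

```cpp
#include <cstdio>
#include <string>

// Hypothetical formatter: scales bytes/second by 1024 and picks a binary
// prefix. The precision and unit strings are assumptions for illustration.
std::string format_bandwidth(double bytes_per_second)
{
  static const char *units[] = {"B/s", "KiB/s", "MiB/s", "GiB/s", "TiB/s"};
  int idx = 0;
  while (bytes_per_second >= 1024.0 && idx < 4)
  {
    bytes_per_second /= 1024.0;
    ++idx;
  }
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%.3f %s", bytes_per_second, units[idx]);
  return buf;
}
```

Under this scheme a rate of 1.5 * 1024^3 bytes per second is reported as 1.5 GiB/s, whereas a base-10 formatter would report the same rate as roughly 1.61 GB/s.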
Allison Vacanti
7c740975dd Force fmt to build static libs.
Otherwise it shows up in our export set when a parent project enables
BUILD_SHARED_LIBS
2021-10-28 12:39:14 -04:00
Allison Vacanti
cda8d320cb Merge pull request #44 from allisonvacanti/fix_for_conda
Don't explicitly link with cudart.
2021-10-27 12:17:09 -04:00
Allison Vacanti
f984efdc26 Don't explicitly link with cudart.
This is implicitly added by nvcc, and the explicit setting was breaking
environments where cudart_static is unavailable, e.g. conda.
2021-10-27 12:13:32 -04:00
Allison Vacanti
611385b047 Print version info with --help. 2021-10-26 17:45:33 -04:00
Allison Vacanti
1875d9962d Document new --version option. 2021-10-26 17:45:20 -04:00
Allison Vacanti
e6b5f51f1c Merge pull request #42 from allisonvacanti/rapids-cmake
Port to rapids-cmake
2021-10-26 17:26:08 -04:00
Allison Vacanti
b2d37c21fd Add export tests. 2021-10-20 14:02:16 -04:00
Allison Vacanti
27b23eeb46 Add new --version option to benchmark executables. 2021-10-20 14:02:16 -04:00
Allison Vacanti
ef36d3a558 Port to rapids-cmake.
- Add export sets
- Add install rules
- Remove manual CPM import, port to rapids_cpm_*, etc
- Organize CMake code into cmake/*.cmake files.
- NVBench is now a shared library.
2021-10-20 14:02:16 -04:00
Allison Vacanti
ed27365a41 Disable portion of test due to GCC 7 bug.
Fixes #39.
old-cmake
2021-10-19 12:26:02 -04:00
Allison Vacanti
72f9cd8adb Merge pull request #37 from allisonvacanti/fix-axis
Revert .cu -> .cxx change for option_parser TU.
2021-10-13 17:36:36 -04:00
Allison Vacanti
e11c6961b2 Revert .cu -> .cxx change for option_parser TU.
This change introduced a strange bug on GCC where a stack object is changed
seemingly randomly.

In `option_parser::update_axis`, the `flag` variable's `data()` pointer
is overwritten to point at garbage memory shortly after parsing. Nothing
related to this object is being updated when the corruption occurs.

This has been observed on GCC 7+, but cannot be reproduced on MSVC.

Reverting the `option_parser.cxx` TU to be a CUDA object works
around this.

Fixes issue #36.
2021-10-13 17:36:01 -04:00
Jake Hemstad
2b8ef7442b Merge pull request #34 from jrhemstad/add_contributor_guide
Update README with info on examples/tests
2021-10-08 14:47:13 -05:00
Jake Hemstad
83a021181d Missing space. 2021-10-08 14:30:38 -05:00
Jake Hemstad
156967f6c3 Format. 2021-10-08 14:28:33 -05:00
Jake Hemstad
4f0c72b6bf Delete extra line. 2021-10-08 14:28:23 -05:00
Jake Hemstad
add1a34a04 Add sections for building examples and tests. 2021-10-08 13:03:00 -05:00