Allison Vacanti
edf2018fd7
Merge pull request #58 from allisonvacanti/nvbench_executable
...
Add an `nvbench-ctl` executable.
2021-12-21 12:08:39 -05:00
Allison Vacanti
20522c807d
Add an nvbench-ctl executable.
...
This will provide functionality such as clock locking (--lgm),
persistence mode (--pm), device querying (--list), version checking
(--version), and documentation (--help).
This is possible already with any nvbench executable, but having
one with a reliable name will be helpful for scripting and writing
documentation.
2021-12-21 12:02:07 -05:00
Allison Vacanti
986736aa09
Merge pull request #60 from allisonvacanti/59_ubuntu_cupti
...
Add cupti path for ubuntu packages.
2021-12-20 14:35:27 -05:00
Allison Vacanti
61d094abf1
Add cupti path for ubuntu packages.
...
Fixes #59
2021-12-20 14:34:12 -05:00
Allison Vacanti
ff1ad78cfa
Merge pull request #48 from robertmaynard/improve_compare_script_features
...
nvbench_compare handles directories and can filter out non-interesting results
2021-12-20 13:46:24 -05:00
Robert Maynard
6c1f372c45
Allow nvbench [-flags] (files|dirs)
2021-12-20 13:31:32 -05:00
Robert Maynard
35dd8de2ce
Remove unneeded scripts/requirements.txt
2021-12-20 13:24:24 -05:00
Allison Vacanti
a8422197a9
Merge pull request #57 from senior-zero/fix_option_parser
...
Fix UB in option parser
2021-12-20 11:58:51 -05:00
Allison Vacanti
113b2f3f7f
Merge pull request #56 from allisonvacanti/pow2_axis_compact_md
...
Reduce the width of pow2 axes in markdown tables.
2021-12-20 11:45:44 -05:00
Allison Vacanti
610b7767b5
Merge pull request #54 from allisonvacanti/progress_display
...
Print progress in markdown log.
2021-12-20 11:44:50 -05:00
Allison Vacanti
51efc7d1a8
Merge pull request #53 from allisonvacanti/50_warning_flags
...
Enable extra warning flags
2021-12-20 11:44:17 -05:00
Georgy Evtushenko
3bd37d0e75
Fix UB in option parser
2021-12-20 15:25:39 +03:00
Allison Vacanti
84f930809f
Reduce the width of pow2 axes in markdown tables.
...
Before:
```
| BlockSize | (BlockSize) | NumBlocks | (NumBlocks) |
|-----------|-------------|-----------|-------------|
| 2^6 | 64 | 2^6 | 64 |
| 2^8 | 256 | 2^6 | 64 |
| 2^10 | 1024 | 2^6 | 64 |
| 2^6 | 64 | 2^8 | 256 |
| 2^8 | 256 | 2^8 | 256 |
| 2^10 | 1024 | 2^8 | 256 |
| 2^6 | 64 | 2^10 | 1024 |
| 2^8 | 256 | 2^10 | 1024 |
| 2^10 | 1024 | 2^10 | 1024 |
```
After:
```
| BlockSize | NumBlocks |
|-------------|-------------|
| 2^6 = 64 | 2^6 = 64 |
| 2^8 = 256 | 2^6 = 64 |
| 2^10 = 1024 | 2^6 = 64 |
| 2^6 = 64 | 2^8 = 256 |
| 2^8 = 256 | 2^8 = 256 |
| 2^10 = 1024 | 2^8 = 256 |
| 2^6 = 64 | 2^10 = 1024 |
| 2^8 = 256 | 2^10 = 1024 |
| 2^10 = 1024 | 2^10 = 1024 |
```
2021-12-19 10:38:14 -05:00
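The compact cell format shown in the "After" table can be sketched as follows. This is a minimal illustration, not NVBench's implementation (NVBench is written in C++); the helper names `format_pow2_cell` and `format_row` are hypothetical.

```python
def format_pow2_cell(exponent):
    """Render a power-of-two axis value as one 'exponent = value' cell."""
    return f"2^{exponent} = {2 ** exponent}"


def format_row(exponents):
    """Join one cell per pow2 axis into a markdown table row."""
    cells = " | ".join(format_pow2_cell(e) for e in exponents)
    return f"| {cells} |"


if __name__ == "__main__":
    # One row per (BlockSize, NumBlocks) exponent pair.
    for pair in [(6, 6), (8, 6), (10, 6)]:
        print(format_row(pair))
```

Merging the exponent and the value into one cell halves the number of columns per pow2 axis, which is where the width reduction comes from.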
Allison Vacanti
37dd61b275
Clean up some virtual interfaces.
...
- nvbench::benchmark doesn't add state, no need to override the destructor.
- nvbench::printer_base's virtual API should support decoration, not just
overriding. Making the virtual API protected instead of private allows
derived classes to extend base class behavior.
- nvbench::printer_base needs a virtual destructor.
- Fix a bug in nvbench::printer_multiplex that caused the new
`get_[total|completed]_state_count()` methods to always return 0.
2021-12-19 10:26:40 -05:00
Allison Vacanti
3508775d71
Print progress in markdown log.
...
e.g.
```
Run: [1/63] copy_type_sweep [Device=0 T=U8]
Pass: Cold: 10.659315ms GPU, 10.670530ms CPU, 0.11s total GPU, 10x
Pass: Batch: 10.298826ms GPU, 0.51s total GPU, 50x
Run: [2/63] copy_type_sweep [Device=0 T=U16]
Pass: Cold: 6.185874ms GPU, 6.194119ms CPU, 0.10s total GPU, 16x
Pass: Batch: 6.174837ms GPU, 0.53s total GPU, 86x
Run: [3/63] copy_type_sweep [Device=0 T=U32]
...
Run: [63/63] copy_sweep_grid_shape [Device=0 BlockSize=2^10 NumBlocks=2^10]
Pass: Cold: 4.921733ms GPU, 4.929724ms CPU, 0.10s total GPU, 21x
Pass: Batch: 4.917333ms GPU, 0.53s total GPU, 107x
```
2021-12-19 03:07:17 -05:00
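For scripts that consume this log, the `Run:` lines can be parsed with a small regular expression. The sketch below is based only on the sample output above; the function name `parse_run_line` is hypothetical, and it assumes axis values contain no spaces.

```python
import re

# Matches e.g. "Run: [1/63] copy_type_sweep [Device=0 T=U8]".
RUN_LINE = re.compile(r"^Run:\s+\[(\d+)/(\d+)\]\s+(\S+)\s+\[(.*)\]$")


def parse_run_line(line):
    """Return progress info for a 'Run:' line, or None for any other line."""
    match = RUN_LINE.match(line.strip())
    if match is None:
        return None
    index, total, name, axes = match.groups()
    # Axes are printed as space-separated "Name=Value" pairs.
    axis_values = dict(item.split("=", 1) for item in axes.split())
    return {
        "index": int(index),
        "total": int(total),
        "benchmark": name,
        "axes": axis_values,
    }
```

Lines that do not start with `Run:`, such as the `Pass:` summaries, yield `None` and can be skipped or handled separately.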
Allison Vacanti
5d70492714
Enable more warning flags.
...
- /W4 on MSVC
- -Wall -Wextra + others on gcc/clang
- New NVBench_ENABLE_WERROR option to toggle "warnings as errors"
- Mark the nlohmann_json library as IMPORTED to switch to system includes
- Rename nvbench_main -> nvbench.main to follow target name conventions
- Explicitly suppress some cudafe warnings when compiling templates in
nlohmann_json headers.
- Explicitly suppress some warnings from Thrust headers.
- Various fixes for warnings exposed by new flags.
- Disable CUPTI on CTK < 11.3 (See #52).
2021-12-18 20:13:25 -05:00
Allison Vacanti
15edfe2eee
Refactor to use NVBENCH_THROW where possible.
2021-12-18 17:52:39 -05:00
Allison Vacanti
9ff857ee29
Merge pull request #49 from senior-zero/fix_markdown_table
...
Fix markdown table
2021-12-18 10:33:11 -05:00
Georgy Evtushenko
eb29ab27ff
Fix markdown table
2021-12-18 18:08:29 +03:00
Georgy Evtushenko
21ea12cd10
Merge pull request #29 from senior-zero/main-feature/github/cupti
...
CUPTI support
2021-12-18 12:09:25 +03:00
Georgy Evtushenko
1bc715267c
CUPTI support
2021-12-18 12:03:52 +03:00
Allison Vacanti
3d6c16f8ba
Maintain iterator state in markdown table printer.
2021-12-18 01:27:38 -05:00
Allison Vacanti
07e1c56608
Merge pull request #46 from allisonvacanti/nvml
...
Add NVML support for persistence mode, locking clocks.
2021-12-17 16:07:44 -05:00
Allison Vacanti
b948e79cab
Add NVML support for persistence mode, locking clocks.
...
Locking clocks is currently only implemented for Volta+ devices.
Example usage:
my_bench -d [0,1,3] --persistence-mode 1 --lock-gpu-clocks base
See the cli_help.md docs for more info.
2021-12-17 13:59:43 -05:00
Robert Maynard
f9b44378bf
nvbench_compare now supports comparing directories of results
2021-12-16 16:26:13 -05:00
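One way to accept a mix of files and directories is to expand each directory into the result files it contains. This is a rough sketch rather than the script's actual logic; the name `collect_result_files` and the `.json` suffix filter are assumptions.

```python
from pathlib import Path


def collect_result_files(paths, suffix=".json"):
    """Expand directories into their result files; keep explicit files as given."""
    files = []
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            # Sort for a stable, reproducible ordering across runs.
            files.extend(
                sorted(p for p in path.iterdir() if p.is_file() and p.suffix == suffix)
            )
        elif path.is_file():
            # Explicitly named files are accepted regardless of suffix.
            files.append(path)
    return files
```

Comparing two directories then reduces to pairing up the collected files, for example by file name.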
Robert Maynard
905f84272e
Add --threshold-diff command option to nvbench_compare
...
Allows filtering the output to show only benchmarks that differ
significantly.
2021-12-16 15:52:30 -05:00
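The idea of threshold-based filtering can be illustrated with a relative-difference check. This sketch does not reproduce the script's real semantics: the function name `filter_significant`, the dict-of-timings input, and the interpretation of the threshold as a fraction of the reference time are all assumptions.

```python
def filter_significant(reference, compare, threshold):
    """Keep benchmarks whose relative time change is at least `threshold`.

    `reference` and `compare` map benchmark names to times in seconds.
    `threshold` is a fraction, e.g. 0.05 for a 5% change (an assumption).
    """
    significant = {}
    for name, ref_time in reference.items():
        if name not in compare or ref_time == 0:
            continue  # nothing to compare against, or no valid baseline
        relative_change = (compare[name] - ref_time) / ref_time
        if abs(relative_change) >= threshold:
            significant[name] = relative_change
    return significant
```

A positive value means the compared run was slower than the reference; a negative value means it was faster.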
Robert Maynard
52d9aed8da
Refactor to have a proper main entry point
2021-12-16 15:27:51 -05:00
Robert Maynard
3f6d496824
Add a requirements.txt for the nv_bench script
2021-12-16 13:44:40 -05:00
Allison Vacanti
d0c90ff920
Build static fmtlib with -fPIC.
2021-12-15 12:54:53 -05:00
Allison Vacanti
af03585543
Add coloring to markdown tables.
2021-12-14 23:03:14 -05:00
Allison Vacanti
8d77dc2b6c
Merge pull request #47 from allisonvacanti/base-two-bandwidth
...
Use base2 format for displaying bandwidth.
2021-12-14 21:22:50 -05:00
Allison Vacanti
54fda533e1
Use base2 format for displaying bandwidth.
...
Fixes #4.
2021-12-14 21:19:10 -05:00
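Base-2 formatting divides by 1024 per unit step instead of 1000. A minimal sketch, assuming binary-prefixed units (KiB/s, MiB/s, ...) and three decimal places; the exact unit labels and precision NVBench prints are not stated in this log, and `format_bandwidth` is a hypothetical name.

```python
def format_bandwidth(bytes_per_second):
    """Format a bandwidth using binary (power-of-1024) unit prefixes."""
    units = ["B/s", "KiB/s", "MiB/s", "GiB/s", "TiB/s"]
    value = float(bytes_per_second)
    unit = units[0]
    for unit in units:
        # Stop once the value fits the unit, or when the largest unit is reached.
        if value < 1024.0 or unit == units[-1]:
            break
        value /= 1024.0
    return f"{value:.3f} {unit}"
```

With decimal prefixes, the same 1073741824 B/s would instead be reported as roughly 1.074 GB/s, so the two conventions differ by about 7% at the giga scale.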
Allison Vacanti
7c740975dd
Force fmt to build static libs.
...
Otherwise it shows up in our export set when a parent project enables
BUILD_SHARED_LIBS
2021-10-28 12:39:14 -04:00
Allison Vacanti
cda8d320cb
Merge pull request #44 from allisonvacanti/fix_for_conda
...
Don't explicitly link with cudart.
2021-10-27 12:17:09 -04:00
Allison Vacanti
f984efdc26
Don't explicitly link with cudart.
...
This is implicitly added by nvcc, and the explicit setting was breaking
environments where cudart_static is unavailable, e.g. conda.
2021-10-27 12:13:32 -04:00
Allison Vacanti
611385b047
Print version info with --help.
2021-10-26 17:45:33 -04:00
Allison Vacanti
1875d9962d
Document new --version option.
2021-10-26 17:45:20 -04:00
Allison Vacanti
e6b5f51f1c
Merge pull request #42 from allisonvacanti/rapids-cmake
...
Port to rapids-cmake
2021-10-26 17:26:08 -04:00
Allison Vacanti
b2d37c21fd
Add export tests.
2021-10-20 14:02:16 -04:00
Allison Vacanti
27b23eeb46
Add new --version option to benchmark executables.
2021-10-20 14:02:16 -04:00
Allison Vacanti
ef36d3a558
Port to rapids-cmake.
...
- Add export sets
- Add install rules
- Remove manual CPM import, port to rapids_cpm_*, etc
- Organize CMake code into cmake/*.cmake files.
- NVBench is now a shared library.
2021-10-20 14:02:16 -04:00
Allison Vacanti
ed27365a41
Disable portion of test due to GCC 7 bug.
...
Fixes #39.
old-cmake
2021-10-19 12:26:02 -04:00
Allison Vacanti
72f9cd8adb
Merge pull request #37 from allisonvacanti/fix-axis
...
Revert .cu -> .cxx change for option_parser TU.
2021-10-13 17:36:36 -04:00
Allison Vacanti
e11c6961b2
Revert .cu -> .cxx change for option_parser TU.
...
This change introduced a strange bug on GCC where a stack object is changed
seemingly randomly.
In `option_parser::update_axis`, the `flag` variable's `data()` pointer
is overwritten to point at garbage memory shortly after parsing. Nothing
related to this object is being updated when the corruption occurs.
This has been observed on gcc7+, but cannot be reproduced on MSVC.
Reverting the `option_parser.cxx` TU to be a CUDA object works
around this.
Fixes issue #36.
2021-10-13 17:36:01 -04:00
Jake Hemstad
2b8ef7442b
Merge pull request #34 from jrhemstad/add_contributor_guide
...
Update README with info on examples/tests
2021-10-08 14:47:13 -05:00
Jake Hemstad
83a021181d
Missing space.
2021-10-08 14:30:38 -05:00
Jake Hemstad
156967f6c3
Format.
2021-10-08 14:28:33 -05:00
Jake Hemstad
4f0c72b6bf
Delete extra line.
2021-10-08 14:28:23 -05:00
Jake Hemstad
add1a34a04
Add sections for building examples and tests.
2021-10-08 13:03:00 -05:00
Jake Hemstad
131f557b53
Add .gitignore.
2021-10-08 13:02:22 -05:00