Commit Graph

364 Commits

Author SHA1 Message Date
Robert Maynard
8919728d32 Update to latest version of rapids
Also ensure that we don't clobber any existing rapids.cmake file
2022-08-08 13:24:29 -04:00
Allison Vacanti
9630a081e6 Merge pull request #93 from hahnjo/local-json
Allow using local nlohmann_json installation
2022-08-05 13:14:23 -04:00
Jonas Hahnfeld
449cd4e275 Allow using local nlohmann_json installation
Use the nlohmann_json::nlohmann_json target if available, otherwise fall
back to adding the downloaded headers.

Closes #19
2022-08-05 09:57:56 +02:00
Allison Vacanti
761ded142a Merge pull request #89 from PointKernel/patch-1
Fix a typo in README.md
2022-06-06 12:48:59 -04:00
Yunsong Wang
46a2dc2856 Update README.md 2022-06-06 08:59:32 -04:00
Allison Vacanti
12d13bdc5e Merge pull request #85 from pauleonix/main
Add --disable-blocking-kernel and --profile options.
2022-04-26 13:23:58 -04:00
pauleonix
79912d7b5f Fix no_block_tags
Co-authored-by: Allison Vacanti <alliepiper16@gmail.com>
2022-04-26 13:44:19 +02:00
Paul Große-Bley
2b5662ea4a Rename [get|set|m]_no_block to [get|set|m]_disable_blocking_kernel in public APIs 2022-04-26 13:40:46 +02:00
Paul Große-Bley
7f51ead595 Add --disable-blocking-kernel and --profile options. 2022-04-08 20:03:44 +02:00
Allison Vacanti
9eed5ab9c3 Merge pull request #79 from PointKernel/fix-config-count-bug
Fix a bug in config count unit test: count number of devices as well
2022-02-18 16:54:49 -05:00
Allison Vacanti
9d655fc48e Improve diagnostic when failing to lock old cards' clocks.
The issue is that the APIs we currently use don't support older
hardware. Users can still look up the desired frequency for their
HW and manually lock clocks with nvidia-smi.
2022-02-15 14:38:19 -05:00
Allison Vacanti
9d0b2230bc Use SM version instead of PTX version when reporting HW capabilities. 2022-02-15 14:36:40 -05:00
Yunsong Wang
af4c35d78b Fix a bug in config count unit test: count number of devices as well 2022-02-11 18:24:58 -05:00
Allison Vacanti
48d94259b4 Fix typo in new docs. 2022-02-11 14:01:49 -05:00
Allison Vacanti
6c2c53ed4a Reduce time spent smoketesting examples. 2022-02-11 13:54:40 -05:00
Allison Vacanti
19961206e2 Run tests in parallel. 2022-02-11 13:54:22 -05:00
Allison Vacanti
38cecd5f76 Merge pull request #76 from PointKernel/add-implicit-stream-support
Add implicit stream benchmarking support
2022-02-11 13:38:06 -05:00
Allison Vacanti
039d455727 Move documentation on streams to new subsection.
Also update to use `nvbench::make_cuda_stream_view`.
2022-02-11 13:29:06 -05:00
Allison Vacanti
3b41387637 Add nvbench::make_cuda_stream_view(cudaStream_t). 2022-02-11 13:26:33 -05:00
Allison Vacanti
8ae58981ca Add docs for launch and cuda_stream. 2022-02-11 13:25:41 -05:00
Allison Vacanti
da2ec38cdb Exclude some bits from clang-format. 2022-02-11 13:20:05 -05:00
Yunsong Wang
fde2e408de Add stream benchmark example 2022-02-07 13:09:35 -05:00
Yunsong Wang
6159d9c6cb Minor correction in unit test 2022-02-06 20:19:21 -05:00
Yunsong Wang
e05bf002f7 Use unique_ptr + custom deleter to simplify destroy logic 2022-02-06 20:14:41 -05:00
Yunsong Wang
e7c29c1c1b Update docs 2022-02-06 19:34:57 -05:00
Yunsong Wang
a2a12c689c Update docs/benchmarks.md
Co-authored-by: Jake Hemstad <jhemstad@nvidia.com>
2022-02-06 19:31:20 -05:00
Yunsong Wang
33a896f99e Update copyright year 2022-02-04 17:25:50 -05:00
Yunsong Wang
76cbbcc8f9 Update benchmarks.md 2022-02-04 17:20:40 -05:00
Yunsong Wang
470beda9f0 Add nvbench::state stream tests 2022-02-04 16:55:29 -05:00
Yunsong Wang
439ffec1c8 Minor correction 2022-02-04 16:35:55 -05:00
Yunsong Wang
86708ec793 Fix a stream destroy bug 2022-02-04 16:03:52 -05:00
Yunsong Wang
14eab0774a Update measure_* classes to construct launch from the state cuda stream 2022-02-04 14:16:43 -05:00
Yunsong Wang
c510a0e78c Update launch to hold a const ref of nvbench::cuda_stream 2022-02-04 13:56:02 -05:00
Yunsong Wang
8aea3e467e Add a cuda stream member to nvbench::state 2022-02-04 13:51:30 -05:00
Yunsong Wang
15f2e92fdf Add owning and non-owning semantics to nvbench::cuda_stream 2022-02-04 13:26:00 -05:00
Allison Vacanti
b1b6d73afa Merge pull request #74 from S-o-T/fix_for_fmt8
Add missing formatter
2022-01-28 11:48:26 -05:00
Mark Shachkov
c9b1bdaf00 Cast axis_type to string prior to formatting 2022-01-28 10:11:21 +03:00
Allison Vacanti
a72f248af6 Require the NVBench package in test_export testing. 2022-01-19 15:42:26 -05:00
Allison Vacanti
aa64dac60f More updates to CI config. 2022-01-17 21:54:07 -05:00
Allison Vacanti
542a10e843 Bump default image to gcc9. 2022-01-17 17:59:07 -05:00
Allison Vacanti
6fe34737de Merge pull request #72 from allisonvacanti/gpuci
Add gpuCI metafiles.
2022-01-17 17:30:09 -05:00
Allison Vacanti
15371eb958 Add gpuCI metafiles. 2022-01-17 17:28:10 -05:00
Allison Vacanti
66ae331452 Update .gitignore 2022-01-17 17:02:53 -05:00
Allison Vacanti
a06a7c668c Merge pull request #70 from allisonvacanti/walltime_reports
Python / JSON updates
2022-01-13 17:13:24 -05:00
Allison Vacanti
39ffc84ee3 Merge pull request #69 from PointKernel/remove-unused-parameter-error
Get rid of unused parameter errors
2022-01-12 10:43:38 -05:00
Allison Vacanti
12df504bb6 Fix progress display for multiple device runs. 2022-01-11 17:58:05 -05:00
Allison Vacanti
8ba2cf1395 Update compare script and test files for new JSON. 2022-01-11 17:55:36 -05:00
Yunsong Wang
d1c82d00bf Get rid of unused parameter error 2022-01-11 17:20:17 -05:00
Allison Vacanti
348acbd6eb Use experimental/filesystem on GCC. 2022-01-11 17:19:55 -05:00
Allison Vacanti
a6925f3c2b Log commandline args in JSON output. 2022-01-11 16:30:37 -05:00