nvbench

mirror of https://github.com/NVIDIA/nvbench.git synced 2026-04-19 22:38:52 +00:00

Author	SHA1	Message	Date
Oleksandr Pavlyk	9a91b9ef0c	Reworked cupti_profiler to use Host + Range Profiler APIs end-to-end (#327 ) * Reworked cupti_profiler to use Host + Range Profiler APIs end-to-end NVPW_* API has been deprecated since CTK 13.0. Followed advice in compilation message to replace NVPW_* API with CUPTI Profiler Host API. `libnvbench.so` no longer links to `nvperf_host` directly, only to `libcupti`. NVBench uses the CUPTI Host API to build a config image from metric names, and the Range Profiler API to collect and decode counters. The host API never collects data directly; it prepares and evaluates data produced by range profiling. Introduce `host_impl`/`profiler_init_guard` to manage CUPTI Host object and initialization/deinitialization, including safe move-assignment cleanup. `profiler_init_guard` initializes profiler, and throws if CUPTI returns an error code. `profiler_init_guard::finalize_profiler()` de-inits profiler and returns the error code. Destructor calls finalize_profiler, but ignores the status code. If user wants to explicitly de-initialize profiler and handle the error, he/she is advised to call `finalize_profiler()` directly. The guard has a boolean member variable to allow destructor to work even if user explicitly called finalize_profiler() method. The old counter-data prefix/scratch flow was replaced with the Range Profiler counter data image sizing/initialization path and decode flow. Host API metric filtering (base metrics + context scope) and Host-side evaluation to GPU values via cuptiProfilerHostEvaluateToGpuValues are implemented. - Host object: `host_impl::object` in `nvbench/cupti_profiler.cxx`. - Range profiler object: `host_impl::range_profiler_object`. - Config image: `m_config_image`. - Counter data image: `m_data_image`. 1) Host init + config image - `initialize_profiler_host()` creates the host object. - `initialize_config_image_host()` adds metrics and builds the config image. 2) Range profiler enable + counter data image - `enable_range_profiler()` creates the range profiler object. - `initialize_counter_data_image()` sizes and initializes the data image using the range profiler object, matching the CUPTI samples. 3) Config + collect + decode - `set_range_profiler_config()` binds the config image + data image. - `start_user_loop()` / `stop_user_loop()` push/pop the user range and start/stop the range profiler. - `process_user_loop()` decodes counter data via `cuptiRangeProfilerDecodeData()`. 4) Evaluate metrics - `get_counter_values()` calls `cuptiProfilerHostEvaluateToGpuValues()` to convert counter data into metric values. The * Use class instead of struct in profiler_init_guard; forward declaration * Add SFINAE guards before accessing members not present in earlier CTK versions * Check if cupti_profiler_host.h exists, use old/new implementation based on that check 1. Reintroduced legacy `cupti_profiler_nvpw.cuh` and `cupti_profiler_nvpw.cuh`. 2. Moved profiler-host-API implementation to `cupti_profiler_host.cuh`, `cupti_profiler_host.cxx`. 3. Add `nvbench/cupti_profiler.cuh` which checks if `cupti_profiler_host.h` header is known and includes `cupti_profiler_host.cuh` or `cupti_profiler_nvpw.cuh` respectively. 4. In cmake, we check if ${nvbench_cupti_root}/include/cupti_profiler_host.h file exists. If it does not, `libnvbench.so` would have dependency on libnvperf_host and libnvperf_target in addition to dependency on libcupti. If the header exists, it would only depend on libcupti	2026-03-23 11:51:16 -04:00
Allison Piper	9b133a94bc	Remove GLOBAL tags from fmt targets. (#281 ) Fixes #279.	2025-10-21 11:16:44 -04:00
Allison Piper	e6283df79c	Build native arch by default, update rapids-cmake. (#280 ) * Build native arch by default, update rapids-cmake. * Add check that CXX and CUDA_HOST compiler match. Similar to CCCL, we need these to match to ensure that our warning flag detection functions properly. * GCC only recognizes `unused-local-typedefs`. Clang recognizes both. Ensure that we set this for both compilers.	2025-10-21 10:41:36 -04:00
Allison Piper	8e3e0ad117	Include RAPIDS.cmake to WAR network issues on CI. (#236 ) See also https://github.com/rapidsai/rmm/pull/1886	2025-06-24 17:03:30 -04:00
Oleksandr Pavlyk	b1551d2eb7	Update json and fmt projects to latest versions (#229 )	2025-05-27 12:49:35 -04:00
Allison Piper	26f52a7175	Add cupti paths to INSTALL_RPATH. (#230 )	2025-05-22 12:56:22 -04:00
Allison Piper	109449438b	Use the new(ish) PREFIX option of rapids-cmake version, git revision header utils. Generate macros prefixed with NVBENCH instead of redefining them from NVBench.	2025-04-15 20:33:06 +00:00
Allison Piper	3440855dbd	Formatting updates.	2025-04-14 17:26:12 +00:00
Allison Piper	93ea533fd3	Drop support for MSVC.	2025-04-04 22:17:03 +00:00
Allison Piper	7d210614f5	Attempt to suppress system include warnings on MSVC.	2025-04-04 17:44:33 -04:00
Allison Piper	15d34106d4	Disable unicode in fmtlib on nvcc + msvc. This doesn't appear to be supported.	2025-04-04 17:44:33 -04:00
Allison Piper	8478f7d0bf	Guard fmt def behind nvcc check.	2025-04-04 17:44:33 -04:00
Allison Piper	5f6f8a65ee	Enable /utf-8 on MSVC.	2025-04-04 17:44:33 -04:00
Allison Piper	0e8089a246	Disable fmtlib's use of llvm _BitInt, as it is not supported when using nvcc.	2025-04-04 17:44:33 -04:00
Allison Piper	e6705e3114	Update fmtlib/fmt to 11.1.4. Switched away from the rapids-cmake provided version and manually CPM'd it. rapids-cmake will stop providing fmtlib later this year, and the version currently supported is rather old. Included the same logic that rapids-cmake currently uses to hopefully provide a smooth transition for edge cases (external fmt, etc). Added `FMT_SYSTEM_HEADERS=ON` to mark fmt headers as system includes, suppressing any internal warnings.	2025-04-04 17:44:33 -04:00
Allison Piper	5aa5a3c225	Update rapids-cmake to 25.04.	2025-04-04 17:44:33 -04:00
Allison Piper	60761e0946	Enable extra NVBench features in windows build. (#169 ) * Enable extra NVBench features in windows build. These were delayed as they required changes to the devcontainers. * Revamp nvml.dll logic.	2024-04-10 13:45:53 -04:00
Allison Piper	a0f2fab72b	Squashed commit of the following: commit `c5b2fc0a8b` Author: Allison Piper <alliepiper16@gmail.com> Date: Sat Apr 6 21:48:20 2024 +0000 Add supported compilers and tools in README.md. commit `92fe366da5` Author: Allison Piper <alliepiper16@gmail.com> Date: Sat Apr 6 20:45:30 2024 +0000 Fix issues discovered by header tests. commit `f7f6c92143` Author: Allison Piper <alliepiper16@gmail.com> Date: Sat Apr 6 20:45:06 2024 +0000 Setup header tests, add C++20 header tests + examples. The core library will always be built with C++17, but we test our headers / examples under 17 and 20. commit `4b24f26b66` Author: Allison Piper <alliepiper16@gmail.com> Date: Sat Apr 6 16:21:42 2024 +0000 Pass CUDA FLAGS to install tests. commit `4fb672ae91` Author: Allison Piper <alliepiper16@gmail.com> Date: Sat Apr 6 15:43:41 2024 +0000 Add newer GCC (13) and Clang (17, 18).	2024-04-06 22:05:40 +00:00
Allison Piper	e8c8877d36	Squashed commit of the following: commit `4b309e6ad8` Author: Allison Piper <alliepiper16@gmail.com> Date: Sat Apr 6 13:19:14 2024 +0000 Minor cleanups commit `476ed2ceae` Author: Allison Piper <alliepiper16@gmail.com> Date: Sat Apr 6 12:53:37 2024 +0000 WAR compiler ice in nlohmann json. Only seeing this on GCC 9 + CTK 11.1. Seems to be having trouble with the `[[no_unique_address]]` optimization. commit `a9bf1d3e42` Author: Allison Piper <alliepiper16@gmail.com> Date: Sat Apr 6 00:24:47 2024 +0000 Bump nlohmann json. commit `80980fe373` Author: Allison Piper <alliepiper16@gmail.com> Date: Sat Apr 6 00:22:07 2024 +0000 Fix llvm filesystem support commit `f6099e6311` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 23:18:44 2024 +0000 Drop MSVC 2017 testing. commit `5ae50a8ef5` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 23:02:32 2024 +0000 Add more missing headers. commit `b2a9ae04d9` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 22:37:56 2024 +0000 Remove old CUDA+MSVC builds and make windows build-only. commit `5b18c26a28` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 22:37:07 2024 +0000 Fix header for std::min/max. Why do I always think it's utility instead of algorithm.... commit `6a409efa2d` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 22:18:18 2024 +0000 Temporarily disable CUPTI on all windows builds. commit `f432f88866` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 21:42:52 2024 +0000 Fix warnings on MSVC. commit `829787649b` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 21:03:16 2024 +0000 More flailing about in powershell. commit `21742e6bea` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 20:36:08 2024 +0000 Cleanup filesystem header handling. commit `de3d202635` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 20:09:00 2024 +0000 Windows CI debugging. commit `a4151667ff` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 19:45:40 2024 +0000 Quotation mark madness commit `dd04f3befe` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 19:27:27 2024 +0000 Temporarily disable NVML on windows CI until new containers are ready. commit `f3952848c4` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 19:25:22 2024 +0000 WAR issues on gcc-7. commit `198986875e` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 19:25:04 2024 +0000 More matrix/devcontainer updates. commit `b9712f8696` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 18:30:35 2024 +0000 Fix windows build scripts. commit `943f268280` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 18:18:33 2024 +0000 Fix warnings with clang host compiler. commit `7063e1d60a` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 18:14:28 2024 +0000 More devcontainer hijinks. commit `06532fde81` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 17:51:25 2024 +0000 More matrix updates. commit `78a265ea55` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 17:34:00 2024 +0000 Support CLI CMake options for windows ci scripts. commit `670895c867` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 17:31:59 2024 +0000 Add missing devcontainers. commit `b121823e74` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 17:22:54 2024 +0000 Build for `all-major` architectures in presets. We can get away with this because we require CMake 3.23.1. This was added in 3.23. commit `fccfd44685` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 17:22:08 2024 +0000 Update matrix file. commit `e7d43ba90e` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 16:23:48 2024 +0000 Consolidate build/test jobs. commit `c4044056ec` Author: Allison Piper <alliepiper16@gmail.com> Date: Fri Apr 5 16:04:11 2024 +0000 Add missing build script.	2024-04-06 13:56:10 +00:00
Allison Piper	04b70059b8	Setup clangd compile commands output.	2024-04-05 15:08:04 +00:00
Allison Piper	eb5940c64f	Add initial CI, presets, and devcontainers.	2024-04-04 21:42:43 +00:00
Robert Maynard	adaef09b20	Support static builds of nvbench with nvml enabled. To do this we need to ensure that the nvml init handler is both contained in the library/executable that uses nvbench. The original implementation fails since the singleton can be dropped since it has no usages. So instead we move to a function static which we ensure will always be used.	2023-11-14 14:08:10 -05:00
Robert Maynard	e47d7ac354	write_git_revision_file must be used in same CMakeLists as consumer (#143 ) * write_git_revision_file must be used in same CMakeLists as consumer So we can't have this in the rapids-cmake init function. * Fix whitespace damage --------- Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>	2023-10-19 06:52:17 +02:00
Robert Maynard	282cee0f3a	NVBench now supports not installing itself	2023-10-17 15:09:39 -04:00
Robert Maynard	0eab168664	Support users which want static builds of nvbench (#140 )	2023-10-17 13:55:30 -04:00
Georgy Evtushenko	013d266974	Fix gnu line marker warning	2023-06-28 17:03:31 +04:00
Robert Maynard	b8b5d2904b	Handle use case where we are in a conda env but with a static fmt lib	2023-05-31 10:51:40 -04:00
Robert Maynard	b8739b6fe6	Update nvbench default fmt to be built to be 9.1.0 The formatting of `{}` can be incorrect under 7.X when given doubles and compiled with the latest conda toolchain. While both fmt 8 and 9 don't show this issue move to the latest version to leverage all the improvements in fmt 9. Fixes #103	2022-12-16 15:04:49 -05:00
Vyas Ramasubramani	a5ffad1e8d	Downgrade fmt version again.	2022-11-08 11:13:45 -08:00
Vyas Ramasubramani	ae6ede15d6	Fix warning.	2022-11-03 13:59:17 -07:00
Vyas Ramasubramani	a3b729bca8	fmt::memory_buffer is no longer an iterator.	2022-11-03 10:04:02 -07:00
Robert Maynard	8919728d32	Update to latest version of rapids Also ensure that we don't clobber any existing rapids.cmake file	2022-08-08 13:24:29 -04:00
Jonas Hahnfeld	449cd4e275	Allow using local nlohmann_json installation Use the nlohmann_json::nlohmann_json if available, otherwise fall back to add the downloaded headers. Closes #19	2022-08-05 09:57:56 +02:00
Allison Vacanti	348acbd6eb	Use experimental/filesystem on GCC.	2022-01-11 17:19:55 -05:00
Allison Vacanti	288b1564e0	Suppress warnings on MSVC Debug builds. Also moved the config.cuh.in template into the source directory where it'll be easier to find.	2021-12-21 19:35:23 -05:00
Allison Vacanti	edf2018fd7	Merge pull request #58 from allisonvacanti/nvbench_executable Add an `nvbench-ctl` executable.	2021-12-21 12:08:39 -05:00
Allison Vacanti	20522c807d	Add an `nvbench-ctl` executable. This will provide functionality such as clock locking (--lgm), persistence mode (--pm), device querying (--list), version checking (--version), and documentation (--help). This is possible already with any nvbench executable, but having one with a reliable name will be helpful for scripting and writing documentation.	2021-12-21 12:02:07 -05:00
Allison Vacanti	61d094abf1	Add cupti path for ubuntu packages. Fixes #59	2021-12-20 14:34:12 -05:00
Allison Vacanti	5d70492714	Enable more warning flags. - /W4 on MSVC - -Wall -Wextra + others on gcc/clang - New NVBench_ENABLE_WERROR option to toggle "warnings as errors" - Mark the nlohmann_json library as IMPORTED to switch to system includes - Rename nvbench_main -> nvbench.main to follow target name conventions - Explicitly suppress some cudafe warnings when compiling templates in nlohmann_json headers. - Explicitly suppress some warnings from Thrust headers. - Various fixes for warnings exposed by new flags. - Disable CUPTI on CTK < 11.3 (See #52).	2021-12-18 20:13:25 -05:00
Georgy Evtushenko	1bc715267c	CUPTI support	2021-12-18 12:03:52 +03:00
Allison Vacanti	b948e79cab	Add NVML support for persistence mode, locking clocks. Locking clocks is currently only implemented for Volta+ devices. Example usage: my_bench -d [0,1,3] --persistence-mode 1 --lock-gpu-clocks base See the cli_help.md docs for more info.	2021-12-17 13:59:43 -05:00
Allison Vacanti	d0c90ff920	Build static fmtlib with -fPIC.	2021-12-15 12:54:53 -05:00
Allison Vacanti	7c740975dd	Force fmt to build static libs. Otherwise it shows up in our export set when a parent project enables BUILD_SHARED_LIBS	2021-10-28 12:39:14 -04:00
Allison Vacanti	f984efdc26	Don't explicitly link with cudart. This is implicitly added by nvcc, and the explicit setting was breaking environments where cudart_static is unavailable, e.g. conda.	2021-10-27 12:13:32 -04:00
Allison Vacanti	b2d37c21fd	Add export tests.	2021-10-20 14:02:16 -04:00
Allison Vacanti	ef36d3a558	Port to rapids-cmake. - Add export sets - Add install rules - Remove manual CPM import, port to rapids_cpm_, etc - Organize CMake code into cmake/.cmake files. - NVBench is now a shared library.	2021-10-20 14:02:16 -04:00
Robert Maynard	9cfb15be8b	document why we are patching json.hpp	2021-07-09 11:28:43 -04:00
Robert Maynard	237363784a	Patch nlohmann json to avoid a compiler bug in nvcc 11.0 Fixes #18	2021-07-08 16:52:52 -04:00
Allison Vacanti	8b7a2e86b8	Avoid recompiling option_parser every time cmake runs. Switch to `configure_file`, which won't touch the output file unless the contents change.	2021-03-18 16:07:40 -04:00
Allison Vacanti	8afba6c86a	Use CMake to generate --help strings from markdown docs.	2021-03-05 16:37:18 -05:00


51 Commits