nvbench

mirror of https://github.com/NVIDIA/nvbench.git synced 2026-05-12 09:15:47 +00:00

Author	SHA1	Message	Date
Oleksandr Pavlyk	a3364ca5c7	Port changes to the package from #323 (#337) Fixed relative text alignment in docstrings to fix autodoc warnings Renamed cuda.bench.test_cpp_exception and cuda.bench.test_py_exception functions to start with underscore, signaling that these functions are internal and should not be documented Account for test_cpp_exceptions -> _test_cpp_exception, same for _py_ Make sure to reset __module__ of reexported symbols to be cuda.bench	2026-04-22 08:28:15 -05:00
Oleksandr Pavlyk	b0a46f44c2	Modularize color handling (#336) * Introduce function colorize to modularize colorization/no-color handling * Use sns.set_theme instead of deprecated sns.set() * Use str.format instead of legacy % syntax * Simplified iteration over list Use f-string (supported since Python 3.6) instead of str.format for better readability and performance	2026-04-14 08:09:44 -05:00
Nader Al Awar	373970323f	Merge pull request #331 from oleksandr-pavlyk/update-python-examples Update python examples	2026-04-02 15:20:24 -04:00
Oleksandr Pavlyk	39730efbc3	Update requirements to reflect packages used by examples	2026-04-02 10:37:17 -05:00
Oleksandr Pavlyk	9f75642387	Add patch to cutlass.base_dsl.dsl.BaseDSL to work-around a bug See https://github.com/NVIDIA/cutlass/issues/3142	2026-04-02 10:29:31 -05:00
Nader Al Awar	7a68e53df0	Rename flag from markdown to no-color	2026-04-01 17:01:29 -05:00
Nader Al Awar	7e5e784855	Add --markdown flag to nvbench_compare.py which can be used for github issues/prs	2026-04-01 14:53:13 -05:00
Oleksandr Pavlyk	93bc59d05c	Renamed CUTLASS example to reflect that it uses CuteDSL	2026-04-01 08:24:29 -05:00
Oleksandr Pavlyk	e4cfddeb87	Rewrote cutlass_gemm example to use CuteDSL	2026-04-01 08:23:41 -05:00
Oleksandr Pavlyk	3f284b4004	Renamed cccl_* examples cccl_parallel_* -> cuda_compute_* cccl_cooperative_* -> cuda_coop_*	2026-04-01 08:20:20 -05:00
Oleksandr Pavlyk	5bdb30f4b6	Update to cccl_parallel_segmented_reduce example per changes in API Update namespace changes. Use make_segmented_reduce factory function, and update call signatures.	2026-04-01 08:18:15 -05:00
Oleksandr Pavlyk	d8739fc208	Update to cccl_cooperative_block_reduce example	2026-04-01 08:17:52 -05:00
Oleksandr Pavlyk	974eb5ee0f	Replace use of cupy.cuda.ExternalStream with cupy.cuda.Stream.from_external	2026-04-01 08:17:12 -05:00
Oleksandr Pavlyk	7c60edcc0a	cuda.core.experimental -> cuda.core	2026-04-01 08:16:04 -05:00
Oleksandr Pavlyk	836a6c12f4	Merge pull request #326 from oleksandr-pavlyk/fix-sfinae-incomplete Fix GCC16 sfinae incomplete warnings. GCC16 started requiring that the type `T` used in `std::reference_wrapper<T>` is complete when using `-std=c++17`. Since NVBench has to forward declare some types in header files to break circular dependency, use of incomplete type breaks build due to use of `-Werror` flag due to `-Wsfinae-incomplete` warning emitted by GCC16. This commit replaced affected uses of `std::reference_wrapper<const nvbench::benchmark_base>` in state.cxx, and `std::reference_wrapper<nvbench::printer_base>` in benchmark_base.cxx with raw pointers.	2026-03-24 16:02:28 -05:00
Bernhard Manfred Gruber	4164909c52	Feedback	2026-02-28 01:19:18 +01:00
Bernhard Manfred Gruber	0abc8ec82b	Extend nvbench_compare.py with `--plot`, axis/benchmark filtering, and dark mode Co-authored-by: Oleksandr Pavlyk <21087696+oleksandr-pavlyk@users.noreply.github.com>	2026-02-27 11:06:20 +01:00
Bernhard Manfred Gruber	800f640c20	Apply reviewer feedback	2026-02-26 19:23:51 +01:00
Bernhard Manfred Gruber	d3a0bec4a8	Feedback from review	2026-02-05 14:13:16 +01:00
Bernhard Manfred Gruber	28ed32bb47	Implement dark mode using style sheets	2026-02-05 14:00:33 +01:00
Bernhard Manfred Gruber	ec9759037d	I have no idea what I am doing	2026-02-05 11:15:27 +01:00
Bernhard Manfred Gruber	ccde9fc4d4	More	2026-02-05 10:56:36 +01:00
Bernhard Manfred Gruber	0be190b407	Add a script to plot benchmark results	2026-02-05 10:36:52 +01:00
Nader Al Awar	dc59f98ecd	Remove cupti from cuda-bench dependencies (#311)	2026-02-03 14:16:26 -06:00
Bernhard Manfred Gruber	c6ef87575c	Allow partial comparison in nvbench_compare.py Fixes: #295	2026-02-03 16:32:11 +01:00
Nader Al Awar	d75fc74162	Merge branch 'main' into remove-cupti-python	2026-02-03 08:58:41 -06:00
Nader Al Awar	4fa4296810	Remove cuda.pathfinder function	2026-02-02 16:43:45 -06:00
Nader Al Awar	f2d5730104	Disable CUPTI in cmake file	2026-02-02 16:03:15 -06:00
Nader Al Awar	6df5fc8c67	Remove cupti from cuda-bench dependencies	2026-02-02 15:37:13 -06:00
Oleksandr Pavlyk	8ff0557ad8	Replace use of py::handle to store global_registry Use py::gil_safe_call_once_and_store facility pybind11 provides.	2026-02-02 11:55:48 -06:00
Oleksandr Pavlyk	39c29026fd	Move docstrings from PYI file to implementation Added tests that docstrings exist and are not empty. This closes #291	2026-02-02 11:55:48 -06:00
Nader Al Awar	edf0b80599	Add installation instructions	2026-01-30 09:32:44 -06:00
Nader Al Awar	fa1eed69c0	Rename test file to refer to cuda_bench	2026-01-29 13:53:29 -06:00
Nader Al Awar	711c1e2eb1	Replace all occurrences of pynvbench with cuda-bench	2026-01-29 13:25:17 -06:00
Nader Al Awar	5e7adc5c3f	Build multi architecture cuda wheels (#302) * Add cuda architectures to build wheel for * Package scripts in wheel * Separate cuda major version extraction to fix architecture selection logic * Add back statement printing cuda version * [pre-commit.ci] auto code formatting --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2026-01-29 01:13:24 +00:00
Ashwin Srinath	a681e2185d	Add multi-cuda wheel build (#289) Co-authored-by: Ashwin Srinath <shwina@users.noreply.github.com> Co-authored-by: Nader Al Awar <naderalawar@gmail.com>	2026-01-28 10:37:55 -05:00
Oleksandr Pavlyk	f6a9b245d3	Only trigger skipping of outstanding benchmarks on KeyboardInterrupt exception, on others benchmark is to continue execution	2025-12-08 14:46:59 -06:00
Oleksandr Pavlyk	7e9a9a8983	Replace main_arg_run_benchmarks with run_interriptible This loop uses benchmark.run_or_skip to resolve #284 even for scripts that contain more than one benchmark, or when a script with a single benchmark is executed when more than one device is available.	2025-12-08 14:29:27 -06:00
Oleksandr Pavlyk	a7763bdd7a	Remove debug outputs	2025-12-08 12:25:31 -06:00
Oleksandr Pavlyk	b2a80c92b8	Revert "Scripts to triage 284" This reverts commit `c286199adc`.	2025-12-08 11:53:08 -06:00
Oleksandr Pavlyk	ce9a76167f	Use nvbench::stop_runner_loop to signal stop of runner loop Add try/catch around Python calls to improve keyboard interrupt response.	2025-12-05 19:38:11 -06:00
Oleksandr Pavlyk	c286199adc	Scripts to triage 284	2025-12-05 19:38:11 -06:00
Oleksandr Pavlyk	de471e1d42	Use pybind11==3.0.1, do not use pybind11_add_module	2025-12-05 19:38:11 -06:00
Ashwin Srinath	77b7afc3c9	Remove the Python version file	2025-12-03 16:23:14 -05:00
Ashwin Srinath	29389b5791	Initial wheel build and publishing infrastructure	2025-12-03 10:15:32 -05:00
Asher Mancinelli	e91559edf0	Update README.md	2025-11-14 14:34:18 -08:00
Jaya Venkatesh	0f997271f7	added numba-cuda to requirements Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>	2025-09-16 14:54:08 -07:00
Jaya Venkatesh	bfa6a6c7c6	remove pynvjitlink references in examples Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>	2025-09-08 16:00:19 -07:00
Oleksandr Pavlyk	b5e4b4ba31	cuda.nvbench -> cuda.bench Per PR review suggestion: - `cuda.parallel` - device-wide algorithms/Thrust - `cuda.cooperative` - Cooperative algorithms/CUB - `cuda.bench` - Benchmarking/NVBench	2025-08-04 13:42:43 -05:00
Oleksandr Pavlyk	c2a2acc9b6	Change float64_t arg-type for set_throttle_threshold to float32_t The C++ method signature of set_throttle_threshold/set_trottle_recovery_delay, which uses nvbench::float32_t	2025-08-04 12:14:52 -05:00

126 Commits