nvbench

mirror of https://github.com/NVIDIA/nvbench.git synced 2026-03-14 20:27:24 +00:00

Author	SHA1	Message	Date
Oleksandr Pavlyk	c9705de4a4	Reserve enough space clock-rates for min samples, if specified	2026-02-27 12:49:35 -06:00
Oleksandr Pavlyk	998ab125ce	Don't override m_check_throttling if throttling threshold is non-positive measure_cold class now directly inherits m_check_throttling from state. This ensures that when `--jsonbin` is specified frequency data corresponding to timing data are available to write out.	2026-02-20 16:34:53 -06:00
Oleksandr Pavlyk	731e0c2c30	Swapped data members m_sm_clock_rates and m_sm_clock_rate_accumulator This places all std::vector members together. Added default initialization to all std::vector members, and all other members with default constructors. Exceptions are references and nvbench::launch m_launch; member	2026-02-19 15:33:57 -06:00
Oleksandr Pavlyk	4da9f431c0	Templatize write_out_values for different storage formats This could be used to save data as float32_t, or float64_t. This flexibility is useful for experimentation.	2026-02-19 15:32:00 -06:00
Oleksandr Pavlyk	988420b5b1	Use write_out_values utility to save frequencies The utility was already used to save times	2026-02-13 10:19:06 -06:00
Georgy Evtushenko	40b2f4ece2	Better place to stop freq timer?	2026-02-13 09:53:59 -06:00
Georgy Evtushenko	a487a38895	Dump frequencies	2026-02-13 08:49:41 -06:00
Nader Al Awar	dc59f98ecd	Remove cupti from cuda-bench dependencies (#311) python-0.2.0	2026-02-03 14:16:26 -06:00
Bernhard Manfred Gruber	90ad8bcbc7	Merge pull request #296 from bernhardmgruber/compare_sub_results Allow partial comparison in `nvbench_compare.py`	2026-02-03 20:02:34 +01:00
Bernhard Manfred Gruber	c6ef87575c	Allow partial comparison in nvbench_compare.py Fixes: #295	2026-02-03 16:32:11 +01:00
Nader Al Awar	d75fc74162	Merge branch 'main' into remove-cupti-python	2026-02-03 08:58:41 -06:00
Oleksandr Pavlyk	867d5d4276	Merge pull request #294 from oleksandr-pavlyk/add-docstrings	2026-02-03 08:51:55 -06:00
Oleksandr Pavlyk	8a128ed7d9	Merge pull request #309 from oleksandr-pavlyk/support-skipping-batched-runs	2026-02-02 17:57:45 -06:00
Nader Al Awar	4fa4296810	Remove cuda.pathfinder function	2026-02-02 16:43:45 -06:00
Nader Al Awar	f2d5730104	Disable CUPTI in cmake file	2026-02-02 16:03:15 -06:00
Nader Al Awar	6df5fc8c67	Remove cupti from cuda-bench dependencies	2026-02-02 15:37:13 -06:00
Oleksandr Pavlyk	a33a454a2d	Make skip_hot_measurement method const	2026-02-02 14:42:07 -06:00
Oleksandr Pavlyk	f049f10977	Fix typo	2026-02-02 14:41:42 -06:00
Oleksandr Pavlyk	cfb4a9b8b0	Fix for comment grammar	2026-02-02 12:58:15 -06:00
Oleksandr Pavlyk	27d6492355	Factor out check for whether to skip hot measurement to a nvbench::state private method	2026-02-02 12:43:39 -06:00
Oleksandr Pavlyk	cff6df9bb2	Renamed option to --no-batch to stay aligned with tag name	2026-02-02 12:28:39 -06:00
Oleksandr Pavlyk	8ff0557ad8	Replace use of py::handle to store global_registry Use py::gil_safe_call_once_and_store facility pybind11 provides.	2026-02-02 11:55:48 -06:00
Oleksandr Pavlyk	39c29026fd	Move docstrings from PYI file to implementation Added tests that docstrings exist and are not empty. This closes #291	2026-02-02 11:55:48 -06:00
Oleksandr Pavlyk	f1b9d44304	Support --no-batched CLI option The option sets m_skip_batched boolean member in benchmark_base class. Methods `bool get_skip_batched()` and `void set_skip_batched(bool)` added. m_skip_batched is also added to state class. Similarly named methods are added. CLI help file documents `--no-batched` option.	2026-02-02 11:32:57 -06:00
Nader Al Awar	34a089f805	Add 89-real to list of architectures built for cuda-bench (#308)	2026-01-30 13:35:17 -06:00
Nader Al Awar	7b5887a4a6	Add 89-real to list of architectures built	2026-01-30 13:02:42 -06:00
Nader Al Awar	a5ad480dfe	Add installation instructions to `cuda-bench` readme (#307) Add installation instructions to `cuda-bench` readme	2026-01-30 10:02:56 -06:00
Nader Al Awar	edf0b80599	Add installation instructions	2026-01-30 09:32:44 -06:00
Nader Al Awar	a29748316d	Fix pypi url to publish wheel (#306) Fix pypi url to publish wheel python-0.1.0	2026-01-29 16:03:48 -06:00
Nader Al Awar	bd775c8c14	Use inputs.component for consistency with cuda-cccl	2026-01-29 15:10:46 -06:00
Nader Al Awar	a8e8e176e9	Fix pypi url to publish wheel	2026-01-29 14:57:48 -06:00
Nader Al Awar	f66f76731c	Replace all occurrences of pynvbench with cuda-bench (#305)	2026-01-29 14:13:44 -06:00
Nader Al Awar	fa1eed69c0	Rename test file to refer to cuda_bench	2026-01-29 13:53:29 -06:00
Nader Al Awar	c14a016e40	Replace a few more occurrences	2026-01-29 13:32:09 -06:00
Nader Al Awar	711c1e2eb1	Replace all occurrences of pynvbench with cuda-bench	2026-01-29 13:25:17 -06:00
Nader Al Awar	5e7adc5c3f	Build multi architecture cuda wheels (#302) * Add cuda architectures to build wheel for * Package scripts in wheel * Separate cuda major version extraction to fix architecture selection logic * Add back statement printing cuda version * [pre-commit.ci] auto code formatting --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2026-01-29 01:13:24 +00:00
Ashwin Srinath	a681e2185d	Add multi-cuda wheel build (#289) Co-authored-by: Ashwin Srinath <shwina@users.noreply.github.com> Co-authored-by: Nader Al Awar <naderalawar@gmail.com>	2026-01-28 10:37:55 -05:00
Oleksandr Pavlyk	f3fa93f388	Merge pull request #290 from oleksandr-pavlyk/debug/outstanding-changes Make python nvbench benchmarks interruptible	2026-01-23 15:39:23 -06:00
Bernhard Manfred Gruber	2d4690e07d	Merge pull request #298 from bernhardmgruber/ignore_device Allow to by-pass device section check and compare different devices	2025-12-10 18:24:26 +01:00
Bernhard Manfred Gruber	85548809d6	Allow to by-pass device section check and compare different devices Fixes: #297	2025-12-10 13:14:50 +01:00
Oleksandr Pavlyk	f6a9b245d3	Only trigger skipping of outstanding benchmarks on KeyboardInterrupt exception; on other exceptions the benchmark is to continue execution	2025-12-08 14:46:59 -06:00
Oleksandr Pavlyk	7e9a9a8983	Replace main_arg_run_benchmarks with run_interriptible This loop uses benchmark.run_or_skip to resolve #284 even for scripts that contain more than one benchmark, or when a script with a single benchmark is executed when more than one device is available.	2025-12-08 14:29:27 -06:00
Oleksandr Pavlyk	8e6154511e	Introduce runner->run_or_skip(bool &) and benchmark->run_or_skip(bool &) These methods take reference to a boolean whose value signals whether benchmark instances pending for execution are to be skipped. void benchmark->run_or_skip(bool &) is called by Python to ensure that KeyboardInterrupt is properly handled in scripts that contain multiple benchmarks, or in case when single benchmark script is executed on a machine with more than one device.	2025-12-08 14:24:32 -06:00
Oleksandr Pavlyk	a7763bdd7a	Remove debug outputs	2025-12-08 12:25:31 -06:00
Oleksandr Pavlyk	b2a80c92b8	Revert "Scripts to triage 284" This reverts commit `c286199adc`.	2025-12-08 11:53:08 -06:00
Oleksandr Pavlyk	ce9a76167f	Use nvbench::stop_runner_loop to signal stop of runner loop Add try/catch around Python calls to improve keyboard interrupt response.	2025-12-05 19:38:11 -06:00
Oleksandr Pavlyk	e57f1ecf4c	Introduce nvbench::stop_runner_loop exception. If application throws it, runner loop is stopped and other pending benchmark instances are skipped	2025-12-05 19:38:11 -06:00
Oleksandr Pavlyk	c286199adc	Scripts to triage 284	2025-12-05 19:38:11 -06:00
Oleksandr Pavlyk	de471e1d42	Use pybind11==3.0.1, do not use pybind11_add_module	2025-12-05 19:38:11 -06:00
Jerry Hou	f651636501	entropy criterion optimizations (#286) * entropy criterion optimizations * online linear regression module * online regression refactor * revising ss_tot handling --------- Co-authored-by: Jerry Hou <jerryhou@fb.com>	2025-12-06 01:02:21 +00:00

721 Commits