nvbench

mirror of https://github.com/NVIDIA/nvbench.git synced 2026-03-14 20:27:24 +00:00

Author	SHA1	Message	Date
Oleksandr Pavlyk	cfb4a9b8b0	Fix for comment grammar	2026-02-02 12:58:15 -06:00
Oleksandr Pavlyk	27d6492355	Factor out check for whether to skip hot measurement to a nvbench::state private method	2026-02-02 12:43:39 -06:00
Oleksandr Pavlyk	cff6df9bb2	Renamed option to --no-batch to stay aligned with tag name	2026-02-02 12:28:39 -06:00
Oleksandr Pavlyk	f1b9d44304	Support --no-batched CLI option The option sets m_skip_batched boolean member in benchmark_base class. Methods `bool get_skip_batched()` and `void set_skip_batched(bool)` added. m_skip_batched is also added to state class. Similarly named methods are added. CLI help file documents `--no-batched` option.	2026-02-02 11:32:57 -06:00
Oleksandr Pavlyk	f3fa93f388	Merge pull request #290 from oleksandr-pavlyk/debug/outstanding-changes Make python nvbench benchmarks interruptible	2026-01-23 15:39:23 -06:00
Bernhard Manfred Gruber	2d4690e07d	Merge pull request #298 from bernhardmgruber/ignore_device Allow to by-pass device section check and compare different devices	2025-12-10 18:24:26 +01:00
Bernhard Manfred Gruber	85548809d6	Allow to by-pass device section check and compare different devices Fixes: #297	2025-12-10 13:14:50 +01:00
Oleksandr Pavlyk	f6a9b245d3	Only trigger skipping of outstanding benchmarks on KeyboardInterrupt exception; on other exceptions the benchmark is to continue execution	2025-12-08 14:46:59 -06:00
Oleksandr Pavlyk	7e9a9a8983	Replace main_arg_run_benchmarks with run_interriptible This loop uses benchmark.run_or_skip to resolve #284 even for scripts that contain more than one benchmark, or when a script with a single benchmark is executed when more than one device is available.	2025-12-08 14:29:27 -06:00
Oleksandr Pavlyk	8e6154511e	Introduce runner->run_or_skip(bool &) and benchmark->run_or_skip(bool &) These methods take reference to a boolean whose value signals whether benchmark instances pending for execution are to be skipped. void benchmark->run_or_skip(bool &) is called by Python to ensure that KeyboardInterrupt is properly handled in scripts that contain multiple benchmarks, or in case when single benchmark script is executed on a machine with more than one device.	2025-12-08 14:24:32 -06:00
Oleksandr Pavlyk	a7763bdd7a	Remove debug outputs	2025-12-08 12:25:31 -06:00
Oleksandr Pavlyk	b2a80c92b8	Revert "Scripts to triage 284" This reverts commit `c286199adc`.	2025-12-08 11:53:08 -06:00
Oleksandr Pavlyk	ce9a76167f	Use nvbench::stop_runner_loop to signal stop of runner loop Add try/catch around Python calls to improve keyboard interrupt response.	2025-12-05 19:38:11 -06:00
Oleksandr Pavlyk	e57f1ecf4c	Introduce nvbench::stop_runner_loop exception. If application throws it, runner loop is stopped and other pending benchmark instances are skipped	2025-12-05 19:38:11 -06:00
Oleksandr Pavlyk	c286199adc	Scripts to triage 284	2025-12-05 19:38:11 -06:00
Oleksandr Pavlyk	de471e1d42	Use pybind11==3.0.1, do not use pybind11_add_module	2025-12-05 19:38:11 -06:00
Jerry Hou	f651636501	entropy criterion optimizations (#286) * entropy criterion optimizations * online linear regression module * online regression refactor * revising ss_tot handling --------- Co-authored-by: Jerry Hou <jerryhou@fb.com>	2025-12-06 01:02:21 +00:00
Ashwin Srinath	a6995413ac	Merge pull request #288 from shwina/wheel-build-and-publish-infra Initial wheel build and publishing infrastructure	2025-12-04 04:37:07 -05:00
Ashwin Srinath	1d33536ce1	Re-enable other CI jobs	2025-12-03 16:42:30 -05:00
Ashwin Srinath	603a2df445	Remove workaround	2025-12-03 16:23:42 -05:00
Ashwin Srinath	77b7afc3c9	Remove the Python version file	2025-12-03 16:23:14 -05:00
Ashwin Srinath	3af11c8ee7	Expand the CI matrix back	2025-12-03 15:48:40 -05:00
Ashwin Srinath	cadfa7de61	We no longer need to install libnvidia-ml.so	2025-12-03 15:37:20 -05:00
Ashwin Srinath	7ad064ea4f	Change to GPU runner for testing	2025-12-03 15:18:39 -05:00
Ashwin Srinath	b7eaf44ca3	Install libnvidia-ml.so.1 in test environment	2025-12-03 14:56:37 -05:00
Ashwin Srinath	c2c34c9378	Temporarily reduce CI matrix	2025-12-03 14:37:23 -05:00
Ashwin Srinath	a293af1d52	Try capturing the Python path before changing directories	2025-12-03 14:15:34 -05:00
Ashwin Srinath	a7f92b7436	Try an inner and outer script	2025-12-03 13:21:53 -05:00
Ashwin Srinath	9746aa14df	Maybe fix to test script	2025-12-03 12:47:43 -05:00
Ashwin Srinath	d1efef03bc	Fix wheel naming	2025-12-03 11:54:46 -05:00
Ashwin Srinath	618001143b	Fixes to test script	2025-12-03 11:41:36 -05:00
Ashwin Srinath	8443a2059c	Ensure test jobs find wheels correctly	2025-12-03 11:22:19 -05:00
Ashwin Srinath	f3df4104de	Make wheels manylinux compliant	2025-12-03 11:22:12 -05:00
Ashwin Srinath	e15d9ebf58	Lint fixes	2025-12-03 11:07:03 -05:00
Ashwin Srinath	98e0b5994a	Introduce build-and-test-python-wheels workflow	2025-12-03 11:06:11 -05:00
Ashwin Srinath	e9cf53a1a4	Add PR workflow for building and testing wheels	2025-12-03 10:30:27 -05:00
Ashwin Srinath	8b2afa6c16	Lint fixes	2025-12-03 10:17:23 -05:00
Ashwin Srinath	29389b5791	Initial wheel build and publishing infrastructure	2025-12-03 10:15:32 -05:00
Bernhard Manfred Gruber	34f1e2a7ee	Merge pull request #285 from ashermancinelli/patch-1 Update README.md	2025-11-16 00:11:42 +01:00
Asher Mancinelli	e91559edf0	Update README.md	2025-11-14 14:34:18 -08:00
comeyrd	92d2e01cd1	Profile only the kernels involved in the benchmark (#277) Co-authored-by: Allison Piper <alliepiper16@gmail.com>	2025-10-21 13:51:37 -04:00
Allison Piper	9b133a94bc	Remove GLOBAL tags from fmt targets. (#281) Fixes #279.	2025-10-21 11:16:44 -04:00
Allison Piper	e6283df79c	Build native arch by default, update rapids-cmake. (#280) * Build native arch by default, update rapids-cmake. * Add check that CXX and CUDA_HOST compiler match. Similar to CCCL, we need these to match to ensure that our warning flag detection functions properly. * GCC only recognizes `unused-local-typedefs`. Clang recognizes both. Ensure that we set this for both compilers.	2025-10-21 10:41:36 -04:00
Bernhard Manfred Gruber	98d701c054	Diff device sections on mismatch in nvbench_compare.py (#278)	2025-10-15 08:58:08 -04:00
pre-commit-ci[bot]	7feda2cf3a	[pre-commit.ci] pre-commit autoupdate (#276) * [pre-commit.ci] pre-commit autoupdate updates: - [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v5.0.0...v6.0.0) - [github.com/pre-commit/mirrors-clang-format: v20.1.7 → v21.1.2](https://github.com/pre-commit/mirrors-clang-format/compare/v20.1.7...v21.1.2) - [github.com/astral-sh/ruff-pre-commit: v0.12.2 → v0.13.3](https://github.com/astral-sh/ruff-pre-commit/compare/v0.12.2...v0.13.3) * Update matrix + devcontainers. * Fix typo. Co-authored-by: Oleksandr Pavlyk <21087696+oleksandr-pavlyk@users.noreply.github.com> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Allison Piper <alliepiper16@gmail.com> Co-authored-by: Oleksandr Pavlyk <21087696+oleksandr-pavlyk@users.noreply.github.com>	2025-10-07 15:22:36 -04:00
Oleksandr Pavlyk	e7cc1e344c	Add a benchmark example parametrized by typename and integral constant. (#275) * Add a benchmark example parametrized by typename and integral constant. Add a variation of copy_type_sweep kernel, where block size is controlled via integral constant passed as template parameter. * Addressed PR review feedback * Use auto to gridSize * Address PR review change request * Add comment to use ceil_div with CCCL >= 2.8	2025-10-07 13:49:17 -04:00
Oleksandr Pavlyk	b88a45f417	Merge pull request #269 from jayavenkatesh19/main remove pynvjitlink references in examples	2025-09-17 13:54:36 -05:00
Jaya Venkatesh	0f997271f7	added numba-cuda to requirements Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>	2025-09-16 14:54:08 -07:00
Jaya Venkatesh	bfa6a6c7c6	remove pynvjitlink references in examples Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>	2025-09-08 16:00:19 -07:00
Allison Piper	4642df7006	Fix sccache checks when running locally. (#268)	2025-09-05 15:50:09 -04:00

688 Commits