nvbench

mirror of https://github.com/NVIDIA/nvbench.git synced 2026-03-14 20:27:24 +00:00

Author	SHA1	Message	Date
Allison Vacanti	47c69b83c9	More README cleanup.	2021-03-03 17:29:15 -05:00
Allison Vacanti	d7c34c835d	More README cleanup.	2021-03-03 17:25:12 -05:00
Allison Vacanti	21e13f002d	Fix error in README.	2021-03-03 17:22:35 -05:00
Allison Vacanti	544deaf539	Fix warning.	2021-03-03 16:57:17 -05:00
Allison Vacanti	75439d3ef8	Fix device global arg storage.	2021-03-03 16:49:43 -05:00
Allison Vacanti	9ff22cb12d	Update README to use current macro names.	2021-03-03 16:30:45 -05:00
Allison Vacanti	a6b26ef7be	Add initial README.md.	2021-03-03 16:00:11 -05:00
Allison Vacanti	cf71f6ee15	Update NVBench build system with initial standalone support.	2021-03-03 13:59:29 -05:00
Allison Vacanti	2ce12d2201	More printer cleanups. - Initialize color from env_var. - Change ivar to m_global_benchmark_args to clarify usage.	2021-03-02 17:36:19 -05:00
Allison Vacanti	3f3c648358	Update option_parser for recent refactorings.	2021-03-02 17:13:52 -05:00
Allison Vacanti	c83bf8cdb8	s/output_multiplex/printer_multiplex/g	2021-03-02 17:12:22 -05:00
Allison Vacanti	9d2194f2ab	s/csv_format/csv_printer/g	2021-03-02 17:10:09 -05:00
Allison Vacanti	015b8d1fb1	s/markdown_format/markdown_printer/g	2021-03-02 17:08:54 -05:00
Allison Vacanti	780dc3b649	s/output_format/printer_base/g	2021-03-02 17:06:13 -05:00
Allison Vacanti	4a1e670f50	Use a more robust method to add the stdout printer, add `--quiet`.	2021-03-02 16:57:03 -05:00
Allison Vacanti	9cd0d10fe1	Route log messages through output formats.	2021-03-02 16:36:28 -05:00
Allison Vacanti	52fbbbcc7a	Add --markdown / --csv options to option_parser.	2021-03-01 17:13:11 -05:00
Allison Vacanti	630aefda93	Make `output_format` explicitly move-only.	2021-03-01 17:10:33 -05:00
Allison Vacanti	0112993de2	Add `output_multiplex::get_output_count`.	2021-03-01 17:10:13 -05:00
Allison Vacanti	6c28a6a791	Refactor to keep stream includes out of headers.	2021-03-01 17:09:56 -05:00
Allison Vacanti	33a069af2b	Add output_multiplex. This allows an arbitrary number of output_formats to be wrapped up into a single object. This output format just forwards all calls to its children.	2021-03-01 15:16:43 -05:00
Allison Vacanti	14d41bb7e1	Add initial implementation of csv_format.	2021-02-25 16:10:28 -05:00
Allison Vacanti	17db5d31cc	Split table_builder out from markdown_table.	2021-02-25 16:09:06 -05:00
Allison Vacanti	359db2c592	Initial pass at output_format.cuh, ported markdown_format.	2021-02-23 16:28:30 -05:00
Allison Vacanti	8d6d934dfe	Add default axis names. Also cleaned up the annoying quirk where `set_type_axes_names` had to be called on all benchmarks with type axes. Default names are {"T", "U", "V", "W"} for up to four type axes. For five or more, {"T0", "T1", ...} is used instead.	2021-02-19 12:37:05 -05:00
Allison Vacanti	324b0d107e	Add "global args" to option parser. If a benchmark modifier is passed before `--benchmark`, the modifier will apply to all benchmarks.	2021-02-19 10:37:00 -05:00
Allison Vacanti	a747982415	Add `nvbench::main` CMake target. Linking to this instead of `nvbench::nvbench` will automatically include the `NVBENCH_MAIN` macro.	2021-02-19 09:34:02 -05:00
Allison Vacanti	2cc9bf41e3	Add demangle<T>() convenience overload.	2021-02-18 23:41:49 -05:00
Allison Vacanti	543488ef75	Make kernel wrapper into an lvalue.	2021-02-18 23:41:23 -05:00
Allison Vacanti	b5443e98c8	Use `std::size_t` for element counts, buffer size metadata.	2021-02-18 18:23:14 -05:00
Allison Vacanti	7dd46b0021	Update old benchmarks to use nvbench, remove old scaffolding. Remove the original attempt to adapt gbench to do CUDA stuff. Update all benchmarks to use some conventions: - Element count -> "Elements" [16:32] - Throughput calcs - Add input buffer column: "Size"	2021-02-18 18:22:50 -05:00
Allison Vacanti	7657036f9c	Add helper methods to configure throughput. Instead of: ``` state.set_element_count(size); state.set_global_memory_bytes_accessed( size * (sizeof(InT) + sizeof(OutT))); ``` do: ``` state.add_element_count(size, "Elements"); state.add_global_memory_read<InT>(size, "InputSize"); state.add_global_memory_write<OutT>(size, "OutputSize"); ``` The string arguments are optional. If provided, a new column will be added to the output with the indicated name and number of bytes (or elements for `add_element_count`).	2021-02-18 15:47:59 -05:00
Allison Vacanti	dcd5d1ffa6	Update markdown output format.	2021-02-18 14:44:17 -05:00
Allison Vacanti	ef3e1594eb	Implement manual timers. See the new thrust/sort/basic.cu benchmark for usage. Other notable changes: - Updated summary column names: - Cold GPU -> GPU Time - Cold CPU -> CPU Time - Hot GPU -> Batch GPU - Removed CPU timings from measure_hot - They'd been hidden for a while, and aren't really useful. - Moved the throughput calcs to measure_cold - `timer` will disable `hot` timings, still want throughput - `cold` timings make more sense for throughput, global BW numbers are meaningless if the data is sitting in L2.	2021-02-17 18:48:26 -05:00
Allison Vacanti	385d4f77ba	Teach markdown_format about sample_sizes.	2021-02-17 18:34:35 -05:00
Allison Vacanti	8a1f017a4e	Inline some methods used in benchmark loops.	2021-02-17 18:34:09 -05:00
Allison Vacanti	f61be70a93	Add initial implementation of exec_tag dispatching. nvbench::exec_tags are used to request measurement types and share information about the kernel. They are used to ensure that templated measurement code is not instantiated unless actually used. Replaces the nvbench::exec(state, launcher, tags) pattern with: state.exec(tags, launcher); state.exec(launcher); // defaults to hot/cold cuda measurements	2021-02-16 23:47:36 -05:00
Allison Vacanti	37e753f7b6	Update benchmark macros: s/NVBENCH_CREATE/NVBENCH_BENCH/g s/NVBENCH_BENCH_TEMPLATE/NVBENCH_BENCH_TYPES/g This will fit nicer once the exec_tags version are added: NVBENCH_BENCH NVBENCH_BENCH_TYPES NVBENCH_BENCH_FLAGS NVBENCH_BENCH_TYPES_FLAGS	2021-02-16 16:08:38 -05:00
Allison Vacanti	d12326083d	Clean up l2flush initialization.	2021-02-16 12:01:50 -05:00
Allison Vacanti	f46dda0e81	Use noexcept CUDA_CALL check in destructor.	2021-02-16 12:00:04 -05:00
Allison Vacanti	55aa78ce17	Make the use of the blocking_kernel optional. This breaks thrust algorithms, which sync internally. I'll need to add an exec_tag to toggle this.	2021-02-15 21:55:26 -05:00
Allison Vacanti	bb871094c3	Fixes for multidevice/gcc. - Allow devices to be cleared during benchmark definition. - Fix various demangling bugs.	2021-02-15 21:26:21 -05:00
Allison Vacanti	8897490a6d	Add cxxabi demangling for gcc/clang.	2021-02-15 21:00:09 -05:00
Allison Vacanti	6c67578dcd	Implement skip_time and improve logging.	2021-02-15 17:39:46 -05:00
Allison Vacanti	ead8392bce	Use NVBENCH_THROW in option_parser.cu.	2021-02-15 17:19:07 -05:00
Allison Vacanti	6cf29b5083	Various small updates and refactorings. - collapse nested namespace specifiers. - Clean up markdown format tables.	2021-02-15 17:18:03 -05:00
Allison Vacanti	d323f569b8	Add termination criteria API. - min_samples - min_time - max_noise - skip_time (not yet implemented) - timeout Refactored s/(trials)\|(iters)/samples/s.	2021-02-15 12:04:15 -05:00
Allison Vacanti	e5914ff620	Clean up blocking_kernel. - Rename release() -> unblock() to avoid confusion with release fences. - Remove some unused headers.	2021-02-14 16:07:22 -05:00
Allison Vacanti	1cea5e1965	Add and use blocking_kernel.	2021-02-13 11:21:30 -05:00
Allison Vacanti	2125ada770	Call cudaDeviceReset from NVBENCH_MAIN.	2021-02-13 10:01:09 -05:00

143 Commits