Allison Vacanti ef3e1594eb Implement manual timers.
See the new thrust/sort/basic.cu benchmark for usage.

Other notable changes:

- Updated summary column names:
  - Cold GPU -> GPU Time
  - Cold CPU -> CPU Time
  - Hot GPU  -> Batch GPU
- Removed CPU timings from measure_hot
  - They'd been hidden for a while, and aren't really useful.
- Moved the throughput calcs to measure_cold
  - `timer` will disable `hot` timings, still want throughput
  - `cold` timings make more sense for throughput, global BW numbers
    are meaningless if the data is sitting in L2.
2021-02-17 18:48:26 -05:00
2021-02-17 18:48:26 -05:00
2021-02-16 16:08:38 -05:00
Description
Languages
Cuda 62.6%
C++ 15%
Python 11.8%
Shell 5.4%
CMake 5.2%