Instead of:
```
state.set_element_count(size);
state.set_global_memory_bytes_accessed(
  size * (sizeof(InT) + sizeof(OutT)));
```
do:
```
state.add_element_count(size, "Elements");
state.add_global_memory_read<InT>(size, "InputSize");
state.add_global_memory_write<OutT>(size, "OutputSize");
```
The string arguments are optional. If one is provided, a new column
with that name is added to the output, reporting the number of bytes
(or the number of elements, for `add_element_count`).
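To illustrate why the two forms describe the same traffic, here is a minimal sketch using a hypothetical `mock_state` stand-in. It is not the library's actual class: the members `element_count`, `global_memory_bytes`, and `columns`, the helper `describe_transfer`, and the accumulate-per-call behavior are all assumptions made for illustration. Each typed call is assumed to add `count * sizeof(T)` bytes to a running total, and a column is assumed to be recorded only when a non-empty name is passed.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the benchmark state; NOT the library's real class.
struct mock_state
{
  std::size_t element_count{0};
  std::size_t global_memory_bytes{0};
  // (column name, value) pairs for the optional extra output columns.
  std::vector<std::pair<std::string, std::size_t>> columns;

  void add_element_count(std::size_t count, std::string column_name = {})
  {
    element_count += count;
    add_column(std::move(column_name), count);
  }

  template <typename T>
  void add_global_memory_read(std::size_t count, std::string column_name = {})
  {
    add_bytes(count * sizeof(T), std::move(column_name));
  }

  template <typename T>
  void add_global_memory_write(std::size_t count, std::string column_name = {})
  {
    add_bytes(count * sizeof(T), std::move(column_name));
  }

private:
  // Assumed behavior: reads and writes accumulate into one byte total.
  void add_bytes(std::size_t bytes, std::string column_name)
  {
    global_memory_bytes += bytes;
    add_column(std::move(column_name), bytes);
  }

  // Assumed behavior: an empty name means no extra column is reported.
  void add_column(std::string name, std::size_t value)
  {
    if (!name.empty())
    {
      columns.emplace_back(std::move(name), value);
    }
  }
};

// Hypothetical helper: records a kernel that reads `size` InT values
// and writes `size` OutT values.
template <typename InT, typename OutT>
mock_state describe_transfer(std::size_t size)
{
  mock_state state;
  state.add_element_count(size, "Elements");
  state.add_global_memory_read<InT>(size, "InputSize");
  state.add_global_memory_write<OutT>(size, "OutputSize");
  return state;
}
```

Under these assumptions, the accumulated byte total equals the `size * (sizeof(InT) + sizeof(OutT))` expression from the old form, while the typed calls avoid spelling out the `sizeof` arithmetic by hand.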