CUTLASS example added, license headers added, fixes

- Added a license header to each example file.
- Fixed broken runs caused by type declarations.
- Fixed a hang in throughput.py when --run-once is passed by doing a
  manual warm-up step, as in auto_throughput.py.
Oleksandr Pavlyk
2025-07-24 09:33:13 -05:00
parent c136efab65
commit a69a3647b2
10 changed files with 226 additions and 3 deletions

@@ -0,0 +1,7 @@
numpy
numba
cupy
nvidia-cutlass
cuda-cccl
cuda-core
cuda-bindings