mirror of
https://github.com/NVIDIA/cutlass.git
synced 2026-04-20 06:48:59 +00:00
Update README.md
@@ -143,7 +143,7 @@ To get started quickly - please refer :
- Fix a few bugs in distributed gemm API and examples.
- Fix handling negative zero in sparse compressor.
- Add missing `wait_on_dependent_grids` for PDL use case.
- Work around a driver bug that occasionally causes errors when executing kernels.
- Work around a driver bug related to TMA descriptors that occasionally causes errors on Blackwell when the tensor's backing memory allocation is smaller than 128 KB and the tensor is not a dense non-overlapping tensor.
* Fix some profiler issues:
- Add some missing reference kernels.
- Support VoidC reference kernels.