mirror of
https://github.com/NVIDIA/cutlass.git
synced 2026-05-11 08:50:09 +00:00
cutlass 3.9 update (#2255)
* cutlass 3.9 update
* rebase
* fixes out of shared memory for blockwise Blackwell
* doc format
* fix issue 2253
* disable host ref by default
* fix sm120 smem capacity

---------

Co-authored-by: yuzhai <yuzhai@nvidia.com>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
@@ -34,7 +34,7 @@
 addressable memory, and then store it back into addressable memory.
 
 TileIterator is a core concept in CUTLASS that enables efficient loading and storing of data to
-and from addressable memory. The PredicateTileIterator accepts a ThreadMap type, which defines
+and from addressable memory. The PredicatedTileIterator accepts a ThreadMap type, which defines
 the mapping of threads to a "tile" in memory. This separation of concerns enables user-defined
 thread mappings to be specified.
 
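The separation of concerns described in the hunk above can be illustrated with a small host-side sketch. This is not CUTLASS code: `StripedThreadMap` and `SketchPredicatedTileIterator` are hypothetical names invented for illustration, and the sketch assumes a row-major `float` matrix and a simple striped assignment of tile elements to threads. It only shows the idea that the iterator takes the thread-to-tile mapping as a template parameter and applies bounds predicates itself.

```cpp
#include <array>
#include <cassert>

// Hypothetical thread map (not a CUTLASS type): distributes the elements of a
// Rows x Columns tile across Threads threads in a striped pattern.
template <int Rows, int Columns, int Threads>
struct StripedThreadMap {
  static constexpr int kRows = Rows;
  static constexpr int kColumns = Columns;
  static constexpr int kThreads = Threads;
  static constexpr int kElements = Rows * Columns;
  // Number of accesses each thread performs to cover the whole tile.
  static constexpr int kIterations = (kElements + Threads - 1) / Threads;

  // Linear offset within the tile touched by `thread_id` on access `iteration`.
  static constexpr int offset(int thread_id, int iteration) {
    return iteration * Threads + thread_id;
  }
};

// Hypothetical iterator (not the CUTLASS PredicatedTileIterator): loads one
// thread's share of a tile from a row-major matrix. Accesses that fall outside
// the tile or outside the matrix extent are predicated off and yield 0.
template <typename ThreadMap>
class SketchPredicatedTileIterator {
 public:
  using Fragment = std::array<float, ThreadMap::kIterations>;

  SketchPredicatedTileIterator(const float* ptr, int rows, int cols, int ld,
                               int thread_id)
      : ptr_(ptr), rows_(rows), cols_(cols), ld_(ld), thread_id_(thread_id) {
    assert(thread_id >= 0 && thread_id < ThreadMap::kThreads);
  }

  Fragment load() const {
    Fragment fragment{};
    for (int i = 0; i < ThreadMap::kIterations; ++i) {
      const int linear = ThreadMap::offset(thread_id_, i);
      const int r = linear / ThreadMap::kColumns;
      const int c = linear % ThreadMap::kColumns;
      // Predicate: the access must lie inside both the tile and the matrix.
      const bool guard =
          linear < ThreadMap::kElements && r < rows_ && c < cols_;
      fragment[i] = guard ? ptr_[r * ld_ + c] : 0.0f;
    }
    return fragment;
  }

 private:
  const float* ptr_;
  int rows_;
  int cols_;
  int ld_;
  int thread_id_;
};
```

Swapping `StripedThreadMap` for another type that exposes the same constants and `offset` function changes which elements each thread touches without modifying the iterator, which is the kind of user-defined mapping the text refers to.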