cutlass/include
Blake Ledden 7a9fe055cb fix: Add missing kElementsPerAccess division in RegularTileIterator store (#3049)
The store(frag, tile_offset) method was computing the pointer offset
without dividing by kElementsPerAccess, while the matching load(frag,
tile_offset) method does include this division. Both load_with_pointer_offset
and store_with_pointer_offset apply the same byte conversion, so the
tile_offset -> pointer_offset calculation must also match.

When kElementsPerAccess > 1, this caused load and store to reference
different memory locations for the same logical tile offset.
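
The mismatch can be illustrated with a minimal standalone sketch. It is not the CUTLASS source: the constants, the function names, and the exact form of the offset expression (division applied to the contiguous term only) are assumptions made for illustration. Only the presence or absence of the kElementsPerAccess division is taken from the description above.

```cpp
// Hypothetical stand-ins for the iterator's compile-time parameters.
constexpr long kContiguous = 64;        // tile extent, contiguous dimension (assumed value)
constexpr long kStrided = 8;            // tile extent, strided dimension (assumed value)
constexpr long kElementsPerAccess = 4;  // elements per vectorized access (assumed value)

// Offset in the style of load(frag, tile_offset): the contiguous term is
// divided by kElementsPerAccess (assumed form of the expression).
constexpr long load_pointer_offset(long tile_c, long tile_s, long stride) {
  return tile_c * kContiguous / kElementsPerAccess + tile_s * kStrided * stride;
}

// Offset in the style of store(frag, tile_offset) before the fix: no division.
constexpr long store_pointer_offset_before_fix(long tile_c, long tile_s, long stride) {
  return tile_c * kContiguous + tile_s * kStrided * stride;
}

// After the fix, store uses the same expression as load.
constexpr long store_pointer_offset_after_fix(long tile_c, long tile_s, long stride) {
  return load_pointer_offset(tile_c, tile_s, stride);
}

// Same logical tile offset (1, 0), different pointer offsets before the fix.
static_assert(load_pointer_offset(1, 0, 128) == 16, "load divides by kElementsPerAccess");
static_assert(store_pointer_offset_before_fix(1, 0, 128) == 64, "store did not divide");
static_assert(store_pointer_offset_after_fix(1, 0, 128) == 16, "store now matches load");
```

With these assumed values, tile offset (1, 0) maps to 16 on the load path but 64 on the unfixed store path. If kElementsPerAccess were 1, the two expressions would agree, so the discrepancy only appears for vectorized accesses.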

Fixes #3017

Signed-off-by: Blake Ledden <bledden@users.noreply.github.com>
2026-04-24 23:27:40 -04:00