64 lanes of one wave emit independent byte addresses; the memory unit collapses them into the
minimum number of 64 B HBM cache lines. Each column below is one cache line; each row is one lane.
A coloured cell means "lane L's request fell inside cache line C".
Number of non-empty columns = number of HBM transactions.
Legend: useful byte (lane wanted this part of the line) | fetched but unused (same line, different lane's remainder) | empty
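The counting rule described above can be sketched in a few lines of Python. This is a simplified model, not a description of any particular memory unit: the function name `count_transactions` and the default of 4 bytes per lane access are assumptions made for illustration, while the 64-lane wave and 64 B line size come from the text.

```python
LANES = 64        # lanes per wave (from the text)
LINE_BYTES = 64   # cache-line size in bytes (from the text)


def count_transactions(addresses, access_bytes=4, line_bytes=LINE_BYTES):
    """Count the distinct cache lines touched by one wave's requests.

    `addresses` holds one starting byte address per lane.
    `access_bytes` (assumed: 4) is the width of each lane's access.
    Each distinct line corresponds to one non-empty column in the figure.
    """
    touched = set()
    for addr in addresses:
        first = addr // line_bytes
        last = (addr + access_bytes - 1) // line_bytes
        # An access that straddles a line boundary touches both lines.
        touched.update(range(first, last + 1))
    return len(touched)


# Contiguous 4-byte accesses: 64 lanes * 4 B = 256 B = 4 full lines.
contiguous = [lane * 4 for lane in range(LANES)]

# Same pattern shifted by 32 B: the 256 B window now spans 5 lines.
misaligned = [32 + lane * 4 for lane in range(LANES)]

# 64 B stride: every lane lands on its own line.
strided = [lane * LINE_BYTES for lane in range(LANES)]

# All lanes read the same address: a single line serves the whole wave.
broadcast = [0] * LANES

for name, pattern in [("contiguous", contiguous), ("misaligned", misaligned),
                      ("strided", strided), ("broadcast", broadcast)]:
    print(name, count_transactions(pattern))
```

Under these assumptions the strided pattern fetches 64 lines (4096 B) to deliver 256 useful bytes, which is the "fetched but unused" share of the figure at its worst.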