Files
composable_kernel/example/ck_tile
joyeamd b1afc03c6a fmha hdim256 vectorize improve (#2086)
For hdim 256, vectorized buffer loads are not used when seqlen % 256 != 0 and hdim % 256 == 0; this commit tries to fix that case.

[ROCm/composable_kernel commit: 94d47b1680]
2025-04-16 09:21:04 +08:00