composable_kernel/include
Ding, Yi 3607588ca4 FMHA BWD workspace: 4K-align dq_acc base
Mark the FmhaBwdWorkspaceManager size/offset accessors as CK_TILE_HOST
(they are only invoked from host-side workspace setup), and pad
GetWorkspaceHostSize up to a 4K boundary so the GPU dq_acc buffer always
starts on a page-aligned offset.
2026-04-22 02:08:34 -05:00
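The padding described in the commit message can be illustrated with a small sketch. This is not the code from the commit: `kPageSize`, `AlignUp`, `WorkspaceLayout`, and `MakeLayout` are hypothetical names, and the layout (a host-side region followed immediately by the `dq_acc` buffer in one allocation) is an assumption inferred from the message.

```cpp
#include <cstddef>

// Hypothetical constant: the commit message only says "4K".
constexpr std::size_t kPageSize = 4096;

// Round `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two for the mask trick to be valid.
constexpr std::size_t AlignUp(std::size_t size, std::size_t alignment = kPageSize)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical layout: a host-side region at offset 0, followed by dq_acc.
struct WorkspaceLayout
{
    std::size_t host_size;     // host region size after padding
    std::size_t dq_acc_offset; // byte offset at which dq_acc begins
};

// Pad the raw host size so that dq_acc starts on a 4K boundary.
constexpr WorkspaceLayout MakeLayout(std::size_t raw_host_size)
{
    const std::size_t padded = AlignUp(raw_host_size);
    return WorkspaceLayout{padded, padded};
}
```

Under these assumptions, the `dq_acc` offset is always a multiple of 4096 regardless of the unpadded host size. The resulting address is page-aligned only if the base of the allocation is itself page-aligned, which the sketch does not check.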