mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 10:09:41 +00:00
Mark the FmhaBwdWorkspaceManager size/offset accessors as CK_TILE_HOST, since they are only invoked from host-side workspace setup. Also round GetWorkspaceHostSize up to a 4 KiB boundary so that the GPU dq_acc buffer always starts at a page-aligned offset.
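
For illustration only, the padding could be sketched as below. This is not the code from the change itself: `kWorkspaceAlignment`, `AlignUp`, `PaddedHostSize` and `DeviceRegionOffset` are hypothetical names, and the sketch assumes the device-side region (dq_acc) is placed directly after the host-sized portion of the workspace.

```cpp
#include <cassert>
#include <cstddef>

// Assumed alignment: the commit message only says "4K", taken here as 4096 bytes.
constexpr std::size_t kWorkspaceAlignment = 4096;

// Round `bytes` up to the next multiple of `alignment`.
// `alignment` must be a power of two for the mask trick to be valid.
constexpr std::size_t AlignUp(std::size_t bytes, std::size_t alignment)
{
    return (bytes + alignment - 1) & ~(alignment - 1);
}

// Hypothetical stand-in for a host-size accessor: the raw host-side size,
// padded so that whatever follows it starts on a 4 KiB offset.
constexpr std::size_t PaddedHostSize(std::size_t raw_host_bytes)
{
    return AlignUp(raw_host_bytes, kWorkspaceAlignment);
}

// Hypothetical offset of the device region, assuming it directly follows
// the padded host portion of the workspace.
inline std::size_t DeviceRegionOffset(std::size_t raw_host_bytes)
{
    const std::size_t offset = PaddedHostSize(raw_host_bytes);
    assert(offset % kWorkspaceAlignment == 0); // holds by construction
    return offset;
}

// Compile-time checks of the rounding behaviour.
static_assert(PaddedHostSize(0) == 0, "zero stays zero");
static_assert(PaddedHostSize(1) == 4096, "small sizes round up to one page");
static_assert(PaddedHostSize(4096) == 4096, "aligned sizes are unchanged");
static_assert(PaddedHostSize(4097) == 8192, "one byte over rounds to two pages");
```

Because only the host size is rounded up, any offset derived from it inherits the alignment without each consumer having to re-align on its own.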