[CK_TILE] Add graph capture support for FMHA backward(new
branch) (#8262)
## Motivation
Add HIP graph capture support for FMHA backward operations. The original
implementation only supported normal execution mode and would cause
use-after-free crashes when used with graph capture replay.
When FMHA backward is captured into a HIP graph:
- First replay: host callback executes and deletes the closure (as
designed for normal mode)
- Subsequent replays: use-after-free crash because the closure was
already freed
This PR enables `fmha_bwd_launcher::prepare_workspace_async()` to work
correctly in both normal execution and graph capture modes.