composable_kernel/example
Ding, Yi ce838e19e5 [CK_TILE] FMHA BWD launcher: address PR #7331 review comments (round 2)
- prepare_workspace_async: allocate pinned host staging before enqueuing
  the dq_acc memset. If pinned_host_alloc throws, no stream work has
  been issued yet, so the workspace is left cleanly un-prepared rather
  than half-initialized.
- pack_workspace_host catch: note that, if the catch fires, the H2D copy
  queued after the callback will copy indeterminate metadata and the
  kernel will produce wrong results. This is unlikely, since pack only
  throws on precondition violations.
- schedule_pin_staging_release: std::move pin_staging_ into the heap
  shared_ptr instead of copying it. The next line in
  prepare_workspace_async overwrites pin_staging_ anyway, so a copy
  would only add a wasted atomic inc/dec.
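
A minimal, host-only sketch of two of the points above: doing the fallible allocation before any stream work is enqueued, and moving (not copying) the member shared_ptr into a heap-held shared_ptr. It is not the CK_TILE launcher code. FakeStream, fake_pinned_alloc, and Launcher are hypothetical stand-ins, and only the names prepare_workspace_async, schedule_pin_staging_release, and pin_staging_ are borrowed from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GPU stream: it only records what was enqueued.
struct FakeStream {
    std::vector<const char*> ops;
    void enqueue(const char* op) { ops.push_back(op); }
};

using Staging = std::shared_ptr<std::vector<char>>;

// Hypothetical stand-in for a pinned host allocation that can throw.
inline Staging fake_pinned_alloc(std::size_t bytes, bool fail) {
    if (fail) throw std::runtime_error("pinned allocation failed");
    return std::make_shared<std::vector<char>>(bytes);
}

struct Launcher {
    FakeStream stream;
    Staging pin_staging_;

    // The fallible step runs first. If it throws, nothing has been enqueued,
    // so the stream and pin_staging_ are left exactly as they were.
    void prepare_workspace_async(std::size_t bytes, bool fail_alloc) {
        Staging staging = fake_pinned_alloc(bytes, fail_alloc);
        stream.enqueue("memset dq_acc");
        stream.enqueue("host callback: pack workspace");
        stream.enqueue("H2D copy");
        pin_staging_ = std::move(staging);
    }

    // Moving transfers ownership without touching the atomic reference count
    // and leaves pin_staging_ empty. A copy would increment the count here
    // and decrement it again when pin_staging_ is later overwritten.
    std::unique_ptr<Staging> schedule_pin_staging_release() {
        return std::make_unique<Staging>(std::move(pin_staging_));
    }
};
```

In this sketch, calling prepare_workspace_async(64, true) throws and leaves stream.ops empty, while a successful call enqueues three operations. After schedule_pin_staging_release, the returned holder is the sole owner of the buffer and pin_staging_ is null.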
2026-05-13 02:52:49 -04:00