composable_kernel

mirror of https://github.com/ROCm/composable_kernel.git synced 2026-04-19 22:39:03 +00:00

Files

Jeff Huang 7c6430eca0 [CK_TILE] fmha: Add query padding support to backward pass (#3097 )

* [CK_TILE] fmha: Add query padding support to backward pass

Introduces support for query sequence padding (q_padding) in the FMHA backward pass kernels.
- Passing `seqlen_q_ptr` to the backward kernels to distinguish logical from physical sequence lengths.
- Updating `OGradDotO`, `ConvertQGrad`, and `DQDKDV` kernels to respect logical lengths and handle zero-length sequences.
- Aligning LSE indexing in the forward kernel with the padded layout for consistency.
- Adding a new GTest suite (`test_fmha_bwd_kernel_padding.cpp`) with comprehensive tests for various padding scenarios, including zero-length
sequences and deterministic mode.

* fix clang format

* Adapt fmha_bwd_runner.cpp to new q, kv sequence padding
Add backward q/kv sequence padding unit tests.

* [CK_TILE] fmha: Unify sequence length and padding handling

Refactor the handling of sequence lengths and padding in the
FMHA forward and backward kernels to provide a more unified and flexible
interface.

- Replaced `seqstart_padded_*_ptr` with a more robust system that uses
`seqstart_*_ptr` for physical sequence lengths and introduces
`seqlen_*_ptr` and `cu_seqlen_*_ptr` for logical (unpadded) lengths.
- Established a clear order of precedence for determining sequence
length: cumulative lengths (`cu_seqlen_*_ptr`) take priority,
followed by per-sequence lengths (`seqlen_*_ptr`), and finally
physical lengths derived from `seqstart_*_ptr`.
- Clarified the distinction between "group mode" and "batch mode" and
how sequence lengths are handled in each case.
- Renamed `cu_seqlen_kv_ptr` to `cu_seqlen_k_ptr` for consistency.
- Updated comments and documentation to reflect the new argument
structure and usage.

---------

Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>

2025-10-29 13:56:11 +08:00

block

[CK_TILE] Support f32 in FMHA (fwd and bwd) (#2836 )

2025-09-27 18:03:48 +05:00

kernel

[CK_TILE] fmha: Add query padding support to backward pass (#3097 )

2025-10-29 13:56:11 +08:00

pipeline

[CKTILE] FMHA fwd trload lse fix (#3046 )

2025-10-23 09:33:33 +08:00