Mirror of https://github.com/ROCm/composable_kernel.git (synced 2026-05-14 02:02:46 +00:00)
The kSkipMinSeqlenQ optimization incorrectly used a <= comparison, causing the kernel to skip batches where seqlen_q equals min_seqlen_q. This happens in the common case of no padding, where every batch has the same seqlen_q == min_seqlen_q, and it silently produces all-zero output.

Changed the comparison to strict < so that batches with exactly min_seqlen_q tokens are still processed.

Made-with: Cursor
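
The difference between the two comparisons can be illustrated with a minimal sketch. The helper names below are hypothetical, and the sketch assumes the skip check is a plain comparison of seqlen_q against min_seqlen_q. The actual kernel code is not reproduced here and may be structured differently.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the kernel's real code.

// Old behavior: a batch whose length equals the minimum is also skipped.
constexpr bool should_skip_batch_buggy(std::int32_t seqlen_q, std::int32_t min_seqlen_q)
{
    return seqlen_q <= min_seqlen_q;
}

// New behavior: only batches strictly shorter than the minimum are skipped.
constexpr bool should_skip_batch_fixed(std::int32_t seqlen_q, std::int32_t min_seqlen_q)
{
    return seqlen_q < min_seqlen_q;
}
```

With no padding, every batch has seqlen_q == min_seqlen_q, so the old predicate returns true for all of them and nothing is computed, while the new predicate returns false and each batch is processed.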