[CK_TILE] FMHA FWD bug fix (#2888)

* tempsave debug

* fix the bug in fmha fwd_kernel

* Remove unnecessary changes

* Fix the buggy part

* remove fmha fwd known failure cases
Haocong WANG
2025-09-23 15:00:46 +08:00
committed by GitHub
parent ad259eeae2
commit b6e8994386
3 changed files with 24 additions and 20 deletions

@@ -37,6 +37,7 @@ struct BlockFmhaPipelineQRKSVSAsyncTrload
using VLayout = remove_cvref_t<typename BlockFmhaShape::VLayout>;
static constexpr bool kQLoadOnce = true; // if q_tile load whole block length (hdim) at once
static_assert(kQLoadOnce == Policy::QLoadOnce);
static constexpr bool kKLoadOnce = BlockFmhaShape::kM0 >= 64;
static constexpr index_t kBlockSize = Problem::kBlockSize;