composable_kernel/example
Gino Lu eca3cb3e0a sparse_attn: add bm0 dispatch for sparge blockmap compatibility
Add bm0 field to fmha_jenga_fwd_traits so callers can specify the
preferred Q-tile size. Codegen now emits separate tile configs for
bm0=64 (sparge blockmap) and bm0=128 (original), with CppConstraint
guards to select the right kernel at runtime.
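
The dispatch described above can be pictured with a minimal sketch. It is not the generated code: `example_traits`, its `hdim` field, `select_kernel`, and the kernel names are hypothetical stand-ins, and only the `bm0` field and its two values (64 and 128) come from this commit message.

```cpp
#include <string>

// Hypothetical stand-in for fmha_jenga_fwd_traits. Only bm0 is taken from
// the commit message; hdim is an assumed field used for illustration.
struct example_traits {
    int hdim; // head dimension (assumption)
    int bm0;  // preferred Q-tile size: 64 (sparge blockmap) or 128 (original)
};

// Each branch is guarded by a condition on bm0, in the spirit of the
// constraint guards emitted by codegen. Returns the name of the selected
// (hypothetical) kernel instance, or an empty string if none matches.
inline std::string select_kernel(const example_traits& t) {
    if (t.hdim == 128 && t.bm0 == 64) {
        return "fwd_hdim128_bm0_64";
    }
    if (t.hdim == 128 && t.bm0 == 128) {
        return "fwd_hdim128_bm0_128";
    }
    return ""; // no kernel instance matches these traits
}
```

A caller that works with a sparge blockmap would set `bm0 = 64` and receive the 64-row tile variant; any unsupported combination falls through to the empty result instead of silently picking a mismatched tile size.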

End-to-end test passes for both jenga and vsa paths. Performance is
known to be suboptimal at this stage; tile sizes and warp counts for
the bm0=64 path have not been tuned.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 05:13:51 -04:00