composable_kernel/example
Gino Lu 840b8a37d9 test(sparse_attn): CPU-ref cross-check + BLKQ cite
Wire SpargeAttn CPU reference into test_sparge: build the block_map on host via
sparge::build_block_map_meansim and cross-check against the GPU-produced map;
self-check the VSA delta-LUT (valid count + reachable kb indices); split PASS/FAIL
into separate block_map / LUT / attention-output lines for clearer diagnosis.

Set sparge_tool::SpargeParams::BLKQ default to 64 to match SpargeAttn SM90
convention (cite upstream qk_int_sv_f8_cuda_sm90.cu:143-144); tighten bf16
tolerance back to the dense FMHA baseline (4e-2 atol, 1e-2 rtol).

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
2026-05-17 02:35:51 -04:00