composable_kernel/include/ck_tile/ops
msaffari-amd 403d99124d [AITERKER-112] Add PER_TOKEN_HEAD FP8 quant scheme to batch_prefill
- New BlockAttentionQuantScaleEnum::PER_TOKEN_HEAD enum value
- Pipeline overload in block_fmha_batch_prefill_pipeline_qr_ks_vs_async
  that applies the per-token Q/K descale as an outer product after GEMM0
  and the per-head V descale in the epilogue
- fmha_batch_prefill_kernel kargs + MakeKargs + pipeline dispatch
- fmha_fwd.hpp host-side traits/args wiring
- quant.hpp trait specialization
- Codegen emits PER_TOKEN_HEAD kernel variants
2026-05-19 15:41:32 +00:00
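
The descale placement described in the commit message can be illustrated with a small host-side sketch. This is not the ck_tile pipeline or its API: all function names below (attention_fused_descale, attention_dequant_first, and the helpers) are hypothetical, plain float values stand in for FP8 data, a single head is assumed, and the softmax scale and masking are omitted. It only shows why multiplying the GEMM0 result by the outer product of per-token Q and K scales, and multiplying the output by one per-head V scale at the end, should match dequantizing Q, K, and V up front.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// C = A * B^T, where A is [m][d] and B is [n][d].
Matrix matmul_nt(const Matrix& a, const Matrix& b) {
    Matrix c(a.size(), std::vector<float>(b.size(), 0.f));
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b.size(); ++j)
            for (std::size_t d = 0; d < a[i].size(); ++d)
                c[i][j] += a[i][d] * b[j][d];
    return c;
}

// C = A * B, where A is [m][n] and B is [n][d] (B must be non-empty).
Matrix matmul_nn(const Matrix& a, const Matrix& b) {
    Matrix c(a.size(), std::vector<float>(b[0].size(), 0.f));
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b.size(); ++j)
            for (std::size_t d = 0; d < b[j].size(); ++d)
                c[i][d] += a[i][j] * b[j][d];
    return c;
}

// Numerically stable row-wise softmax, in place (rows must be non-empty).
void softmax_rows(Matrix& s) {
    for (auto& row : s) {
        const float m = *std::max_element(row.begin(), row.end());
        float sum = 0.f;
        for (float& x : row) { x = std::exp(x - m); sum += x; }
        for (float& x : row) x /= sum;
    }
}

// Hypothetical fused path: descale after GEMM0 and again at the epilogue.
Matrix attention_fused_descale(const Matrix& q, const Matrix& k, const Matrix& v,
                               const std::vector<float>& q_scale,  // one per Q token
                               const std::vector<float>& k_scale,  // one per K token
                               float v_scale) {                    // one per head
    assert(q.size() == q_scale.size());
    assert(k.size() == k_scale.size() && k.size() == v.size());
    Matrix s = matmul_nt(q, k);  // GEMM0 on the still-quantized values
    for (std::size_t i = 0; i < s.size(); ++i)
        for (std::size_t j = 0; j < s[i].size(); ++j)
            s[i][j] *= q_scale[i] * k_scale[j];  // outer product of scales
    softmax_rows(s);
    Matrix o = matmul_nn(s, v);  // GEMM1 on the still-quantized V
    for (auto& row : o)
        for (float& x : row) x *= v_scale;  // per-head V descale at the end
    return o;
}

// Reference path: dequantize Q, K, and V first, then run plain attention.
Matrix attention_dequant_first(Matrix q, Matrix k, Matrix v,
                               const std::vector<float>& q_scale,
                               const std::vector<float>& k_scale,
                               float v_scale) {
    for (std::size_t i = 0; i < q.size(); ++i)
        for (float& x : q[i]) x *= q_scale[i];
    for (std::size_t j = 0; j < k.size(); ++j)
        for (float& x : k[j]) x *= k_scale[j];
    for (auto& row : v)
        for (float& x : row) x *= v_scale;
    Matrix s = matmul_nt(q, k);
    softmax_rows(s);
    return matmul_nn(s, v);
}

// Largest element-wise absolute difference between two same-shaped matrices.
float max_abs_diff(const Matrix& a, const Matrix& b) {
    float worst = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < a[i].size(); ++j)
            worst = std::max(worst, std::fabs(a[i][j] - b[i][j]));
    return worst;
}
```

The two paths agree because each Q/K scale is constant along the reduction dimension of GEMM0 and the V scale is constant across the whole head, so all three factor out of their sums. In this sketch the fused path needs only one multiply per score element and one per output element, instead of rescaling every input element before the GEMMs.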