Jeff Huang
0aa1142bb5
[CK] Add FP8 KV_BLOCKSCALE support for batch prefill
...
Implement per-page K/V quantization for paged attention:
- Add KV_BLOCKSCALE enum to BlockAttentionQuantScaleEnum
- Use exp2 shift trick to eliminate explicit P scaling overhead
- Prefetch physical pages offset for KV cache, overlaps with computations
2026-02-03 11:18:14 +08:00
..
2026-01-26 12:57:09 -08:00
2026-01-07 16:30:57 +01:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2025-12-18 07:59:45 +01:00
2025-11-28 13:49:54 -08:00
2025-10-16 03:10:57 -07:00
2026-01-07 16:30:57 +01:00
2026-01-07 16:30:57 +01:00
2026-01-15 16:43:02 +01:00
2026-01-13 07:14:23 +01:00
2026-01-07 16:30:57 +01:00
2025-12-18 07:59:45 +01:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2025-12-18 07:59:45 +01:00
2025-11-28 13:49:54 -08:00
2026-01-07 16:30:57 +01:00
2026-01-07 16:30:57 +01:00
2026-01-07 16:30:57 +01:00
2026-01-17 08:30:27 +01:00
2026-01-20 09:39:57 -08:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2026-01-17 08:30:27 +01:00
2026-01-07 16:30:57 +01:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2026-01-07 16:30:57 +01:00
2025-11-28 13:49:54 -08:00
2026-01-07 16:30:57 +01:00
2026-01-07 16:30:57 +01:00
2026-01-20 13:06:59 -08:00
2025-12-30 16:25:08 +01:00
2025-11-28 13:49:54 -08:00
2026-01-07 16:30:57 +01:00
2025-11-28 13:49:54 -08:00
2026-01-07 16:30:57 +01:00
2025-11-28 13:49:54 -08:00
2026-01-07 16:30:57 +01:00
2026-01-07 16:30:57 +01:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2026-02-02 13:58:11 -08:00
2026-01-29 10:29:40 -08:00
2025-11-28 13:49:54 -08:00
2026-01-30 17:02:14 +01:00
2025-11-28 13:49:54 -08:00
2025-11-28 13:49:54 -08:00
2026-01-15 16:43:02 +01:00
2025-11-28 13:49:54 -08:00
2026-01-07 16:30:57 +01:00
2026-01-15 16:43:02 +01:00
2026-01-15 16:43:02 +01:00
2026-02-03 11:18:14 +08:00
2026-01-14 07:31:45 -08:00
2024-12-04 00:46:47 +01:00