msaffari-amd
ee3ada6e4a
[AITERKER-112] PER_TOKEN_HEAD: support page_size < kN0 via cross-page dequant
...
- Pipeline: remove kPageBlockSize >= kN0 static_assert; QK dequant now
precomputes tile_k_pages[] and indexes per-column. page_size >= kN0 stays
on the original single-page fast path (kPagesPerTile==1).
- Codegen: add page_size=64 to SUPPORTED_PAGE_SIZE; drop per_token_head from
the page_size < tile.F_bn0 filter (kv_blockscale still filtered).
2026-05-20 14:21:12 +00:00
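
The cross-page lookup described in the Pipeline bullet can be sketched in plain Python. This is a minimal illustration, not the kernel code: the function names, the `block_table` list, and the `k_scale[page][offset]` layout are hypothetical stand-ins, and it assumes tiles start on a multiple of kN0 and that the larger of page_size and kN0 is a multiple of the smaller.

```python
def tile_k_pages(tile_start, kN0, page_size, block_table):
    """Return the physical page id for each page-sized slice of one K tile.

    block_table maps logical page index -> physical page id (hypothetical layout).
    """
    # Assumption: the tile never straddles a page boundary unevenly.
    assert kN0 % page_size == 0 or page_size % kN0 == 0
    pages_per_tile = max(1, kN0 // page_size)  # 1 when page_size >= kN0
    first_logical = tile_start // page_size
    return [block_table[first_logical + i] for i in range(pages_per_tile)]


def column_scales(tile_start, kN0, page_size, block_table, k_scale):
    """Gather one per-token scale for every column of the tile.

    k_scale[page][offset] is an assumed layout for the per-token-head scales.
    """
    pages = tile_k_pages(tile_start, kN0, page_size, block_table)
    scales = []
    for col in range(kN0):
        token = tile_start + col
        if len(pages) == 1:
            page = pages[0]                  # single-page fast path
        else:
            page = pages[col // page_size]   # per-column page lookup
        scales.append(k_scale[page][token % page_size])
    return scales
```

With page_size=64 and kN0=128, each tile spans two pages and the lookup switches pages at column 64; with page_size=256 the list has a single entry and every column reads from the same page.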