Commit Graph

19 Commits

Author  SHA1  Message  Date
turboderp  8877b99855  QCache: Skip dequant when possible outside of SWA window  2026-04-13 00:52:46 +02:00
turboderp  87830de104  Bump to v0.0.29  2026-04-12 02:27:23 +02:00
turboderp  6b8225ff41  Add bighead-attn and bighead-attn-paged kernels  2026-04-12 01:34:54 +02:00
lesj0610  69d6c6ad76  feat(gemma4): add multimodal generation support  2026-04-03 12:52:51 +09:00
lesj0610  f4bec2eb2e  feat(gemma4): add Gemma4 architecture support  2026-04-03 09:49:12 +09:00
turboderp  1592d04ffd  Tests: Fix up generator stress test  2026-03-22 18:16:33 +01:00
turboderp  15647d98d7  Sampling: Fix possible divide-by-zero in rep.penalty kernels  2026-03-22 18:02:48 +01:00
lesj0610  88062566f5  Qwen3.5: Smoke test  2026-03-02 15:49:29 +01:00
turboderp  0f2da5d6a7  GEMM: Lock MCG multiplier to 0xCBAC1FED and MUL1 to 0x83DCD12D. Make MCG the default codebook for new models.  2025-10-12 22:09:01 +02:00
turboderp  12eadfe114  Generator: Add requeue option  2025-09-22 03:34:31 +02:00
turboderp  1ff09ee3c4  Generator: Fix batching with recurrent states  2025-09-22 03:30:37 +02:00
turboderp  9be91d644d  Generator: Periodically defragment paged cache  2025-05-28 22:23:08 +02:00
turboderp  965f5b2f0a  Add cache_rotate kernel  2025-05-27 11:27:51 +02:00
turboderp  c82af98d57  New RoPE kernel with fused head norm  2025-05-25 13:33:39 +02:00
turboderp  d359bcc0d3  Add MCG 3INST and MCG 1MAD (MUL1) experimental quant modes  2025-05-21 19:15:13 +02:00
turboderp  40863d0690  Add repetition, presence and frequency penalties  2025-04-27 01:09:33 +02:00
turboderp  702ebab361  Add sampler test cases  2025-04-25 00:48:06 +02:00
turboderp  cf84811485  Add cache quantization  2025-04-22 21:52:33 +02:00
turboderp  543c4b2771  Initial commit  2025-04-06 14:42:49 +02:00