Default Branch

949bb8f1d6 · More MTP tweaks (#1792) · Updated 2026-05-13 14:55:43 +00:00

Branches

62755c24e8 · Fix ggml_nbytes · Updated 2026-05-13 14:28:06 +00:00

4
1

67735a4587 · More MTP tweaks · Updated 2026-05-13 09:37:26 +00:00

4
1

26591f2b57 · Cleanup · Updated 2026-05-13 05:10:41 +00:00

9
2

7ff12d64c0 · server: reset cache tokens after pp stops · Updated 2026-05-13 00:30:34 +00:00

9
1

68f36e7878 · Gemma4 MTP: avoid casting KV cache to f32 · Updated 2026-05-12 11:41:48 +00:00

9
1

1d9d2b7f1b · Fix GLM-4.5 MTP loading · Updated 2026-05-12 06:56:15 +00:00

9
1

7f29d4a670 · rebase branch with main · Updated 2026-05-12 01:53:33 +00:00

13
1

16369dbf0f · MTP: Reuse graphs (again) · Updated 2026-05-11 15:31:15 +00:00

13
1

b28ddd49e3 · Cleanup · Updated 2026-05-11 12:22:50 +00:00

14
4

54262626b7 · Avoid recurrent state copy · Updated 2026-05-11 09:43:22 +00:00

15
1

d81090541b · MTP: enable per step recurrent state for split mode graph · Updated 2026-05-10 12:47:30 +00:00

18
1

e7f8d7cdbd · Fix Mistral3 split mode graph · Updated 2026-05-10 05:46:40 +00:00

18
1

f6deca0f97 · Faster per step recurrent state restore when using MTP · Updated 2026-05-09 13:31:03 +00:00

20
1

43df4192d6 · Avoid some code duplication · Updated 2026-05-08 13:46:10 +00:00

25
2

010da571be · Use async copies to save/restore recurrent state · Updated 2026-05-08 13:04:00 +00:00

25
1

d0c4dd6c55 · Fix discarding tokens from the KV cache during MTP drafting · Updated 2026-05-08 04:51:59 +00:00

26
1

9e05954460 · server: fix mtmd checkpoint restore and avoid checkpoint host copies · Updated 2026-05-06 00:57:43 +00:00

32
1

2a2ea4c9df · MTP tweaks · Updated 2026-05-05 16:16:18 +00:00

32
1

710dc0879a · Cleanup · Updated 2026-05-04 12:37:27 +00:00

38
2

b2c9fd1524 · Minor MTP improvement · Updated 2026-05-04 05:52:04 +00:00

38
1