ik_llama.cpp/common
firecoperana c1931663ad server: improve speed of speculative decoding (#1119)
* server: improve speed of speculative decoding

change logs

rpc: add recompute

spec dec fix

* Fix n_batch_size not set to context size for draft model

---------

Co-authored-by: firecoperana <firecoperana>
2026-01-10 08:01:22 +02:00