mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-05 22:10:10 +00:00
* server: improve speed of speculative decoding
  change logs
  rpc: add recompute
  spec dec fix
* Fix n_batch_size not set to context size for draft model

---------

Co-authored-by: firecoperana <firecoperana>