ik_llama.cpp/ggml/include/ggml-rpc.h at c1931663adc990e4ae17f1d2a0f99b1881e64880

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-27 18:01:45 +00:00

firecoperana c1931663ad server: improve speed of speculative decoding (#1119)

* server: improve speed of speculative decoding

change logs

rpc: add recompute

spec dec fix

* Fix n_batch_size not set to context size for draft model

---------

Co-authored-by: firecoperana <firecoperana>

2026-01-10 08:01:22 +02:00

976 B
