ik_llama.cpp/examples/server/server-context.cpp
SamuelOliveirads d93dfb5e6b fix: save/restore sampler state during speculative checkpoint
When speculative decoding rejects draft tokens and restores the
recurrent state checkpoint, the sampler state (RNG, grammar, previous
tokens) must be restored as well so that it stays consistent with the
model state. Without this, the sampler still reflects the rejected
draft tokens, so subsequent sampling can diverge from the restored
context.

Uses common_sampler_clone() to snapshot the sampler before the
speculative batch decode, and restores it on rejection.
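
The snapshot-and-restore pattern described above can be sketched in isolation. This is a minimal illustration and not the code in server-context.cpp: ToySampler, clone_sampler() and restore_matches_baseline() are hypothetical names, the sampler holds only an RNG and a token history (grammar state is omitted), and a plain struct copy stands in for common_sampler_clone().

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for the real sampler. It carries only two of the
// pieces of state named in the commit message: the RNG and the token history.
struct ToySampler {
    std::mt19937 rng;
    std::vector<int> prev;

    explicit ToySampler(uint32_t seed) : rng(seed) {}

    // Sampling mutates the sampler: it advances the RNG and records the token.
    int sample(int vocab_size) {
        std::uniform_int_distribution<int> dist(0, vocab_size - 1);
        int tok = dist(rng);
        prev.push_back(tok);
        return tok;
    }
};

// Plays the role of common_sampler_clone() in this sketch: a deep copy.
ToySampler clone_sampler(const ToySampler & s) {
    return s;
}

// Returns true if, after a fully rejected draft of n_draft tokens, restoring
// the snapshot makes the sampler behave like one that never speculated.
bool restore_matches_baseline(uint32_t seed, int n_draft) {
    const int vocab = 32000;

    ToySampler baseline(seed); // never sees the draft tokens
    ToySampler spec(seed);     // goes through the speculative path

    // Snapshot before the speculative batch is processed.
    ToySampler snapshot = clone_sampler(spec);

    // Processing draft tokens advances the RNG and grows the history.
    for (int i = 0; i < n_draft; ++i) {
        spec.sample(vocab);
    }
    assert(spec.prev.size() == static_cast<size_t>(n_draft));

    // Draft rejected: roll the sampler back together with the model state.
    spec = snapshot;

    int a = spec.sample(vocab);
    int b = baseline.sample(vocab);
    return a == b && spec.prev == baseline.prev;
}
```

Calling restore_matches_baseline(42, 4) returns true because copying std::mt19937 copies its full internal state. If the `spec = snapshot` line were dropped, the token history would keep the four rejected entries and the RNG would be four draws ahead, which is the inconsistency the commit addresses.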
2026-04-16 22:36:37 -03:00

172 KiB