mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-05-12 00:50:22 +00:00
Wrap the two slot-level sample/accept call sites in try/catch (std::exception). On exception: log the error, send_error to the task, release the slot, and continue serving. This matches the existing try/catch around common_sampler_init in the same file. Without this guard, llama_grammar_accept_token throwing "Unexpected empty grammar stack after accepting piece: <pad> (0)" (reproducible on Gemma 4 with json_schema + ctx_shift, see #1725) unwinds out of update_slots -> queue start_loop -> main, hits std::terminate, and aborts the whole server process.
182 KiB