ik_llama.cpp/examples
firecoperana bacb8fb79f Server: Handle context shift better to reduce prompt processing time (#973)
* Handle context shift better to reduce pp

Add context-shift args

Add back ga_n in context shift

* optimize discard function and bring back n_keep = -1

---------

Co-authored-by: firecoperana <firecoperana>
2025-11-19 16:04:48 +01:00