commit e1c4c4a495
Author: hksdpc255
Date:   2026-01-13 08:37:29 +02:00

    Fix Anthropic Messages API (#1136)

    * server: stop processing the prompt when client disconnects
      implement generator-based API for task results
      Update httplib.h to 0.27.0
      Fix embedding error
      Stop prompt processing when disconnected
    * Port upstream https://github.com/ggml-org/llama.cpp/pull/18551
    * add back anthropic
    * Fix merge issue caused by github webui

    Co-authored-by: firecoperana <firecoperana>

commit 1a461525d5
Author: firecoperana
Date:   2026-01-13 07:56:59 +02:00

    server: stop processing the prompt when client disconnects (#1134)

    implement generator-based API for task results
    Update httplib.h to 0.27.0
    Fix embedding error
    Stop prompt processing when disconnected

    Co-authored-by: firecoperana <firecoperana>

commit 2a633c4357
Author: firecoperana
Date:   2025-12-22 09:46:45 +01:00

    server: exclude thinking tokens when finding the slot (#1079)

    refactor find slot
    enable by default
    Fix load prompt
    rename variables

    Co-authored-by: firecoperana <firecoperana>

commit 0e91b89cd3
Author: firecoperana
Date:   2025-12-15 08:27:20 +01:00

    Refactor chat and server file (#1062)

    * Add alternative log functions
    * chat: fix int overflow, prevent size calculation in float/double (#17357)
      Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    * common : move all common_chat_parse_* to chat-parser.cpp. (#17481)
      # Conflicts:
      #   common/chat.cpp
    * server: split server.cpp code into server/common/task/queue/context
    * Fix compiler warning
    * Clean up code
    * common: use native MultiByteToWideChar
    * move server prompt to server task
    * Clean code
    * delete utils.hpp

    Co-authored-by: firecoperana <firecoperana>
    Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    Co-authored-by: DAN™ <dranger003@gmail.com>