dungquixote42
|
6695c6c945
|
Implement Adaptive-P Sampler (#1100)
* initial implementation of adaptive-p sampler
* explicitly mark candidates unsorted + cleanup qualifiers
* cosmetic update
* reorg prototypes
* lockstep with mainline
* add _impl for _init + reorg
* add LLAMA_API to prototypes
* update sharpness to 10
* lockstep: rng seed
* delete llama_sampling member in llama_sampler_adaptive_p
* fix LLAMA_API return type
* lockstep: rng seed cont
* actually correct implementation
* lockstep: sorting behavior
* const -> constexpr for known constants
* add missing space
* fix softmax usage in adaptive p sampler
* cosmetic changes
* implement do-not-sort version of softmax
* simplify rng seed, add static to constexpr
* refactor: remove iface + use shared rng + use actually original probabilities
* adaptive-p: add dedicated rng back in
* fix initial max_logit + add float vector to adaptive p sampler context + stochastic sampling
* adaptive-p: fuse first softmax with transformation
* adaptive-p: implement binary search selection
* adaptive-p: update comment
|
2026-01-10 07:58:53 +02:00 |
|
firecoperana
|
090f354d33
|
Refactor chat and server file (#1062)
* Add alternative log functions
* chat: fix int overflow, prevent size calculation in float/double (#17357)
* chat: fix int overflow, prevent size calculation in float/double
* Update common/chat.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* common : move all common_chat_parse_* to chat-parser.cpp. (#17481)
# Conflicts:
# common/chat.cpp
* server: split server.cpp code into server/common/task/queue/context
* Fix compiler warning
* Clean up code
* common: use native MultiByteToWideChar
* move server prompt to server task
* Clean code
* delete utils.hpp
---------
Co-authored-by: firecoperana <firecoperana>
Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: DAN™ <dranger003@gmail.com>
|
2025-12-15 08:27:20 +01:00 |
|