Adding top-n-sigma sampler (#489)

* Adding top-n-sigma sampler

* Fix typos in XTC PR

* Update README.md for main and server

* More README

* More README

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Author: Kawrakow
Date: 2025-06-03 17:35:09 +03:00
Parent: ccb265c016
Commit: f6d5fbdc57
9 changed files with 115 additions and 11 deletions

@@ -1216,6 +1216,13 @@ extern "C" {
               float   threshold,
               size_t  min_keep);

    /// @details Top-n-sigma sampling as described in the paper "Top-nσ: Not All Logits Are You Need" (https://arxiv.org/pdf/2411.07641)
    LLAMA_API void llama_sample_top_n_sigma(
              struct llama_context * ctx,
            llama_token_data_array * candidates_p,
                             float   top_n_sigma);
    /// @details Mirostat 1.0 algorithm described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.
    /// @param candidates A vector of `llama_token_data` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.
    /// @param tau The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.