common : introduce composable PEG parser combinators for chat parsing and new jinja template engine (#1369)

---------

Co-authored-by: Piotr Wilkin <piotr.wilkin@syndatis.com>

common : add nemotron 3 parsing (#18077)

common : add parser for ministral/mistral large 3/devstral 2 (#17713)

common : default content to an empty string (#18485)

chat: make tool description and parameters optional per OpenAI spec (#18478)

Per the OpenAI API specification, both 'description' and 'parameters' fields in tool function definitions are optional. Previously, the parser would throw an exception if these fields were missing.

Attempts to fix #17667

common : implement new jinja template engine (#18462)

---------

Co-authored-by: Alde Rojas <hello@alde.dev>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

jinja: correct member access rule (#18905)

jinja : fix lexing of float literals with sign (#18901)

jinja : add missing tojson filter for bool (#18900)

jinja : attribute support for join, map and sort (#18883)

jinja : fix object item order (and properly implement dictsort) (#18904)

tests : add test-jinja -py option for cross-checking (#18906)

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

ci : run test-jinja -py on high perf [no ci] (#18916)

jinja : fix undefined keys and attributes and int/float as bool (#18924)

jinja: support none|string (#18995)

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

jinja : implement mixed type object keys (#18955)

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

jinja : undefined should be treated as sequence/iterable (return string/array) by filters/tests (#19147)

`tojson` is not a supported `undefined` filter

keep it DRY and fix some types

jinja : do not pass empty tools and add some none filters (#19176)

jinja : add unordered_map include to value.h [no ci] (#19205)

jinja : add missing 'in' test to template engine (#19004) (#19239)

The jinja template parser was missing the 'in' test from global_builtins(), causing templates using reject("in", ...), select("in", ...), or 'x is in(y)' to fail with "selectattr: unknown test 'in'".

This broke tool-calling for Qwen3-Coder and any other model whose chat template uses the 'in' test.

Added test_is_in supporting array, string, and object containment checks, mirroring the existing 'in' operator logic in runtime.cpp.

Includes test cases for all three containment types plus reject/select filter usage.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Sid Mohan <sidmohan0@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

Add Jinja support for "indent" string filter (#19529)

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

add vendor

refactor chat

server : support preserving reasoning_content in assistant message (#18994)

chat : fix translategemma crash on common_chat_format_example (#19019)

chat: fix language input for translategemma (#19052)

Co-authored-by: Aldehir Rojas <hello@alde.dev>

---------

Co-authored-by: Aldehir Rojas <hello@alde.dev>

chat: fix case where template accepts type content only (#19419)

mtmd : chat : Fix extra \n between text and media marker (#19595)

Thanks to @tugot17 for detecting and reporting the issue.

For vision models (e.g. LFM2.5-VL-1.6B and Qwen/Qwen3-VL-4B-Instruct) `llama-mtmd-cli` produces identical output to HF implementation. However `llama-server` doesn't. I traced it down to extra newline inserted after `<__media__>`.

This happens in `to_json_oaicompat`, that treats media markers as text and joins all parts with `\n` separator.

PR introduces new type `media_marker` and uses it for media markers. Extra logic is added to prevent insertion of newlines before and after media markers. With this change number of input tokens is identical to HF implementation and as a result the output is also identical.

I explored other ways to address the issue
* remove completely `\n` between text parts in `to_json_oaicompat`
* merge text messages in server-common.cpp before sending them to `to_json_oaicompat`

Please propose alternative ways of fixing this issue.

Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>

---------

Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>

common : merge qwen3-coder and nemotron nano 3 parsers (#19765)

common : fix improper trimming in XML parser on complete message (#19805)

Co-authored-by: Jules LEIDELINGER <11395311+julio75012@users.noreply.github.com>

jinja: correct stats for tojson and string filters (#19785)

jinja : correct default size for string slices (#19913)

common : handle unicode during partial json parsing (#16526)

common : fix json schema with '\' in literals (#17307)

add back qwen_coder_xml and mirothinker

Co-authored-by: Aldehir Rojas <hello@alde.dev>
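
The media-marker fix described for #19595 can be illustrated with a minimal sketch. It is not the code from the PR: the `part` struct and the `join_parts` helper are hypothetical names invented here, and only the `media_marker` type name and the `\n` separator come from the message above. The idea shown is that a newline goes between two adjacent text parts, but never directly before or after a media marker.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical content part, for illustration only (not the struct from common/chat.h).
struct part {
    std::string type; // "text" or "media_marker"
    std::string text;
};

// Concatenate parts, inserting '\n' only between two adjacent non-marker parts.
// Assumption: a marker is identified by type == "media_marker".
std::string join_parts(const std::vector<part> & parts) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) {
            const bool prev_is_marker = parts[i - 1].type == "media_marker";
            const bool curr_is_marker = parts[i].type == "media_marker";
            if (!prev_is_marker && !curr_is_marker) {
                out += '\n'; // separator between plain text parts only
            }
        }
        out += parts[i].text;
    }
    return out;
}
```

With this rule, a text part followed by `<__media__>` and another text part is joined without any newline, while two consecutive text parts still get the `\n` separator.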
2026-04-29 19:01:47 +00:00 · 2026-03-09 05:03:33 -05:00
parent 542988773c
commit ab1d74074b
71 changed files with 19098 additions and 4751 deletions
--- a/common/chat.h
+++ b/common/chat.h
@@ -3,12 +3,15 @@
 #pragma once

 #include "common.h"
+#include "peg-parser.h"
 #include <functional>
 #include <chrono>
 #include <string>
 #include <vector>
 #include <map>

+#include <nlohmann/json_fwd.hpp>
+
 struct common_chat_templates;

 struct common_chat_tool_call {
@@ -25,6 +28,11 @@ struct common_chat_msg_content_part {
    std::string type;
    std::string text;

+    // TODO @ngxson : no known chat templates support reasoning_content in content parts yet
+    //                this can be useful for models with interleaved thinking (like Kimi-K2)
+    //                if you see any templates explicitly support this, please ping me
+    // std::string reasoning_content;
+
    bool operator==(const common_chat_msg_content_part & other) const {
        return type == other.type && text == other.text;
    }
@@ -39,7 +47,7 @@ struct common_chat_msg {
    std::string tool_name;
    std::string tool_call_id;

-    template <class T> T to_json_oaicompat() const;
+    nlohmann::ordered_json to_json_oaicompat(bool concat_typed_text = false) const;

    bool empty() const {
        return content.empty() && content_parts.empty() && tool_calls.empty() && reasoning_content.empty() && tool_name.empty() && tool_call_id.empty();
@@ -125,6 +133,11 @@ enum common_chat_format {
    COMMON_CHAT_FORMAT_XIAOMI_MIMO,
    COMMON_CHAT_FORMAT_MIROTHINKER,

+    // These are intended to be parsed by the PEG parser
+    COMMON_CHAT_FORMAT_PEG_SIMPLE,
+    COMMON_CHAT_FORMAT_PEG_NATIVE,
+    COMMON_CHAT_FORMAT_PEG_CONSTRUCTED,
+
    COMMON_CHAT_FORMAT_COUNT, // Not a format, just the # formats
 };

@@ -155,6 +168,7 @@ struct common_chat_params {
    std::vector<common_grammar_trigger> grammar_triggers;
    std::vector<std::string>            preserved_tokens;
    std::vector<std::string>            additional_stops;
+    std::string                         parser;
 };

 struct common_chat_syntax {
@@ -164,6 +178,7 @@ struct common_chat_syntax {
    bool                     reasoning_in_content  = false;
    bool                     thinking_forced_open  = false;
    bool                     parse_tool_calls      = true;
+    common_peg_arena         parser                = {};
 };

 // Check if the template supplied via "--chat-template" is supported or not. Returns true if it's valid
@@ -182,8 +197,7 @@ common_chat_templates_ptr common_chat_templates_init(
                                           const std::string & eos_token_override = "");

 bool         common_chat_templates_was_explicit(const struct common_chat_templates * tmpls);
-const char * common_chat_templates_source(const struct common_chat_templates * tmpls, const char * variant = nullptr);
-
+std::string  common_chat_templates_source(const struct common_chat_templates * tmpls, const std::string & variant = "");

 struct common_chat_params      common_chat_templates_apply(
    const struct common_chat_templates * tmpls,
@@ -207,19 +221,22 @@ const char*               common_chat_format_name(common_chat_format format);
 const char*               common_reasoning_format_name(common_reasoning_format format);
 common_reasoning_format   common_reasoning_format_from_name(const std::string& format);
 common_chat_msg           common_chat_parse(const std::string & input, bool is_partial, const common_chat_syntax & syntax);
+common_chat_msg           common_chat_peg_parse(const common_peg_arena & parser, const std::string & input, bool is_partial, const common_chat_syntax & syntax);

 common_chat_tool_choice common_chat_tool_choice_parse_oaicompat(const std::string & tool_choice);

 bool common_chat_templates_support_enable_thinking(const common_chat_templates * chat_templates);

 // Parses a JSON array of messages in OpenAI's chat completion API format.
-// T can be std::string containing JSON or nlohmann::ordered_json
-template <class T> std::vector<common_chat_msg> common_chat_msgs_parse_oaicompat(const T & messages);
-template <class T> T common_chat_msgs_to_json_oaicompat(const std::vector<common_chat_msg> & msgs, bool concat_typed_text = false);
+std::vector<common_chat_msg> common_chat_msgs_parse_oaicompat(const nlohmann::ordered_json & messages);

-// Parses a JSON array of tools in OpenAI's chat completion tool call API format.
-// T can be std::string containing JSON or nlohmann::ordered_json
-template <class T> std::vector<common_chat_tool> common_chat_tools_parse_oaicompat(const T & tools);
-template <class T> T common_chat_tools_to_json_oaicompat(const std::vector<common_chat_tool> & tools);
+// DEPRECATED: only used in tests
+nlohmann::ordered_json common_chat_msgs_to_json_oaicompat(const std::vector<common_chat_msg> & msgs, bool concat_typed_text = false);

-template <class T> T common_chat_msg_diff_to_json_oaicompat(const common_chat_msg_diff & diff);
+std::vector<common_chat_tool> common_chat_tools_parse_oaicompat(const nlohmann::ordered_json & tools);
+nlohmann::ordered_json common_chat_tools_to_json_oaicompat(const std::vector<common_chat_tool> & tools);
+
+nlohmann::ordered_json common_chat_msg_diff_to_json_oaicompat(const common_chat_msg_diff & diff);
+
+// get template caps, useful for reporting to server /props endpoint
+std::map<std::string, bool> common_chat_templates_get_caps(const common_chat_templates * chat_templates);
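
As a rough sketch of how a caller might report the `std::map<std::string, bool>` returned by `common_chat_templates_get_caps`, the helper below serializes such a map into a compact JSON object using only the standard library. The function name `caps_to_json` and any capability keys passed to it are assumptions made for illustration; the diff does not show which keys the real function returns, and the actual server code presumably uses `nlohmann::ordered_json` instead.

```cpp
#include <map>
#include <string>

// Hypothetical helper: render a capability map as a compact JSON object.
// Assumption: keys are plain identifiers that need no JSON escaping.
std::string caps_to_json(const std::map<std::string, bool> & caps) {
    std::string out = "{";
    bool first = true;
    for (const auto & kv : caps) { // std::map iterates in sorted key order
        if (!first) {
            out += ",";
        }
        first = false;
        out += "\"" + kv.first + "\":" + (kv.second ? "true" : "false");
    }
    out += "}";
    return out;
}
```

Because `std::map` keeps its keys sorted, the serialized object has a stable field order regardless of insertion order.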