mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-01-26 17:20:01 +00:00
* Implement function calling / tools for ik_llama.cpp for Kimi K2
* Implement basic tool choice
* Backport llama.cpp tool calls support
* Enhance function calls with improved chat parser and string utilities
- Add new chat.h/chat.cpp and chat-parser.h/chat-parser.cpp for better chat handling
- Improve function calls parsing with fallback to llama.cpp builder pattern
- Add string utility functions (starts_with, ends_with, find_partial_stop)
- Update README with function calls testing instructions
- Enhance Kimi K2 parser and function calls documentation
- Add comprehensive test suite for function calls
- Update CMakeLists.txt and Makefile for new components
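The string utilities named above can be pictured with a minimal sketch like the following. The signatures and bodies here are assumptions for illustration, not the definitions shipped in the repository; in particular, `find_partial_stop` is assumed to return the offset at which a prefix of the stop string begins at the very end of the text, which is what a streaming server needs in order to hold back text that might turn into a tool-call marker.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical sketch: true if `s` begins with `prefix`.
inline bool starts_with(const std::string & s, const std::string & prefix) {
    return s.size() >= prefix.size() && s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical sketch: true if `s` ends with `suffix`.
inline bool ends_with(const std::string & s, const std::string & suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical sketch: if the tail of `text` is a (possibly complete) prefix of
// `stop`, return the index in `text` where that tail starts; otherwise npos.
// The longest matching tail wins.
inline size_t find_partial_stop(const std::string & text, const std::string & stop) {
    const size_t max_len = std::min(text.size(), stop.size());
    for (size_t len = max_len; len > 0; --len) {
        if (text.compare(text.size() - len, len, stop, 0, len) == 0) {
            return text.size() - len;
        }
    }
    return std::string::npos;
}
```

With this sketch, `find_partial_stop("hello <tool", "<tool_call>")` yields 6, so a streaming caller could emit `"hello "` and keep `"<tool"` buffered until more tokens arrive.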
* Enhance function calling with unified streaming and parser improvements
- Fix streaming content cleanup to prevent function syntax in output
- Unify content extraction patterns with llama.cpp approach
- Improve Kimi K2 parser robustness and partial content handling
- Add comprehensive test coverage for function call scenarios
- Optimize chat message parsing and diff computation
* Replace hardcoded values in kimi_k2_parser.hpp with named constants
- Add compile-time constants for all token format markers
- Add compile-time constants for XML format markers
- Add compile-time constants for simple format patterns
- Replace all hardcoded string literals with named constants
- Use compile-time length calculation to avoid manual counting
- Improve maintainability and reduce magic numbers throughout parser
* Fix duplicate common_chat_parse definition
- Remove duplicate implementation from chat-parser.cpp
- Keep single implementation in chat.cpp following llama.cpp patterns
- Resolves linker error: multiple definition of common_chat_parse
* Fix JSON assertion failure in function call parsing
- Add proper validation that 'function' field is an object before accessing nested keys
- Handle missing 'arguments' field gracefully with default "{}"
- Prevents crash when parsing malformed tool call JSON structures
* Add comprehensive Qwen3 XML tool calling support with unit tests
- Implement Qwen3 XML parser with <tool_call>{"name": "func", "arguments": {...}}</tool_call> format
- Add model detection and routing for Qwen3 vs Kimi-K2 formats
- Create 8 comprehensive unit tests covering parsing, streaming, error handling
- Fix token format cleaning bug in kimi_k2_parser.hpp processing order
- Remove progressive parsing code and related utilities
- Add tool injection support for Qwen3 format in server utils
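As a rough illustration of the Qwen3 format described above, the sketch below separates plain content from the raw text inside each `<tool_call>...</tool_call>` block. The type and function names (`extracted_tool_calls`, `extract_tool_call_blocks`) are hypothetical, and the sketch deliberately stops short of decoding the JSON payload; it is not the parser added by this commit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct extracted_tool_calls {
    std::string content;               // text outside tool-call blocks
    std::vector<std::string> payloads; // raw JSON text inside each block
};

// Hypothetical helper: scan for <tool_call>...</tool_call> blocks.
inline extracted_tool_calls extract_tool_call_blocks(const std::string & input) {
    static const std::string open_tag  = "<tool_call>";
    static const std::string close_tag = "</tool_call>";
    extracted_tool_calls out;
    size_t pos = 0;
    while (pos < input.size()) {
        const size_t start = input.find(open_tag, pos);
        if (start == std::string::npos) {
            out.content += input.substr(pos); // no more blocks: the rest is content
            break;
        }
        out.content += input.substr(pos, start - pos);
        const size_t body = start + open_tag.size();
        const size_t end  = input.find(close_tag, body);
        if (end == std::string::npos) {
            // Unterminated block (for example mid-stream): keep it out of the
            // content so that tool-call syntax never reaches the user.
            break;
        }
        out.payloads.push_back(input.substr(body, end - body));
        pos = end + close_tag.size();
    }
    return out;
}
```

Each entry in `payloads` would then be handed to a JSON parser to read the `name` and `arguments` fields.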
* Add DeepSeek R1 function calling support with comprehensive unit tests
- Implement complete DeepSeek R1 tool call parsing in common_chat_parser.cpp
- Add DeepSeek R1 model detection and tool injection in deepseek_r1_tools.hpp
- Update function_calls.hpp with DeepSeek R1 integration and content extraction
- Update documentation to reflect support for Kimi-K2, Qwen3, and DeepSeek R1 models
- Add comprehensive unit tests for DeepSeek R1 reasoning, tool calls, and integration
- Port exact implementation patterns from original llama.cpp for compatibility
Key features:
- Native DeepSeek R1 format: <|tool▁calls▁begin|>function<|tool▁sep|>name```json{}```<|tool▁call▁end|><|tool▁calls▁end|>
- Reasoning content extraction from <think>...</think> tags
- Multiple tool calls support with separate call blocks
- Model detection for deepseek-r1, deepseek_r1 naming patterns
- Integration with incremental parsing and streaming support
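The reasoning extraction listed above can be sketched as follows, assuming the model output begins with a `<think>` block. `reasoning_split` and `split_reasoning` are hypothetical names used only for this example; the actual logic lives behind `try_parse_reasoning` and may differ, for instance in how it trims whitespace.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct reasoning_split {
    std::string reasoning; // text between the think tags
    std::string content;   // text after the closing tag
    bool is_partial;       // true if the closing tag has not arrived yet
};

// Hypothetical helper: split a leading <think>...</think> block from the rest.
inline reasoning_split split_reasoning(const std::string & input,
                                       const std::string & start_think = "<think>",
                                       const std::string & end_think   = "</think>") {
    reasoning_split out{"", "", false};
    if (input.compare(0, start_think.size(), start_think) != 0) {
        out.content = input; // no reasoning block at the start
        return out;
    }
    const size_t body = start_think.size();
    const size_t end  = input.find(end_think, body);
    if (end == std::string::npos) {
        // Streaming case: everything so far is still reasoning.
        out.reasoning  = input.substr(body);
        out.is_partial = true;
        return out;
    }
    out.reasoning = input.substr(body, end - body);
    out.content   = input.substr(end + end_think.size());
    return out;
}
```

For a complete message the reasoning and the visible content come back separately; for a truncated stream the `is_partial` flag tells the caller to wait before emitting content.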
* Add partial parsing support for JSON and regex
- json-partial.h/cpp: JSON partial parsing functionality
- regex-partial.h/cpp: Regex partial parsing functionality
* Add format_chat integration tests for Qwen3 tool injection
- Add test_qwen3_format_chat_integration() to validate tool injection pipeline
- Test tool injection conditions and system message enhancement
- Verify JSON formatting and anti-preamble instructions
- Add comprehensive test documentation
Tests confirm tool injection works correctly - conversational preamble
issue is not in ik_llama.cpp but likely in UI configuration.
* Fix Qwen3 tool call parsing - pass model name to parser
Server was not passing model name to parse_chat_message_incremental(),
causing Qwen3 to fall back to Kimi-K2 parser and return tool calls
as content instead of proper tool_calls array.
* Fix non-streaming path to use model-specific parsing
Non-streaming responses were hardcoded to use Kimi-K2 format,
causing Qwen3 XML tool calls to be returned as content instead
of proper tool_calls array. Now uses same model detection as
streaming path for consistency.
* Update Qwen3 function call handling in server and tests
- Enhanced server function call detection and response formatting
- Improved test coverage for Qwen3 tool call scenarios
- Refined XML parsing for better tool execution support
* Add DeepSeek-R1 function call parsing support
Implements comprehensive parsing for all 4 DeepSeek-R1 function call formats:
- Format 1: Standard function call syntax (already supported)
- Format 2: Alternative function call patterns (already supported)
- Format 3: Tools array format - function\n```json\n{"tools": [...]}
- Format 4: XML wrapped format - <tool_call>function</think>Name\n```json\n{...}```</tool_call>
Key changes:
- Added parse_deepseek_r1_tools_array() following original parse_prefixed_json_tool_call_array pattern
- Added parse_deepseek_r1_xml_wrapped() following Hermes-2-Pro XML wrapper patterns
- Integrated both parsers into exception handling chain for robust fallback
- Added comprehensive TDD test coverage for all formats
- Anonymized all confidential information while preserving functionality
Resolves tool_calls_count=0 issue where DeepSeek-R1 models generated valid tool calls
but server failed to parse them correctly.
* Update function_calls.md documentation for DeepSeek-R1 Format 4
- Added Format 4 (XML wrapped) documentation with examples
- Updated implementation notes with correct parser order (3→4→1→2)
- Marked all DeepSeek-R1 formats as working (July 2025 update)
- Updated test status for Format 3 and 4 as passing
- Added parse_deepseek_r1_xml_wrapped() function reference
- Corrected implementation file line numbers
* Fix merge conflict in test-function-calls.cpp
- Removed incomplete merge conflict marker from line 3027
- Ensured all tests compile and pass successfully
- All DeepSeek-R1 formats (1-4) working correctly
- All streaming and content cleaning tests passing
136 lines
4.4 KiB
C++
// Chat parser with builder pattern for incremental parsing
#pragma once

#include "chat.h"
#include "json-partial.h"
#include "regex-partial.h"
#include <optional>
#include <stdexcept> // std::runtime_error, used by move_to() and move_back()
#include <string>
#include <vector>

using json = nlohmann::ordered_json;

class common_chat_msg_parser {
    std::string input_;
    bool is_partial_;
    common_chat_syntax syntax_;
    std::string healing_marker_;

    size_t pos_ = 0;
    common_chat_msg result_;

public:
    struct find_regex_result {
        std::string prelude;
        std::vector<common_string_range> groups;
    };

    common_chat_msg_parser(const std::string & input, bool is_partial, const common_chat_syntax & syntax);

    // Accessors
    const std::string & input() const { return input_; }
    size_t pos() const { return pos_; }
    const std::string & healing_marker() const { return healing_marker_; }
    const bool & is_partial() const { return is_partial_; }
    const common_chat_msg & result() const { return result_; }
    const common_chat_syntax & syntax() const { return syntax_; }

    // Position manipulation
    void move_to(size_t pos) {
        if (pos > input_.size()) {
            throw std::runtime_error("Invalid position!");
        }
        pos_ = pos;
    }

    void move_back(size_t n) {
        if (pos_ < n) {
            throw std::runtime_error("Can't move back that far!");
        }
        pos_ -= n;
    }

    // Get the substring of the input at the given range
    std::string str(const common_string_range & rng) const;

    // Content manipulation
    void add_content(const std::string & content);
    void add_reasoning_content(const std::string & reasoning_content);

    // Tool call manipulation
    void add_tool_call(const common_chat_tool_call & tool_call);
    bool add_tool_call(const std::string & name, const std::string & id, const std::string & arguments);
    bool add_tool_call(const json & tool_call);
    bool add_tool_calls(const json & arr);
    void clear_tools();

    // Parsing utilities
    std::string consume_rest();
    bool try_consume_literal(const std::string & literal);
    void consume_literal(const std::string & literal);
    bool try_parse_reasoning(const std::string & start_think, const std::string & end_think);

    // Regex-based parsing methods (new)
    std::optional<find_regex_result> try_find_regex(const common_regex & regex, size_t from = std::string::npos, bool add_prelude_to_content = true);
    find_regex_result consume_regex(const common_regex & regex);
    std::optional<find_regex_result> try_consume_regex(const common_regex & regex);

    // Progressive parsing primitives (for Phase 4)
    std::optional<find_regex_result> try_find_literal(const std::string & literal);
    bool consume_spaces();
    void set_healing_marker(const std::string & marker);

    // Main parsing entry point
    void parse();

    // Finishing
    void finish();

    // Result extraction
    common_chat_msg result_and_reset();

    // Advanced JSON parsing (following original llama.cpp patterns)
    struct consume_json_result {
        json value;
        bool is_partial;
    };

    std::optional<common_json> try_consume_json();
    common_json consume_json();
    consume_json_result consume_json_with_dumped_args(
        const std::vector<std::vector<std::string>>& args_paths = {},
        const std::vector<std::vector<std::string>>& content_paths = {}
    );
    std::optional<consume_json_result> try_consume_json_with_dumped_args(
        const std::vector<std::vector<std::string>>& args_paths = {},
        const std::vector<std::vector<std::string>>& content_paths = {}
    );

private:
    // Internal parsing helpers
    void parse_kimi_k2_format();
    void parse_deepseek_r1_format();
    void parse_generic_format();

    // JSON parsing utilities (enhanced streaming support)
    struct json_parse_result {
        json value;
        bool success;
        bool is_partial;
        std::string healing_marker;
    };

    // Partial detection utilities
    bool detect_partial_function_call(const std::string& content);
    void handle_partial_detection();

    // Legacy find_literal for compatibility
    std::optional<find_regex_result> try_find_literal_legacy(const std::string & literal);
};

// Main parsing function (public API)
common_chat_msg common_chat_parse(const std::string & input, bool is_partial, const common_chat_syntax & syntax);

// Content-only parsing for fallback scenarios (static internal function)
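
The cursor-style primitives declared in `common_chat_msg_parser` (`pos()`, `try_consume_literal`, `consume_literal`, `consume_rest`) are only declared in this header, so their behavior is not visible here. The sketch below shows one plausible reading of them on a stand-in class, `mini_cursor`, which is a hypothetical name. It ignores partial input, healing markers and the result message, all of which the real parser has to handle, so treat it as an illustration of the pattern and not as the repository's implementation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the cursor part of common_chat_msg_parser.
class mini_cursor {
    std::string input_;
    size_t pos_ = 0;

public:
    explicit mini_cursor(std::string input) : input_(std::move(input)) {}

    size_t pos() const { return pos_; }

    // Advance past `literal` if it appears at the cursor; otherwise leave the
    // cursor untouched and report failure.
    bool try_consume_literal(const std::string & literal) {
        if (input_.compare(pos_, literal.size(), literal) != 0) {
            return false;
        }
        pos_ += literal.size();
        return true;
    }

    // Same as above, but a missing literal is an error (assumed behavior).
    void consume_literal(const std::string & literal) {
        if (!try_consume_literal(literal)) {
            throw std::runtime_error("Expected literal: " + literal);
        }
    }

    // Return everything from the cursor to the end and move the cursor there.
    std::string consume_rest() {
        std::string rest = input_.substr(pos_);
        pos_ = input_.size();
        return rest;
    }
};
```

The split between a `try_` variant that returns a flag and a throwing variant lets format-specific parsers probe for optional markers cheaply while still failing loudly on required ones.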