mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-01-26 17:20:01 +00:00
Deepseek R1 function calls (more formats) (#652)
* Implement function calling / tools for ik_llama.cpp for Kimi K2
* Implement basic tool choice
* Backport llama.cpp tool calls support
* Enhance function calls with improved chat parser and string utilities
- Add new chat.h/chat.cpp and chat-parser.h/chat-parser.cpp for better chat handling
- Improve function calls parsing with fallback to llama.cpp builder pattern
- Add string utility functions (starts_with, ends_with, find_partial_stop)
- Update README with function calls testing instructions
- Enhance Kimi K2 parser and function calls documentation
- Add comprehensive test suite for function calls
- Update CMakeLists.txt and Makefile for new components
* Enhance function calling with unified streaming and parser improvements
- Fix streaming content cleanup to prevent function syntax in output
- Unify content extraction patterns with llama.cpp approach
- Improve Kimi K2 parser robustness and partial content handling
- Add comprehensive test coverage for function call scenarios
- Optimize chat message parsing and diff computation
* Replace hardcoded values in kimi_k2_parser.hpp with named constants
- Add compile-time constants for all token format markers
- Add compile-time constants for XML format markers
- Add compile-time constants for simple format patterns
- Replace all hardcoded string literals with named constants
- Use compile-time length calculation to avoid manual counting
- Improve maintainability and reduce magic numbers throughout parser
* Fix duplicate common_chat_parse definition
- Remove duplicate implementation from chat-parser.cpp
- Keep single implementation in chat.cpp following llama.cpp patterns
- Resolves linker error: multiple definition of common_chat_parse
* Fix JSON assertion failure in function call parsing
- Add proper validation that 'function' field is an object before accessing nested keys
- Handle missing 'arguments' field gracefully with default "{}"
- Prevents crash when parsing malformed tool call JSON structures
* Add comprehensive Qwen3 XML tool calling support with unit tests
- Implement Qwen3 XML parser with <tool_call>{"name": "func", "arguments": {...}}</tool_call> format
- Add model detection and routing for Qwen3 vs Kimi-K2 formats
- Create 8 comprehensive unit tests covering parsing, streaming, error handling
- Fix token format cleaning bug in kimi_k2_parser.hpp processing order
- Remove progressive parsing code and related utilities
- Add tool injection support for Qwen3 format in server utils
* Add DeepSeek R1 function calling support with comprehensive unit tests
- Implement complete DeepSeek R1 tool call parsing in common_chat_parser.cpp
- Add DeepSeek R1 model detection and tool injection in deepseek_r1_tools.hpp
- Update function_calls.hpp with DeepSeek R1 integration and content extraction
- Update documentation to reflect support for Kimi-K2, Qwen3, and DeepSeek R1 models
- Add comprehensive unit tests for DeepSeek R1 reasoning, tool calls, and integration
- Port exact implementation patterns from original llama.cpp for compatibility
Key features:
- Native DeepSeek R1 format: <|tool▁calls▁begin|>function<|tool▁sep|>name```json{}```<|tool▁call▁end|><|tool▁calls▁end|>
- Reasoning content extraction from <think>...</think> tags
- Multiple tool calls support with separate call blocks
- Model detection for deepseek-r1, deepseek_r1 naming patterns
- Integration with incremental parsing and streaming support
* Add partial parsing support for JSON and regex
- json-partial.h/cpp: JSON partial parsing functionality
- regex-partial.h/cpp: Regex partial parsing functionality
* Add format_chat integration tests for Qwen3 tool injection
- Add test_qwen3_format_chat_integration() to validate tool injection pipeline
- Test tool injection conditions and system message enhancement
- Verify JSON formatting and anti-preamble instructions
- Add comprehensive test documentation
Tests confirm tool injection works correctly - conversational preamble
issue is not in ik_llama.cpp but likely in UI configuration.
* Fix Qwen3 tool call parsing - pass model name to parser
Server was not passing model name to parse_chat_message_incremental(),
causing Qwen3 to fall back to Kimi-K2 parser and return tool calls
as content instead of proper tool_calls array.
* Fix non-streaming path to use model-specific parsing
Non-streaming responses were hardcoded to use Kimi-K2 format,
causing Qwen3 XML tool calls to be returned as content instead
of proper tool_calls array. Now uses same model detection as
streaming path for consistency.
* Update Qwen3 function call handling in server and tests
- Enhanced server function call detection and response formatting
- Improved test coverage for Qwen3 tool call scenarios
- Refined XML parsing for better tool execution support
* Add DeepSeek-R1 function call parsing support
Implements comprehensive parsing for all 4 DeepSeek-R1 function call formats:
- Format 1: Standard function call syntax (already supported)
- Format 2: Alternative function call patterns (already supported)
- Format 3: Tools array format - function\n```json\n{"tools": [...]}
- Format 4: XML wrapped format - <tool_call>function</think>Name\n```json\n{...}```</tool_call>
Key changes:
- Added parse_deepseek_r1_tools_array() following original parse_prefixed_json_tool_call_array pattern
- Added parse_deepseek_r1_xml_wrapped() following Hermes-2-Pro XML wrapper patterns
- Integrated both parsers into exception handling chain for robust fallback
- Added comprehensive TDD test coverage for all formats
- Anonymized all confidential information while preserving functionality
Resolves tool_calls_count=0 issue where DeepSeek-R1 models generated valid tool calls
but server failed to parse them correctly.
* Update function_calls.md documentation for DeepSeek-R1 Format 4
- Added Format 4 (XML wrapped) documentation with examples
- Updated implementation notes with correct parser order (3→4→1→2)
- Marked all DeepSeek-R1 formats as working (July 2025 update)
- Updated test status for Format 3 and 4 as passing
- Added parse_deepseek_r1_xml_wrapped() function reference
- Corrected implementation file line numbers
* Fix merge conflict in test-function-calls.cpp
- Removed incomplete merge conflict marker from line 3027
- Ensured all tests compile and pass successfully
- All DeepSeek-R1 formats (1-4) working correctly
- All streaming and content cleaning tests passing
This commit is contained in:
committed by
GitHub
parent
d65d5fe29e
commit
f4051d9c3e
@@ -208,90 +208,11 @@ void common_chat_msg_parser::parse_generic_format() {
|
||||
}
|
||||
|
||||
void common_chat_msg_parser::parse_deepseek_r1_format() {
|
||||
// DeepSeek R1 format supports <think> tags for reasoning content
|
||||
try_parse_reasoning("<think>", "</think>");
|
||||
|
||||
if (!syntax_.enable_tool_calls) {
|
||||
add_content(consume_rest());
|
||||
return;
|
||||
}
|
||||
|
||||
// DeepSeek R1 tool call patterns from original llama.cpp
|
||||
static const common_regex tool_calls_begin("(?:<|tool▁calls▁begin|>|<|tool_calls_begin|>|<|tool calls begin|>|<|tool\\\\_calls\\\\_begin|>|<|tool▁calls|>)");
|
||||
static const common_regex tool_calls_end("<|tool▁calls▁end|>");
|
||||
static const common_regex function_regex("(?:<|tool▁call▁begin|>)?function<|tool▁sep|>([^\n]+)\n```json\n");
|
||||
static const common_regex close_regex("```[\\s\\r\\n]*<|tool▁call▁end|>");
|
||||
|
||||
parse_deepseek_r1_tool_calls(tool_calls_begin, function_regex, close_regex, tool_calls_end);
|
||||
// Delegate to the main chat.cpp function which has the corrected implementation
|
||||
// This follows the original llama.cpp pattern where chat-parser delegates to chat.cpp
|
||||
common_chat_parse_deepseek_r1(*this);
|
||||
}
|
||||
|
||||
void common_chat_msg_parser::parse_deepseek_r1_tool_calls(
|
||||
const common_regex & tool_calls_begin,
|
||||
const common_regex & function_regex,
|
||||
const common_regex & close_regex,
|
||||
const common_regex & tool_calls_end) {
|
||||
|
||||
// Helper function to wrap code as JSON arguments (ported from original llama.cpp)
|
||||
auto wrap_code_as_arguments = [this](const std::string & code) -> std::string {
|
||||
std::string arguments;
|
||||
if (is_partial_) {
|
||||
arguments = (json {{"code", code + healing_marker_}}).dump();
|
||||
auto idx = arguments.find(healing_marker_);
|
||||
if (idx != std::string::npos) {
|
||||
arguments.resize(idx);
|
||||
}
|
||||
} else {
|
||||
arguments = (json {{"code", code}}).dump();
|
||||
}
|
||||
return arguments;
|
||||
};
|
||||
|
||||
auto parse_tool_calls = [&]() {
|
||||
size_t from = std::string::npos;
|
||||
while (true) {
|
||||
auto res = try_find_regex(function_regex, from);
|
||||
if (res) {
|
||||
// Extract function name from regex group 1
|
||||
std::string name = str(res->groups[1]);
|
||||
from = std::string::npos;
|
||||
|
||||
if (name.empty()) {
|
||||
from = res->groups[0].begin + 1;
|
||||
continue;
|
||||
}
|
||||
|
||||
auto maybe_raw_python = name == "python";
|
||||
if (input_[pos_] == '{' || !maybe_raw_python) {
|
||||
if (auto arguments = try_consume_json_with_dumped_args({{}})) {
|
||||
if (!add_tool_call(name, "", arguments->value) || arguments->is_partial) {
|
||||
throw common_chat_msg_partial_exception("incomplete tool call");
|
||||
}
|
||||
try_consume_regex(close_regex);
|
||||
}
|
||||
continue;
|
||||
}
|
||||
if (maybe_raw_python) {
|
||||
auto arguments = wrap_code_as_arguments(consume_rest());
|
||||
if (!add_tool_call(name, "", arguments)) {
|
||||
throw common_chat_msg_partial_exception("incomplete tool call");
|
||||
}
|
||||
return;
|
||||
}
|
||||
throw common_chat_msg_partial_exception("incomplete tool call");
|
||||
}
|
||||
break;
|
||||
}
|
||||
try_consume_regex(tool_calls_end);
|
||||
consume_spaces();
|
||||
add_content(consume_rest());
|
||||
};
|
||||
|
||||
if (auto res = try_find_regex(tool_calls_begin)) {
|
||||
parse_tool_calls();
|
||||
} else {
|
||||
add_content(consume_rest());
|
||||
}
|
||||
}
|
||||
|
||||
void common_chat_msg_parser::finish() {
|
||||
// Any final processing can go here
|
||||
|
||||
@@ -113,13 +113,6 @@ private:
|
||||
void parse_deepseek_r1_format();
|
||||
void parse_generic_format();
|
||||
|
||||
// DeepSeek R1 specific tool call parsing
|
||||
void parse_deepseek_r1_tool_calls(
|
||||
const common_regex & tool_calls_begin,
|
||||
const common_regex & function_regex,
|
||||
const common_regex & close_regex,
|
||||
const common_regex & tool_calls_end);
|
||||
|
||||
|
||||
// JSON parsing utilities (enhanced streaming support)
|
||||
struct json_parse_result {
|
||||
|
||||
260
common/chat.cpp
260
common/chat.cpp
@@ -104,7 +104,103 @@ static void common_chat_parse_generic(common_chat_msg_parser & builder) {
|
||||
}
|
||||
}
|
||||
|
||||
static void common_chat_parse_deepseek_r1(common_chat_msg_parser & builder) {
|
||||
// Helper function from original llama.cpp
|
||||
static std::string wrap_code_as_arguments(common_chat_msg_parser & builder, const std::string & code) {
|
||||
std::string arguments;
|
||||
if (builder.is_partial()) {
|
||||
arguments = (json {{"code", code + builder.healing_marker()}}).dump();
|
||||
auto idx = arguments.find(builder.healing_marker());
|
||||
if (idx != std::string::npos) {
|
||||
arguments.resize(idx);
|
||||
}
|
||||
} else {
|
||||
arguments = (json {{"code", code}}).dump();
|
||||
}
|
||||
return arguments;
|
||||
}
|
||||
|
||||
// Forward declaration
|
||||
static void parse_deepseek_r1_tools_array(common_chat_msg_parser & builder);
|
||||
static void parse_deepseek_r1_xml_wrapped(common_chat_msg_parser & builder);
|
||||
|
||||
// Helper function from original llama.cpp for parsing JSON tool calls
|
||||
static void parse_json_tool_calls(
|
||||
common_chat_msg_parser & builder,
|
||||
const std::optional<common_regex> & block_open,
|
||||
const std::optional<common_regex> & function_regex_start_only,
|
||||
const std::optional<common_regex> & function_regex,
|
||||
const common_regex & close_regex,
|
||||
const std::optional<common_regex> & block_close,
|
||||
bool allow_raw_python = false,
|
||||
const std::function<std::string(const common_chat_msg_parser::find_regex_result & fres)> & get_function_name = nullptr) {
|
||||
|
||||
auto parse_tool_calls = [&]() {
|
||||
size_t from = std::string::npos;
|
||||
auto first = true;
|
||||
while (true) {
|
||||
auto res = function_regex_start_only && first
|
||||
? builder.try_consume_regex(*function_regex_start_only)
|
||||
: function_regex
|
||||
? builder.try_find_regex(*function_regex, from)
|
||||
: std::nullopt;
|
||||
if (res) {
|
||||
std::string name;
|
||||
if (get_function_name) {
|
||||
name = get_function_name(*res);
|
||||
} else {
|
||||
if (res->groups.size() < 2) {
|
||||
from = res->groups[0].begin + 1;
|
||||
continue;
|
||||
}
|
||||
name = builder.str(res->groups[1]);
|
||||
}
|
||||
first = false;
|
||||
if (name.empty()) {
|
||||
// get_function_name signalled us that we should skip this match and treat it as content.
|
||||
from = res->groups[0].begin + 1;
|
||||
continue;
|
||||
}
|
||||
from = std::string::npos;
|
||||
|
||||
auto maybe_raw_python = name == "python" && allow_raw_python;
|
||||
if (builder.input()[builder.pos()] == '{' || !maybe_raw_python) {
|
||||
if (auto arguments = builder.try_consume_json_with_dumped_args({{}})) {
|
||||
if (!builder.add_tool_call(name, "", arguments->value) || arguments->is_partial) {
|
||||
throw common_chat_msg_partial_exception("incomplete tool call");
|
||||
}
|
||||
builder.try_consume_regex(close_regex);
|
||||
}
|
||||
continue;
|
||||
}
|
||||
if (maybe_raw_python) {
|
||||
auto arguments = wrap_code_as_arguments(builder, builder.consume_rest());
|
||||
if (!builder.add_tool_call(name, "", arguments)) {
|
||||
throw common_chat_msg_partial_exception("incomplete tool call");
|
||||
}
|
||||
return;
|
||||
}
|
||||
throw common_chat_msg_partial_exception("incomplete tool call");
|
||||
}
|
||||
break;
|
||||
}
|
||||
if (block_close) {
|
||||
builder.try_consume_regex(*block_close);
|
||||
}
|
||||
builder.consume_spaces();
|
||||
builder.add_content(builder.consume_rest());
|
||||
};
|
||||
if (block_open) {
|
||||
if (auto res = builder.try_find_regex(*block_open)) {
|
||||
parse_tool_calls();
|
||||
} else {
|
||||
builder.add_content(builder.consume_rest());
|
||||
}
|
||||
} else {
|
||||
parse_tool_calls();
|
||||
}
|
||||
}
|
||||
|
||||
void common_chat_parse_deepseek_r1(common_chat_msg_parser & builder) {
|
||||
builder.try_parse_reasoning("<think>", "</think>");
|
||||
if (!builder.syntax().enable_tool_calls) {
|
||||
builder.add_content(builder.consume_rest());
|
||||
@@ -113,25 +209,159 @@ static void common_chat_parse_deepseek_r1(common_chat_msg_parser & builder) {
|
||||
|
||||
static const common_regex tool_calls_begin("(?:<|tool▁calls▁begin|>|<|tool_calls_begin|>|<|tool calls begin|>|<|tool\\\\_calls\\\\_begin|>|<|tool▁calls|>)");
|
||||
static const common_regex tool_calls_end("<|tool▁calls▁end|>");
|
||||
// Primary regex for correct format with separator
|
||||
static const common_regex function_regex("(?:<|tool▁call▁begin|>)?function<|tool▁sep|>([^\n]+)\n```json\n");
|
||||
// Fallback regex for format without separator (some models generate this)
|
||||
static const common_regex function_regex_no_sep("(?:<|tool▁call▁begin|>)?function<([^>]+)>\n```json\n");
|
||||
// Third regex for new format: just "function" with no markers
|
||||
static const common_regex function_regex_simple("function\n```json\n");
|
||||
static const common_regex close_regex("```[\\s\\r\\n]*<|tool▁call▁end|>");
|
||||
static const common_regex close_regex_simple("```"); // For simple format without end markers
|
||||
|
||||
// Simplified tool calls parsing for DEEPSEEK_R1
|
||||
if (auto res = builder.try_find_regex(tool_calls_begin)) {
|
||||
while (auto func_res = builder.try_find_regex(function_regex)) {
|
||||
auto function_name = builder.str(func_res->groups[1]);
|
||||
auto args_json = builder.try_consume_json();
|
||||
if (args_json) {
|
||||
builder.add_tool_call(function_name, "", args_json->json.dump());
|
||||
builder.try_consume_regex(close_regex);
|
||||
} else {
|
||||
throw common_chat_msg_partial_exception("incomplete tool call JSON");
|
||||
}
|
||||
// Check for the new tools array format first (no DeepSeek markers)
|
||||
auto original_pos = builder.pos();
|
||||
|
||||
// First, try the tools array format for content like "function\n```json\n{"tools": [...]}"
|
||||
if (builder.try_find_regex(function_regex_simple)) {
|
||||
builder.move_to(original_pos);
|
||||
try {
|
||||
parse_deepseek_r1_tools_array(builder);
|
||||
return; // Success, we're done
|
||||
} catch (const common_chat_msg_partial_exception&) {
|
||||
// Fall through to try standard DeepSeek patterns
|
||||
}
|
||||
builder.try_consume_regex(tool_calls_end);
|
||||
builder.add_content(builder.consume_rest());
|
||||
}
|
||||
|
||||
// If tools array format didn't work, try XML-wrapped format
|
||||
builder.move_to(original_pos);
|
||||
try {
|
||||
parse_deepseek_r1_xml_wrapped(builder);
|
||||
return; // Success, we're done
|
||||
} catch (const common_chat_msg_partial_exception&) {
|
||||
// Fall through to try standard DeepSeek patterns
|
||||
}
|
||||
|
||||
// If XML wrapper format didn't work, try standard DeepSeek patterns
|
||||
builder.move_to(original_pos);
|
||||
try {
|
||||
parse_json_tool_calls(
|
||||
builder,
|
||||
/* block_open= */ tool_calls_begin,
|
||||
/* function_regex_start_only= */ std::nullopt,
|
||||
function_regex,
|
||||
close_regex,
|
||||
tool_calls_end);
|
||||
} catch (const common_chat_msg_partial_exception&) {
|
||||
// If primary regex fails and we're not in partial mode, try fallback regex
|
||||
if (!builder.is_partial()) {
|
||||
builder.move_to(original_pos);
|
||||
try {
|
||||
parse_json_tool_calls(
|
||||
builder,
|
||||
/* block_open= */ tool_calls_begin,
|
||||
/* function_regex_start_only= */ std::nullopt,
|
||||
function_regex_no_sep,
|
||||
close_regex,
|
||||
tool_calls_end);
|
||||
} catch (const common_chat_msg_partial_exception&) {
|
||||
// Try the simple format without markers as final fallback
|
||||
builder.move_to(original_pos);
|
||||
parse_json_tool_calls(
|
||||
builder,
|
||||
/* block_open= */ std::nullopt,
|
||||
/* function_regex_start_only= */ std::nullopt,
|
||||
function_regex_simple,
|
||||
close_regex_simple,
|
||||
std::nullopt);
|
||||
}
|
||||
} else {
|
||||
throw; // Re-throw for partial mode
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Parse DeepSeek R1 tools array format following original llama.cpp parse_prefixed_json_tool_call_array pattern
|
||||
static void parse_deepseek_r1_tools_array(common_chat_msg_parser & builder) {
|
||||
static const common_regex prefix("function\n```json\n");
|
||||
|
||||
|
||||
if (auto res = builder.try_find_regex(prefix)) {
|
||||
// Parse JSON and manually process tools array to convert arguments to strings
|
||||
auto json_result = builder.try_consume_json();
|
||||
if (!json_result) {
|
||||
throw common_chat_msg_partial_exception("invalid JSON");
|
||||
}
|
||||
|
||||
|
||||
// DeepSeek R1 format has "tools" array, manually process each tool
|
||||
if (json_result->json.contains("tools") && json_result->json.at("tools").is_array()) {
|
||||
|
||||
// Manually create tool calls array with string arguments (following original pattern)
|
||||
json tools_with_dumped_args = json::array();
|
||||
for (const auto& tool : json_result->json.at("tools")) {
|
||||
if (tool.contains("name") && tool.contains("arguments")) {
|
||||
json formatted_tool;
|
||||
formatted_tool["name"] = tool.at("name");
|
||||
// Convert arguments object to string (this is what consume_json_with_dumped_args does)
|
||||
formatted_tool["arguments"] = tool.at("arguments").dump();
|
||||
tools_with_dumped_args.push_back(formatted_tool);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
if (!builder.add_tool_calls(tools_with_dumped_args) || !json_result->healing_marker.marker.empty()) {
|
||||
throw common_chat_msg_partial_exception("incomplete tool call array");
|
||||
}
|
||||
} else {
|
||||
throw common_chat_msg_partial_exception("tools key not found or not array");
|
||||
}
|
||||
|
||||
// Consume closing ```
|
||||
builder.try_consume_regex(common_regex("```"));
|
||||
} else {
|
||||
builder.add_content(builder.consume_rest());
|
||||
throw common_chat_msg_partial_exception("function prefix not found");
|
||||
}
|
||||
}
|
||||
|
||||
// Parse DeepSeek R1 XML-wrapped format following original Hermes-2-Pro pattern
|
||||
static void parse_deepseek_r1_xml_wrapped(common_chat_msg_parser & builder) {
|
||||
|
||||
// Pattern for: <tool_call>\nfunction</think>FunctionName\n```json\n{...}\n```\n</tool_call>
|
||||
static const common_regex xml_pattern(
|
||||
"<tool_call>\\s*" // Opening XML tag
|
||||
"function</think>([^\\n]+)" // Function name after "function</think>"
|
||||
"\\s*```json\\s*" // JSON block start
|
||||
);
|
||||
|
||||
if (auto res = builder.try_find_regex(xml_pattern)) {
|
||||
|
||||
// Extract function name from capture group
|
||||
std::string function_name = builder.str(res->groups[1]);
|
||||
|
||||
// Parse JSON arguments
|
||||
auto json_result = builder.try_consume_json();
|
||||
if (!json_result) {
|
||||
throw common_chat_msg_partial_exception("invalid JSON in XML wrapper");
|
||||
}
|
||||
|
||||
|
||||
// Create single tool call following original pattern
|
||||
json tool_call;
|
||||
tool_call["name"] = function_name;
|
||||
tool_call["arguments"] = json_result->json.dump(); // Convert to string
|
||||
|
||||
json tool_calls_array = json::array();
|
||||
tool_calls_array.push_back(tool_call);
|
||||
|
||||
|
||||
if (!builder.add_tool_calls(tool_calls_array) || !json_result->healing_marker.marker.empty()) {
|
||||
throw common_chat_msg_partial_exception("incomplete XML wrapped tool call");
|
||||
}
|
||||
|
||||
// Consume closing ```\n</tool_call>
|
||||
builder.try_consume_regex(common_regex("```\\s*</tool_call>"));
|
||||
} else {
|
||||
throw common_chat_msg_partial_exception("XML wrapper pattern not found");
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -162,3 +162,6 @@ common_chat_msg common_chat_parse(const std::string & input, bool is_partial, co
|
||||
// Forward declare parser class
|
||||
class common_chat_msg_parser;
|
||||
|
||||
// Format-specific parsing functions (accessible from chat-parser)
|
||||
void common_chat_parse_deepseek_r1(common_chat_msg_parser & builder);
|
||||
|
||||
|
||||
@@ -176,6 +176,8 @@ static ik_chat_msg parse_chat_message_incremental(const std::string& content, bo
|
||||
// Use model-specific content extraction
|
||||
if (is_qwen3_model(model_name)) {
|
||||
msg.content = qwen3::extract_content_during_parsing(content, is_partial);
|
||||
} else if (is_deepseek_r1_model(model_name)) {
|
||||
msg.content = extract_content_from_mixed_input(content, is_partial, model_name);
|
||||
} else {
|
||||
msg.content = kimi_k2::extract_content_during_parsing(content, is_partial);
|
||||
}
|
||||
@@ -183,6 +185,8 @@ static ik_chat_msg parse_chat_message_incremental(const std::string& content, bo
|
||||
// No tool calls found, extract content
|
||||
if (is_qwen3_model(model_name)) {
|
||||
msg.content = qwen3::extract_content_during_parsing(content, is_partial);
|
||||
} else if (is_deepseek_r1_model(model_name)) {
|
||||
msg.content = extract_content_from_mixed_input(content, is_partial, model_name);
|
||||
} else {
|
||||
msg.content = kimi_k2::extract_content_during_parsing(content, is_partial);
|
||||
}
|
||||
|
||||
@@ -77,9 +77,12 @@ functions.get_weather:0<|tool_call_argument_begin|>
|
||||
|
||||
### DeepSeek R1 Native Format
|
||||
|
||||
**Detection Pattern:** `<|tool▁calls▁begin|>...<|tool▁calls▁end|>`
|
||||
**Detection Pattern:** Multiple formats supported with automatic fallback
|
||||
|
||||
**Structure:**
|
||||
**⚠️ Critical Implementation Note:** DeepSeek R1 models generate different formats depending on context. The parser handles all variants automatically.
|
||||
|
||||
#### Format 1: Full Native Format (Primary)
|
||||
**Pattern:** `<|tool▁calls▁begin|>...<|tool▁calls▁end|>`
|
||||
```
|
||||
<|tool▁calls▁begin|>
|
||||
<|tool▁call▁begin|>
|
||||
@@ -91,7 +94,61 @@ function<|tool▁sep|>{function_name}
|
||||
<|tool▁calls▁end|>
|
||||
```
|
||||
|
||||
**Example:**
|
||||
#### Format 2: Simplified Format (Fallback)
|
||||
**Pattern:** `function<{function_name}>`
|
||||
```
|
||||
function<get_weather>
|
||||
```json
|
||||
{"location": "Tokyo"}
|
||||
```
|
||||
```
|
||||
|
||||
#### Format 3: Tools Array Format (New - July 2025)
|
||||
**Pattern:** `function\n```json\n{"tools": [...]}`
|
||||
```
|
||||
function
|
||||
```json
|
||||
{
|
||||
"tools": [
|
||||
{
|
||||
"name": "get_weather",
|
||||
"arguments": {
|
||||
"location": "Tokyo"
|
||||
}
|
||||
},
|
||||
{
|
||||
"name": "Read",
|
||||
"arguments": {
|
||||
"file_path": "/path/to/file.java"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
```
|
||||
|
||||
#### Format 4: XML Wrapped Format (New - July 2025)
|
||||
**Pattern:** `<tool_call>function</think>{function_name}\n```json\n{...}\n```</tool_call>`
|
||||
```
|
||||
<tool_call>
|
||||
function</think>Read
|
||||
```json
|
||||
{
|
||||
"file_path": "/path/to/example.txt"
|
||||
}
|
||||
```
|
||||
</tool_call>
|
||||
```
|
||||
|
||||
**Notes:**
|
||||
- XML wrapper contains function name after `function</think>`
|
||||
- Single function call per XML block
|
||||
- JSON arguments within ```json``` code blocks
|
||||
- Handles reasoning text before function name
|
||||
|
||||
**Examples:**
|
||||
|
||||
Format 1 (Full):
|
||||
```
|
||||
<|tool▁calls▁begin|>
|
||||
<|tool▁call▁begin|>
|
||||
@@ -103,11 +160,57 @@ function<|tool▁sep|>get_weather
|
||||
<|tool▁calls▁end|>
|
||||
```
|
||||
|
||||
**Notes:**
|
||||
- Native DeepSeek R1 format ported from original llama.cpp
|
||||
- Supports reasoning with `<think>...</think>` tags (automatically extracted)
|
||||
- Multiple function calls supported with separate call blocks
|
||||
- JSON arguments are contained within markdown code blocks
|
||||
Format 2 (Simplified):
|
||||
```
|
||||
function<Read>
|
||||
```json
|
||||
{"file_path": "/path/to/file.txt"}
|
||||
```
|
||||
```
|
||||
|
||||
Format 3 (Tools Array):
|
||||
```
|
||||
function
|
||||
```json
|
||||
{
|
||||
"tools": [
|
||||
{
|
||||
"name": "Read",
|
||||
"arguments": {
|
||||
"file_path": "/path/to/example/SystemProcessor.java"
|
||||
}
|
||||
},
|
||||
{
|
||||
"name": "Edit",
|
||||
"arguments": {
|
||||
"file_path": "/path/to/file.java",
|
||||
"old_string": "old code",
|
||||
"new_string": "new code"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
```
|
||||
|
||||
Format 4 (XML Wrapped):
|
||||
```
|
||||
<tool_call>
|
||||
function</think>CompleteTask
|
||||
```json
|
||||
{
|
||||
"status": "completed"
|
||||
}
|
||||
```
|
||||
</tool_call>
|
||||
```
|
||||
|
||||
**Implementation Notes:**
|
||||
- **Reasoning Support**: All formats support `<think>...</think>` reasoning tags (automatically extracted)
|
||||
- **Multiple Tool Calls**: Format 1 & 2 use separate blocks, Format 3 uses array structure, Format 4 uses single XML block
|
||||
- **Automatic Detection**: Parser tries formats in order: Format 3 → Format 4 → Format 1 → Format 2
|
||||
- **Original llama.cpp Base**: Implementation follows original llama.cpp patterns exactly
|
||||
- **Status**: All formats ✅ Working (July 2025 update)
|
||||
|
||||
## OpenAI-Compatible Output
|
||||
|
||||
@@ -196,14 +299,87 @@ To enable function calling, include the `tools` parameter in your request:
|
||||
|
||||
## Testing
|
||||
|
||||
Test files are provided to verify function calling:
|
||||
- `test-function-calls.cpp` - Unit tests for the native Kimi-K2 format
|
||||
- Tests native token format parsing
|
||||
- Tests multiple function calls
|
||||
- Tests error handling and malformed input
|
||||
Comprehensive test suite for all supported formats:
|
||||
|
||||
### Unit Tests
|
||||
- **File**: `tests/test-function-calls.cpp`
|
||||
- **Coverage**: All supported model formats (Kimi-K2, Qwen3, DeepSeek R1)
|
||||
- **Test Types**:
|
||||
- Native format parsing for each model type
|
||||
- Multiple function calls
|
||||
- Error handling and malformed input
|
||||
- Streaming and non-streaming responses
|
||||
- Content extraction and cleaning
|
||||
- OpenAI-compatible output generation
|
||||
|
||||
### DeepSeek R1 Specific Tests
|
||||
- **Format 1 Tests**: Full native format with separators ✅
|
||||
- **Format 2 Tests**: Simplified format without separators ✅
|
||||
- **Format 3 Tests**: Tools array format ✅ (Fixed July 2025)
|
||||
- **Format 4 Tests**: XML wrapped format ✅ (Added July 2025)
|
||||
- **Integration Tests**: Server-to-parser call chain verification
|
||||
- **Regression Tests**: Ensure existing formats continue working
|
||||
|
||||
### Running Tests
|
||||
```bash
|
||||
# Build tests
|
||||
cd build && make test-function-calls -j$(nproc)
|
||||
|
||||
# Run all function call tests
|
||||
./bin/test-function-calls
|
||||
|
||||
# Run DeepSeek R1 specific tests
|
||||
./bin/test-function-calls | grep -E "(DeepSeek|tool_calls_count)"
|
||||
|
||||
# Check Format 3 specific issues
|
||||
./bin/test-function-calls | grep -A5 -B5 "Real failing format"
|
||||
```
|
||||
|
||||
### Test Status
|
||||
- **Kimi-K2**: ✅ All tests passing
|
||||
- **Qwen3 XML**: ✅ All tests passing
|
||||
- **DeepSeek R1 Format 1 & 2**: ✅ All tests passing
|
||||
- **DeepSeek R1 Format 3**: ✅ All tests passing (Fixed July 2025)
|
||||
- **DeepSeek R1 Format 4**: ✅ All tests passing (Added July 2025)
|
||||
|
||||
## File Structure
|
||||
|
||||
- `function_calls.hpp` - Parser implementation for native Kimi-K2 format
|
||||
- `utils.hpp` - Integration with server (includes function_calls.hpp)
|
||||
- `server.cpp` - Response formatting and content filtering
|
||||
### Server Integration
|
||||
- **`examples/server/server.cpp`** - Main server entry point, calls `parse_chat_message_incremental()`
|
||||
- **`examples/server/function_calls.hpp`** - Server-side parser creation and integration
|
||||
- **`examples/server/utils.hpp`** - Server utilities (includes function_calls.hpp)
|
||||
|
||||
### Core Parsing Engine
|
||||
- **`common/chat-parser.cpp`** - Main parser routing, delegates to model-specific parsers
|
||||
- **`common/chat-parser.h`** - Parser interface and JSON parsing infrastructure
|
||||
- **`common/chat.cpp`** - Model-specific parsing implementations:
|
||||
- `common_chat_parse_kimi_k2()` - Kimi-K2 native format
|
||||
- `common_chat_parse_qwen3()` - Qwen3 XML format
|
||||
- `common_chat_parse_deepseek_r1()` - DeepSeek R1 multiple formats
|
||||
- `parse_deepseek_r1_tools_array()` - Format 3 tools array parser
|
||||
- `parse_deepseek_r1_xml_wrapped()` - Format 4 XML wrapper parser
|
||||
- **`common/chat.h`** - Function declarations and model detection
|
||||
|
||||
### Testing
|
||||
- **`tests/test-function-calls.cpp`** - Comprehensive unit tests for all formats
|
||||
- **`tests/get-model.cpp`** - Test utilities for model loading
|
||||
|
||||
### Integration Flow
|
||||
```
|
||||
server.cpp:2832
|
||||
↓ parse_chat_message_incremental(generated_text, false, modelname)
|
||||
function_calls.hpp:94-95
|
||||
↓ common_chat_msg_parser.parse()
|
||||
chat-parser.cpp:140
|
||||
↓ model detection → specific parser
|
||||
chat.cpp
|
||||
↓ common_chat_parse_deepseek_r1() / kimi_k2() / qwen3()
|
||||
↓ Format detection → regex matching → JSON parsing → tool_calls array
|
||||
```
|
||||
|
||||
### Key Implementation Files
|
||||
- **DeepSeek R1 Format 3**: `common/chat.cpp:291-332` (`parse_deepseek_r1_tools_array`)
|
||||
- **DeepSeek R1 Format 4**: `common/chat.cpp:335-374` (`parse_deepseek_r1_xml_wrapped`)
|
||||
- **Exception handling**: `common/chat.cpp:222-289` (Format 3 → 4 → 1 → 2 fallback chain)
|
||||
- **Model detection**: `common/chat.cpp` (`is_deepseek_r1_model`, `is_qwen3_model`, etc.)
|
||||
- **Comprehensive tests**: `tests/test-function-calls.cpp` (All formats with TDD coverage)
|
||||
@@ -145,7 +145,7 @@ const std::string content_cleaning_mixed_formats = R"(First: <|tool_calls_sectio
|
||||
|
||||
// TDD: Reproduction of exact contamination issue from server logs
|
||||
// From manual_logs/kimi-k2/ls/test_case_ls_logs_claude-code-ui.log:5
|
||||
const std::string contamination_ls_issue = R"(I'll help you examine the workspace. Let me list the current directory contents.functions.LS:1{"path": "/Users/seven/Documents/projects/ai/sequential_thinking"})";
|
||||
const std::string contamination_ls_issue = R"(I'll help you examine the workspace. Let me list the current directory contents.functions.LS:1{"path": "/tmp/example_workspace"})";
|
||||
const std::string expected_clean_ls = R"(I'll help you examine the workspace. Let me list the current directory contents.)";
|
||||
|
||||
// DeepSeek R1 test data
|
||||
@@ -196,6 +196,29 @@ Done.)";
|
||||
|
||||
const std::string deepseek_r1_reasoning_only = R"(<think>Just thinking, no tools needed.</think>Here's my direct response.)";
|
||||
|
||||
// DeepSeek R1 format without separator (actual format sometimes generated by models)
|
||||
const std::string deepseek_r1_no_separator = R"(I'll help you add the new cleaning step for resetting device orientation. Let me break this down into tasks:
|
||||
|
||||
<|tool▁calls▁begin|>
|
||||
<|tool▁call▁begin|>
|
||||
function<TodoWrite>
|
||||
```json
|
||||
{
|
||||
"items": [
|
||||
{
|
||||
"description": "Create ResetOrientation cleaning step class",
|
||||
"status": "pending"
|
||||
},
|
||||
{
|
||||
"description": "Implement Android orientation reset using provided ADB command",
|
||||
"status": "pending"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
<|tool▁call▁end|>
|
||||
<|tool▁calls▁end|>)";
|
||||
|
||||
// Advanced partial detection test cases based on original llama.cpp patterns
|
||||
// TDD: Advanced partial detection - streaming edge cases
|
||||
const std::string partial_incomplete_function_name = R"(Let me help you with that. func)";
|
||||
@@ -673,7 +696,7 @@ void test_contamination_reproduction() {
|
||||
test_assert(msg.tool_calls.size() == 1, "TDD Contamination: Tool call should be extracted");
|
||||
test_assert(msg.tool_calls[0].name == "LS", "TDD Contamination: Correct function name extracted");
|
||||
|
||||
std::string expected_args = R"({"path": "/Users/seven/Documents/projects/ai/sequential_thinking"})";
|
||||
std::string expected_args = R"({"path": "/tmp/example_workspace"})";
|
||||
test_assert(msg.tool_calls[0].arguments == expected_args, "TDD Contamination: Correct arguments extracted");
|
||||
|
||||
// 🚨 THE CRITICAL TEST: Content should be cleaned of function call syntax
|
||||
@@ -1849,7 +1872,7 @@ void test_regression_contamination_issue() {
|
||||
std::cout << " - slot_current_msg_content is clean" << std::endl;
|
||||
|
||||
// Step 1: Simulate the exact content from logs
|
||||
std::string raw_generated_text = "Let me list the updated contents:functions.LS:3{\"path\": \"/Users/seven/Documents/projects/ai/sequential_thinking\"}";
|
||||
std::string raw_generated_text = "Let me list the updated contents:functions.LS:3{\"path\": \"/tmp/example_workspace\"}";
|
||||
|
||||
std::cout << "\n🔍 Test Setup:" << std::endl;
|
||||
std::cout << " Raw generated text: " << raw_generated_text.substr(0, 80) << "..." << std::endl;
|
||||
@@ -1883,7 +1906,7 @@ void test_regression_contamination_issue() {
|
||||
previous_server_state.tool_calls.resize(1);
|
||||
previous_server_state.tool_calls[0].name = "LS";
|
||||
previous_server_state.tool_calls[0].id = "functions.LS:3";
|
||||
previous_server_state.tool_calls[0].arguments = "{\"path\": \"/Users/seven/Documents/projects/ai/sequential_thinking\"}";
|
||||
previous_server_state.tool_calls[0].arguments = "{\"path\": \"/tmp/example_workspace\"}";
|
||||
|
||||
// Current parsing result should be the same (no change)
|
||||
ik_chat_msg current_server_state = complete_result;
|
||||
@@ -2180,7 +2203,7 @@ void test_xml_tool_call_parsing() {
|
||||
std::cout << "\n=== XML Tool Call Parsing Test ===" << std::endl;
|
||||
|
||||
// Test XML format like what Kimi-K2 is actually generating
|
||||
std::string xml_content = "I'll create debug_test.2txt with the current timestamp:\n\n<tool_call>\n<invoke name=\"Write\">\n<parameter name=\"file_path\">/Users/seven/Documents/projects/ai/sequential_thinking/debug_test.2txt</parameter>\n<parameter name=\"content\">2025-07-20 08:30:45 UTC</parameter>\n</invoke>\n</tool_call>";
|
||||
std::string xml_content = "I'll create a test file with the current timestamp:\n\n<tool_call>\n<invoke name=\"Write\">\n<parameter name=\"file_path\">/tmp/test_output.txt</parameter>\n<parameter name=\"content\">2025-07-20 08:30:45 UTC</parameter>\n</invoke>\n</tool_call>";
|
||||
|
||||
std::cout << "🔍 Testing XML tool call parsing" << std::endl;
|
||||
std::cout << " Input: " << xml_content << std::endl;
|
||||
@@ -2970,6 +2993,15 @@ int main() {
|
||||
assert(reason_only_msg.content == "Here's my direct response.");
|
||||
std::cout << "✅ PASS: DeepSeek R1 reasoning only parsed" << std::endl;
|
||||
|
||||
// Test format without separator (actual format sometimes generated by models)
|
||||
auto no_sep_tool_msg = common_chat_parse(deepseek_r1_no_separator, false, deepseek_syntax);
|
||||
assert(no_sep_tool_msg.tool_calls.size() == 1);
|
||||
assert(no_sep_tool_msg.tool_calls[0].name == "TodoWrite");
|
||||
// The JSON should be preserved as-is
|
||||
std::string expected_json = "{\n \"items\": [\n {\n \"description\": \"Create ResetOrientation cleaning step class\",\n \"status\": \"pending\"\n },\n {\n \"description\": \"Implement Android orientation reset using provided ADB command\",\n \"status\": \"pending\"\n }\n ]\n}";
|
||||
assert(no_sep_tool_msg.tool_calls[0].arguments == expected_json);
|
||||
std::cout << "✅ PASS: DeepSeek R1 format without separator parsed" << std::endl;
|
||||
|
||||
// Test function_calls.hpp integration with DeepSeek R1
|
||||
std::cout << std::endl;
|
||||
std::cout << "🔗 Testing DeepSeek R1 Integration:" << std::endl;
|
||||
@@ -2992,6 +3024,217 @@ int main() {
|
||||
assert(extracted.find("<|tool▁calls▁begin|>") == std::string::npos);
|
||||
std::cout << "✅ PASS: DeepSeek R1 content extraction works" << std::endl;
|
||||
|
||||
// Test content contamination fix - exact user reported case
|
||||
std::cout << "\n🧹 Testing Content Contamination Fix:" << std::endl;
|
||||
std::string contaminated_content = "I'll help you add the new cleaning step for orientation management. Let me break this down into tasks:\n\n<|tool▁calls▁begin|>\n<|tool▁call▁begin|>\nfunction<|tool▁sep|>TodoWrite\n```json\n{\"items\": [{\"description\": \"Create ResetOrientation cleaning step class\", \"status\": \"pending\"}, {\"description\": \"Add setOrientationLock method to DeviceRobot\", \"status\": \"pending\"}, {\"description\": \"Integrate ResetOrientation into AndroidDeviceCleaner.clean method\", \"status\": \"pending\"}, {\"description\": \"Update iOS device cleaner to set iPad orientation to portrait instead of landscape\", \"status\": \"pending\"}]}\n```\n<|tool▁call▁end|>\n<|tool▁calls▁end|>";
|
||||
|
||||
ik_chat_msg contamination_msg = parse_chat_message_incremental(contaminated_content, false, "deepseek-r1");
|
||||
|
||||
// Tool calls should be extracted
|
||||
assert(!contamination_msg.tool_calls.empty());
|
||||
assert(contamination_msg.tool_calls[0].name == "TodoWrite");
|
||||
std::cout << "✅ PASS: Tool calls extracted from contaminated content" << std::endl;
|
||||
|
||||
// Content should be clean - no tool call markup visible to user
|
||||
assert(contamination_msg.content.find("<|tool▁calls▁begin|>") == std::string::npos);
|
||||
assert(contamination_msg.content.find("<|tool▁call▁begin|>") == std::string::npos);
|
||||
assert(contamination_msg.content.find("function<|tool▁sep|>") == std::string::npos);
|
||||
assert(contamination_msg.content.find("```json") == std::string::npos);
|
||||
assert(contamination_msg.content.find("<|tool▁call▁end|>") == std::string::npos);
|
||||
assert(contamination_msg.content.find("<|tool▁calls▁end|>") == std::string::npos);
|
||||
|
||||
// Content should contain the user-friendly message
|
||||
assert(contamination_msg.content.find("I'll help you add the new cleaning step for orientation management. Let me break this down into tasks:") != std::string::npos);
|
||||
std::cout << "✅ PASS: Content cleaned - no tool call markup visible to user" << std::endl;
|
||||
|
||||
// TDD Test: Reproduce exact failure from debug logs (tool_calls_count=0)
|
||||
std::cout << "\n🐛 TDD: DeepSeek R1 tool_calls_count=0 Bug Test (SHOULD FAIL):" << std::endl;
|
||||
std::string exact_failure_content = "Now I need to add the method to the interface. Let me do that:\n\n<|tool▁calls▁begin|>\n<|tool▁call▁begin|>\nfunction<|tool▁sep|>Edit\n```json\n{\"file_path\": \"/path/to/example/src/main/java/com/example/ServiceInterface.java\", \"old_string\": \"\\tMethod getMethod();\\n\\n\\tvoid setProperty(String value);\", \"new_string\": \"\\tMethod getMethod();\\n\\n\\tvoid setNewMethod(boolean enabled);\\n\\n\\tvoid setProperty(String value);\"}\n```\n<|tool▁call▁end|>\n<|tool▁calls▁end|>";
|
||||
|
||||
// This test simulates the exact server logic from format_partial_response_oaicompat:2832
|
||||
ik_chat_msg failure_msg = parse_chat_message_incremental(exact_failure_content, false, "DeepSeek-R1");
|
||||
|
||||
// Debug: Print what we actually got
|
||||
std::cout << " Debug: tool_calls.size() = " << failure_msg.tool_calls.size() << std::endl;
|
||||
std::cout << " Debug: content length = " << failure_msg.content.length() << std::endl;
|
||||
if (!failure_msg.tool_calls.empty()) {
|
||||
std::cout << " Debug: first tool call name = '" << failure_msg.tool_calls[0].name << "'" << std::endl;
|
||||
}
|
||||
|
||||
// The bug: This SHOULD pass but currently FAILS (tool_calls_count=0)
|
||||
bool tool_calls_detected = !failure_msg.tool_calls.empty();
|
||||
std::cout << " Expected: tool_calls_count > 0" << std::endl;
|
||||
std::cout << " Actual: tool_calls_count = " << failure_msg.tool_calls.size() << std::endl;
|
||||
|
||||
if (tool_calls_detected) {
|
||||
std::cout << "✅ UNEXPECTED PASS: Tool calls detected (bug may be fixed)" << std::endl;
|
||||
assert(failure_msg.tool_calls[0].name == "Edit");
|
||||
} else {
|
||||
std::cout << "❌ EXPECTED FAIL: tool_calls_count=0 (reproduces reported bug)" << std::endl;
|
||||
std::cout << " This confirms the parsing failure - tool calls are not being extracted" << std::endl;
|
||||
}
|
||||
|
||||
// Additional test: Check exact server scenario with model name case sensitivity
|
||||
std::cout << "\n🔍 Testing Server Scenario Reproduction:" << std::endl;
|
||||
|
||||
// Test with exact model name from debug log: "DeepSeek-R1"
|
||||
ik_chat_msg server_scenario_msg = parse_chat_message_incremental(exact_failure_content, false, "DeepSeek-R1");
|
||||
std::cout << " Model: 'DeepSeek-R1' -> tool_calls_count = " << server_scenario_msg.tool_calls.size() << std::endl;
|
||||
|
||||
// Test model detection with exact string
|
||||
bool detected_exact = is_deepseek_r1_model("DeepSeek-R1");
|
||||
std::cout << " is_deepseek_r1_model('DeepSeek-R1') = " << (detected_exact ? "true" : "false") << std::endl;
|
||||
|
||||
if (!detected_exact) {
|
||||
std::cout << "❌ FOUND BUG: Model 'DeepSeek-R1' not detected as DeepSeek R1!" << std::endl;
|
||||
std::cout << " This explains tool_calls_count=0 - wrong parser being used" << std::endl;
|
||||
} else if (server_scenario_msg.tool_calls.empty()) {
|
||||
std::cout << "❌ FOUND BUG: Model detected but parsing still fails" << std::endl;
|
||||
} else {
|
||||
std::cout << "✅ Model detection and parsing both work correctly" << std::endl;
|
||||
}
|
||||
|
||||
// TDD Test: Test exception handling scenario that could cause tool_calls_count=0
|
||||
std::cout << "\n🔍 Testing Exception Handling Scenario:" << std::endl;
|
||||
|
||||
// Test with potentially problematic content that might trigger partial exception
|
||||
std::string problematic_content = exact_failure_content;
|
||||
|
||||
try {
|
||||
// Direct test of common_chat_msg_parser to see if it throws exceptions
|
||||
common_chat_syntax syntax;
|
||||
syntax.format = COMMON_CHAT_FORMAT_DEEPSEEK_R1;
|
||||
syntax.enable_tool_calls = true;
|
||||
|
||||
common_chat_msg_parser parser(problematic_content, false, syntax); // is_partial=false like server
|
||||
parser.parse();
|
||||
auto result = parser.result();
|
||||
|
||||
std::cout << " Direct parser: tool_calls.size() = " << result.tool_calls.size() << std::endl;
|
||||
|
||||
if (result.tool_calls.empty()) {
|
||||
std::cout << "❌ FOUND BUG: Direct parser returns no tool calls!" << std::endl;
|
||||
std::cout << " This explains tool_calls_count=0 in server logs" << std::endl;
|
||||
} else {
|
||||
std::cout << "✅ Direct parser works correctly" << std::endl;
|
||||
}
|
||||
|
||||
} catch (const common_chat_msg_partial_exception& e) {
|
||||
std::cout << "❌ FOUND BUG: common_chat_msg_partial_exception thrown in non-partial mode!" << std::endl;
|
||||
std::cout << " Exception: " << e.what() << std::endl;
|
||||
std::cout << " Server code catches this and sets tool_calls_json = json::array() -> tool_calls_count=0" << std::endl;
|
||||
} catch (const std::exception& e) {
|
||||
std::cout << "❌ Other exception: " << e.what() << std::endl;
|
||||
}
|
||||
|
||||
// Test with exact content from debug logs (with escaped characters)
|
||||
std::cout << "\n🔍 Testing Exact Debug Log Content:" << std::endl;
|
||||
std::string debug_log_content = "Now I need to add the method to the interface. Let me do that:\n\n<|tool▁calls▁begin|>\n<|tool▁call▁begin|>\nfunction<|tool▁sep|>Edit\n```json\n{\"file_path\": \"/path/to/example/ServiceInterface.java\", \"old_string\": \"\\tMethod getMethod();\\n\\n\\tvoid setProperty(String value);\", \"new_string\": \"\\tMethod getMethod();\\n\\n\\tvoid setNewMethod(boolean enabled);\\n\\n\\tvoid setProperty(String value);\"}\n```\n<|tool▁call▁end|>\n<|tool▁calls▁end|>";
|
||||
|
||||
ik_chat_msg debug_msg = parse_chat_message_incremental(debug_log_content, false, "DeepSeek-R1");
|
||||
std::cout << " Debug log exact content: tool_calls_count = " << debug_msg.tool_calls.size() << std::endl;
|
||||
|
||||
if (debug_msg.tool_calls.empty()) {
|
||||
std::cout << "❌ REPRODUCED BUG: Exact debug log content fails to parse!" << std::endl;
|
||||
|
||||
// Test individual components to isolate the issue
|
||||
if (debug_log_content.find("<|tool▁calls▁begin|>") != std::string::npos) {
|
||||
std::cout << " Contains tool call markers: YES" << std::endl;
|
||||
}
|
||||
if (debug_log_content.find("function<|tool▁sep|>Edit") != std::string::npos) {
|
||||
std::cout << " Contains function call: YES" << std::endl;
|
||||
}
|
||||
if (debug_log_content.find("```json") != std::string::npos) {
|
||||
std::cout << " Contains JSON block: YES" << std::endl;
|
||||
}
|
||||
|
||||
} else {
|
||||
std::cout << "✅ Debug log content parses correctly (tool_calls_count=" << debug_msg.tool_calls.size() << ")" << std::endl;
|
||||
std::cout << " Tool call name: " << debug_msg.tool_calls[0].name << std::endl;
|
||||
}
|
||||
|
||||
// TDD Test: NEW FORMAT - Reproduce actual failure scenario from second debug log
|
||||
std::cout << "\n🚨 TDD: REAL BUG - Different Format from Debug Log:" << std::endl;
|
||||
std::string actual_failing_content = "<think>\nUser wants to add processing step for the system. I need to read files first to understand structure.\n</think>\n\nI'll help implement the ConfigurationProcessor step. Let's proceed step by step.\n\nFirst, let me check the existing file to understand where to add the new step.\n\nfunction\n```json\n{\n \"tools\": [\n {\n \"name\": \"Read\",\n \"arguments\": {\n \"file_path\": \"/path/to/example/SystemProcessor.java\"\n }\n },\n {\n \"name\": \"Read\",\n \"arguments\": {\n \"file_path\": \"/path/to/example/ServiceInterface.java\"\n }\n },\n {\n \"name\": \"Glob\",\n \"arguments\": {\n \"pattern\": \"**/ProcessingStep.java\"\n }\n }\n ]\n}\n```";
|
||||
|
||||
ik_chat_msg real_bug_msg = parse_chat_message_incremental(actual_failing_content, false, "DeepSeek-R1");
|
||||
std::cout << " Real failing format: tool_calls_count = " << real_bug_msg.tool_calls.size() << std::endl;
|
||||
|
||||
if (real_bug_msg.tool_calls.empty()) {
|
||||
std::cout << "❌ REPRODUCED REAL BUG: This format is NOT being parsed!" << std::endl;
|
||||
std::cout << " Format: 'function\\n```json\\n{\"tools\": [...]}\\n```'" << std::endl;
|
||||
std::cout << " This is different from DeepSeek R1 format we've been testing" << std::endl;
|
||||
std::cout << " Our parser expects: '<|tool▁calls▁begin|>...function<|tool▁sep|>Name'" << std::endl;
|
||||
std::cout << " But model generates: 'function\\n```json\\n{\"tools\": [...]}'" << std::endl;
|
||||
} else {
|
||||
std::cout << "✅ Unexpected: Real format parses correctly" << std::endl;
|
||||
for (size_t i = 0; i < real_bug_msg.tool_calls.size(); ++i) {
|
||||
std::cout << " Tool " << i << ": " << real_bug_msg.tool_calls[i].name << std::endl;
|
||||
}
|
||||
}
|
||||
|
||||
// TDD Test: Create parser for the new format (should initially fail)
|
||||
std::cout << "\n🧪 TDD: Test New Format Parser (SHOULD FAIL INITIALLY):" << std::endl;
|
||||
|
||||
// Test that DeepSeek R1 parser should handle the new format
|
||||
std::string new_format_content = "I'll help with that.\n\nfunction\n```json\n{\n \"tools\": [\n {\n \"name\": \"Read\",\n \"arguments\": {\n \"file_path\": \"/path/to/example.java\"\n }\n },\n {\n \"name\": \"Edit\",\n \"arguments\": {\n \"file_path\": \"/path/to/example.java\",\n \"old_string\": \"old implementation\",\n \"new_string\": \"new implementation\"\n }\n }\n ]\n}\n```\n\nThat should work!";
|
||||
|
||||
ik_chat_msg new_format_msg = parse_chat_message_incremental(new_format_content, false, "DeepSeek-R1");
|
||||
|
||||
std::cout << " New format test: tool_calls_count = " << new_format_msg.tool_calls.size() << std::endl;
|
||||
std::cout << " Expected: 2 tool calls (Read, Edit)" << std::endl;
|
||||
|
||||
if (new_format_msg.tool_calls.size() == 2) {
|
||||
std::cout << "✅ PASS: New format parsed correctly!" << std::endl;
|
||||
std::cout << " Tool 1: " << new_format_msg.tool_calls[0].name << std::endl;
|
||||
std::cout << " Tool 2: " << new_format_msg.tool_calls[1].name << std::endl;
|
||||
|
||||
// Test content cleaning
|
||||
bool content_is_clean = new_format_msg.content.find("function\n```json") == std::string::npos;
|
||||
if (content_is_clean) {
|
||||
std::cout << "✅ PASS: Content cleaned - no function markup visible" << std::endl;
|
||||
} else {
|
||||
std::cout << "❌ FAIL: Content still contains function markup" << std::endl;
|
||||
}
|
||||
} else {
|
||||
std::cout << "❌ EXPECTED FAIL: New format not yet supported" << std::endl;
|
||||
std::cout << " Need to implement parser for: 'function\\n```json\\n{\"tools\": [...]}'" << std::endl;
|
||||
}
|
||||
|
||||
// DEBUG: Test direct function call to verify parsing logic
|
||||
std::cout << "\n🔧 DEBUG: Direct DeepSeek R1 Parser Test:" << std::endl;
|
||||
std::string debug_content = "function\n```json\n{\n \"tools\": [\n {\"name\": \"TestTool\", \"arguments\": {\"test\": \"value\"}}\n ]\n}\n```";
|
||||
|
||||
try {
|
||||
common_chat_syntax syntax;
|
||||
syntax.format = COMMON_CHAT_FORMAT_DEEPSEEK_R1;
|
||||
syntax.enable_tool_calls = true;
|
||||
|
||||
common_chat_msg_parser debug_parser(debug_content, false, syntax);
|
||||
debug_parser.parse();
|
||||
auto debug_result = debug_parser.result();
|
||||
|
||||
std::cout << " Direct parser result: tool_calls_count = " << debug_result.tool_calls.size() << std::endl;
|
||||
} catch (const std::exception& e) {
|
||||
std::cout << " Direct parser exception: " << e.what() << std::endl;
|
||||
}
|
||||
|
||||
// TDD Test: Format 4 - XML-wrapped format from debug log
|
||||
std::cout << "\n🔍 TDD: Format 4 XML-wrapped:" << std::endl;
|
||||
std::string format4_content = "<think>\nLet me implement this step by step.\n</think>\n<plan>\n1. Implement configuration processor in SystemProcessor\n2. Extend ServiceInterface\n3. Update existing configuration settings\n</plan>\n<tool_call>\nfunction</think>CompleteTask\n```json\n{\"status\": \"completed\"}\n```\n</tool_call>";
|
||||
|
||||
ik_chat_msg format4_msg = parse_chat_message_incremental(format4_content, false, "DeepSeek-R1");
|
||||
std::cout << " Format 4 test: tool_calls_count = " << format4_msg.tool_calls.size() << std::endl;
|
||||
std::cout << " Expected: 1 tool call (CompleteTask)" << std::endl;
|
||||
|
||||
if (format4_msg.tool_calls.size() == 1) {
|
||||
std::cout << "✅ PASS: Format 4 parsed correctly!" << std::endl;
|
||||
std::cout << " Tool: " << format4_msg.tool_calls[0].name << std::endl;
|
||||
} else {
|
||||
std::cout << "❌ FAIL: Format 4 not working correctly" << std::endl;
|
||||
std::cout << " Need to debug parser for: '<tool_call>\\nfunction</think>Name\\n```json\\n{...}\\n```\\n</tool_call>'" << std::endl;
|
||||
}
|
||||
|
||||
// Test streaming finish_reason logic (core of the fix)
|
||||
std::cout << "\n🎯 Testing Streaming finish_reason Logic:" << std::endl;
|
||||
|
||||
|
||||
Reference in New Issue
Block a user