mirror of https://github.com/ikawrakow/ik_llama.cpp.git (synced 2026-03-13 15:30:03 +00:00)
Deepseek R1 function calls (more formats) (#652)
* Implement function calling / tools for ik_llama.cpp for Kimi K2
* Implement basic tool choice
* Backport llama.cpp tool calls support
* Enhance function calls with improved chat parser and string utilities
- Add new chat.h/chat.cpp and chat-parser.h/chat-parser.cpp for better chat handling
- Improve function calls parsing with fallback to llama.cpp builder pattern
- Add string utility functions (starts_with, ends_with, find_partial_stop)
- Update README with function calls testing instructions
- Enhance Kimi K2 parser and function calls documentation
- Add comprehensive test suite for function calls
- Update CMakeLists.txt and Makefile for new components
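The string utilities named above are small enough to sketch. The following is a minimal illustration, not the code from this commit: the signatures are assumptions, and `find_partial_stop` is assumed to return the offset at which the end of the text begins to look like the start of a stop string (useful when a streamed chunk ends in the middle of a marker).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical signatures; the helpers added by the commit may differ.
inline bool starts_with(const std::string & s, const std::string & prefix) {
    return s.size() >= prefix.size() && s.compare(0, prefix.size(), prefix) == 0;
}

inline bool ends_with(const std::string & s, const std::string & suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the index in `text` where a proper prefix of `stop` starts and runs
// to the end of `text`, or std::string::npos if the tail matches no prefix.
inline std::size_t find_partial_stop(const std::string & text, const std::string & stop) {
    if (text.empty() || stop.empty()) {
        return std::string::npos;
    }
    // Longest candidate first, so the earliest possible start position wins.
    const std::size_t max_len = std::min(text.size(), stop.size() - 1);
    for (std::size_t len = max_len; len > 0; --len) {
        if (text.compare(text.size() - len, len, stop, 0, len) == 0) {
            return text.size() - len;
        }
    }
    return std::string::npos;
}
```

With these assumed semantics, a streaming caller could hold back everything from the returned offset onward until the next chunk shows whether the marker completes.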
* Enhance function calling with unified streaming and parser improvements
- Fix streaming content cleanup to prevent function syntax in output
- Unify content extraction patterns with llama.cpp approach
- Improve Kimi K2 parser robustness and partial content handling
- Add comprehensive test coverage for function call scenarios
- Optimize chat message parsing and diff computation
* Replace hardcoded values in kimi_k2_parser.hpp with named constants
- Add compile-time constants for all token format markers
- Add compile-time constants for XML format markers
- Add compile-time constants for simple format patterns
- Replace all hardcoded string literals with named constants
- Use compile-time length calculation to avoid manual counting
- Improve maintainability and reduce magic numbers throughout parser
* Fix duplicate common_chat_parse definition
- Remove duplicate implementation from chat-parser.cpp
- Keep single implementation in chat.cpp following llama.cpp patterns
- Resolves linker error: multiple definition of common_chat_parse
* Fix JSON assertion failure in function call parsing
- Add proper validation that 'function' field is an object before accessing nested keys
- Handle missing 'arguments' field gracefully with default "{}"
- Prevents crash when parsing malformed tool call JSON structures
* Add comprehensive Qwen3 XML tool calling support with unit tests
- Implement Qwen3 XML parser with <tool_call>{"name": "func", "arguments": {...}}</tool_call> format
- Add model detection and routing for Qwen3 vs Kimi-K2 formats
- Create 8 comprehensive unit tests covering parsing, streaming, error handling
- Fix token format cleaning bug in kimi_k2_parser.hpp processing order
- Remove progressive parsing code and related utilities
- Add tool injection support for Qwen3 format in server utils
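As a rough illustration of the Qwen3 format described above, the sketch below pulls the raw JSON payload out of each `<tool_call>...</tool_call>` block. The function name is hypothetical, and JSON validation (which the real parser performs) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects the raw JSON payload of every complete
// <tool_call>...</tool_call> block. Unterminated blocks are ignored here;
// a streaming parser would keep them buffered instead.
inline std::vector<std::string> extract_qwen3_tool_call_payloads(const std::string & text) {
    static const std::string open_tag  = "<tool_call>";
    static const std::string close_tag = "</tool_call>";
    std::vector<std::string> payloads;
    std::size_t pos = 0;
    while (true) {
        const std::size_t begin = text.find(open_tag, pos);
        if (begin == std::string::npos) {
            break;
        }
        const std::size_t body = begin + open_tag.size();
        const std::size_t end  = text.find(close_tag, body);
        if (end == std::string::npos) {
            break;
        }
        payloads.push_back(text.substr(body, end - body));
        pos = end + close_tag.size();
    }
    return payloads;
}
```

Each returned payload would then be handed to a JSON parser to read the `name` and `arguments` fields.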
* Add DeepSeek R1 function calling support with comprehensive unit tests
- Implement complete DeepSeek R1 tool call parsing in common_chat_parser.cpp
- Add DeepSeek R1 model detection and tool injection in deepseek_r1_tools.hpp
- Update function_calls.hpp with DeepSeek R1 integration and content extraction
- Update documentation to reflect support for Kimi-K2, Qwen3, and DeepSeek R1 models
- Add comprehensive unit tests for DeepSeek R1 reasoning, tool calls, and integration
- Port exact implementation patterns from original llama.cpp for compatibility
Key features:
- Native DeepSeek R1 format: <|tool▁calls▁begin|>function<|tool▁sep|>name```json{}```<|tool▁call▁end|><|tool▁calls▁end|>
- Reasoning content extraction from <think>...</think> tags
- Multiple tool calls support with separate call blocks
- Model detection for deepseek-r1, deepseek_r1 naming patterns
- Integration with incremental parsing and streaming support
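To make the native format listed under "Key features" concrete, here is a simplified sketch that extracts the function name, the fenced JSON arguments, and the `<think>` reasoning text with plain string searches. All names are hypothetical. The commit itself ports the llama.cpp implementation, which this does not reproduce, and JSON containing a literal triple backtick is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ToolCall {
    std::string name;       // function name after the separator token
    std::string arguments;  // raw JSON text, not validated in this sketch
};

// Removes leading and trailing whitespace.
inline std::string trim(const std::string & s) {
    const char * ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) {
        return "";
    }
    return s.substr(b, s.find_last_not_of(ws) - b + 1);
}

// Hypothetical helper: returns the text between the first <think> and </think>.
inline std::string extract_reasoning(const std::string & text) {
    static const std::string open_tag  = "<think>";
    static const std::string close_tag = "</think>";
    const std::size_t b = text.find(open_tag);
    if (b == std::string::npos) {
        return "";
    }
    const std::size_t body = b + open_tag.size();
    const std::size_t e = text.find(close_tag, body);
    return e == std::string::npos ? "" : trim(text.substr(body, e - body));
}

// Hypothetical helper: one ToolCall per "function<|tool▁sep|>name ```json ... ```" block.
inline std::vector<ToolCall> parse_native_calls(const std::string & text) {
    static const std::string sep        = "function<|tool▁sep|>";
    static const std::string json_fence = "```json";
    static const std::string fence      = "```";
    std::vector<ToolCall> calls;
    std::size_t pos = 0;
    while ((pos = text.find(sep, pos)) != std::string::npos) {
        const std::size_t name_begin = pos + sep.size();
        const std::size_t json_begin = text.find(json_fence, name_begin);
        if (json_begin == std::string::npos) {
            break;
        }
        const std::size_t args_begin = json_begin + json_fence.size();
        const std::size_t args_end   = text.find(fence, args_begin);
        if (args_end == std::string::npos) {
            break;
        }
        calls.push_back({trim(text.substr(name_begin, json_begin - name_begin)),
                         trim(text.substr(args_begin, args_end - args_begin))});
        pos = args_end + fence.size();
    }
    return calls;
}
```

Because each call block is located independently, several calls inside one `<|tool▁calls▁begin|>` section come back as separate entries.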
* Add partial parsing support for JSON and regex
- json-partial.h/cpp: JSON partial parsing functionality
- regex-partial.h/cpp: Regex partial parsing functionality
* Add format_chat integration tests for Qwen3 tool injection
- Add test_qwen3_format_chat_integration() to validate tool injection pipeline
- Test tool injection conditions and system message enhancement
- Verify JSON formatting and anti-preamble instructions
- Add comprehensive test documentation
Tests confirm tool injection works correctly - conversational preamble
issue is not in ik_llama.cpp but likely in UI configuration.
* Fix Qwen3 tool call parsing - pass model name to parser
Server was not passing model name to parse_chat_message_incremental(),
causing Qwen3 to fall back to Kimi-K2 parser and return tool calls
as content instead of proper tool_calls array.
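The fix above depends on routing by model name. A minimal sketch of such routing follows, assuming case-insensitive substring matching. The enum, the function names, and the `qwen3` substring are illustrative assumptions; the `deepseek-r1` and `deepseek_r1` patterns and the Kimi-K2 fallback come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

enum class ToolFormat { KimiK2, Qwen3, DeepSeekR1 };

// Lowercases ASCII characters so matching is case-insensitive.
inline std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical router: picks a parser family from the model name.
inline ToolFormat detect_tool_format(const std::string & model_name) {
    const std::string name = to_lower(model_name);
    if (name.find("qwen3") != std::string::npos) {
        return ToolFormat::Qwen3;
    }
    if (name.find("deepseek-r1") != std::string::npos ||
        name.find("deepseek_r1") != std::string::npos) {
        return ToolFormat::DeepSeekR1;
    }
    // An empty or unknown name falls back to Kimi-K2, which is the behavior
    // the bug report describes when the model name was not passed through.
    return ToolFormat::KimiK2;
}
```

Under this scheme, an empty model name silently selects the Kimi-K2 parser, which matches the symptom described in the fix.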
* Fix non-streaming path to use model-specific parsing
Non-streaming responses were hardcoded to use Kimi-K2 format,
causing Qwen3 XML tool calls to be returned as content instead
of proper tool_calls array. Now uses same model detection as
streaming path for consistency.
* Update Qwen3 function call handling in server and tests
- Enhanced server function call detection and response formatting
- Improved test coverage for Qwen3 tool call scenarios
- Refined XML parsing for better tool execution support
* Add DeepSeek-R1 function call parsing support
Implements comprehensive parsing for all 4 DeepSeek-R1 function call formats:
- Format 1: Standard function call syntax (already supported)
- Format 2: Alternative function call patterns (already supported)
- Format 3: Tools array format - function\n```json\n{"tools": [...]}
- Format 4: XML wrapped format - <tool_call>function</think>Name\n```json\n{...}```</tool_call>
Key changes:
- Added parse_deepseek_r1_tools_array() following original parse_prefixed_json_tool_call_array pattern
- Added parse_deepseek_r1_xml_wrapped() following Hermes-2-Pro XML wrapper patterns
- Integrated both parsers into exception handling chain for robust fallback
- Added comprehensive TDD test coverage for all formats
- Anonymized all confidential information while preserving functionality
Resolves tool_calls_count=0 issue where DeepSeek-R1 models generated valid tool calls
but server failed to parse them correctly.
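For Format 4, a simplified sketch of the extraction step might look like the following. It is not the `parse_deepseek_r1_xml_wrapped()` function from the commit: the names here are hypothetical, string searches stand in for whatever matching the real parser uses, and the JSON is returned unparsed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct WrappedCall {
    std::string name;       // function name found after "function</think>"
    std::string arguments;  // raw JSON text, not validated in this sketch
};

// Removes leading and trailing whitespace.
inline std::string strip_ws(const std::string & s) {
    const char * ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) {
        return "";
    }
    return s.substr(b, s.find_last_not_of(ws) - b + 1);
}

// Hypothetical helper: parses the first <tool_call> block and returns false
// when the block is missing or malformed.
inline bool parse_xml_wrapped_call(const std::string & text, WrappedCall & out) {
    static const std::string open_tag   = "<tool_call>";
    static const std::string close_tag  = "</tool_call>";
    static const std::string marker     = "function</think>";
    static const std::string json_fence = "```json";
    static const std::string fence      = "```";

    const std::size_t open = text.find(open_tag);
    if (open == std::string::npos) return false;
    const std::size_t body_begin = open + open_tag.size();
    const std::size_t close = text.find(close_tag, body_begin);
    if (close == std::string::npos) return false;
    const std::string body = text.substr(body_begin, close - body_begin);

    const std::size_t m = body.find(marker);
    if (m == std::string::npos) return false;
    const std::size_t name_begin = m + marker.size();
    const std::size_t json_begin = body.find(json_fence, name_begin);
    if (json_begin == std::string::npos) return false;
    const std::size_t args_begin = json_begin + json_fence.size();
    // Treat the last fence inside the block as the closing one.
    const std::size_t args_end = body.rfind(fence);
    if (args_end == std::string::npos || args_end < args_begin) return false;

    out.name      = strip_ws(body.substr(name_begin, json_begin - name_begin));
    out.arguments = strip_ws(body.substr(args_begin, args_end - args_begin));
    return !out.name.empty();
}
```

Returning `false` rather than throwing is a choice made for this sketch; the commit describes the real parsers as part of an exception handling chain.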
* Update function_calls.md documentation for DeepSeek-R1 Format 4
- Added Format 4 (XML wrapped) documentation with examples
- Updated implementation notes with correct parser order (3→4→1→2)
- Marked all DeepSeek-R1 formats as working (July 2025 update)
- Updated test status for Format 3 and 4 as passing
- Added parse_deepseek_r1_xml_wrapped() function reference
- Corrected implementation file line numbers
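The parser order mentioned above (3→4→1→2) can be pictured as an ordered list of format parsers tried one after another. The sketch below is only an illustration: the types and names are hypothetical, simple marker checks stand in for the real parsers, and a boolean result replaces the exception-based chain the commit describes.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A stand-in parser only reports whether it recognizes the text.
using FormatParser = std::function<bool(const std::string &)>;
using ParserChain  = std::vector<std::pair<std::string, FormatParser>>;

// Returns the label of the first parser that accepts the text, or "" if none does.
inline std::string first_matching_format(const std::string & text, const ParserChain & chain) {
    for (const auto & entry : chain) {
        if (entry.second(text)) {
            return entry.first;
        }
    }
    return "";
}

// Builds the chain in the documented order: Format 3, then 4, then 1, then 2.
inline ParserChain make_deepseek_r1_chain() {
    // Hypothetical marker check used in place of a full parser.
    auto contains = [](std::string marker) {
        return FormatParser([marker](const std::string & text) {
            return text.find(marker) != std::string::npos;
        });
    };
    return {
        {"format3", contains("\"tools\"")},
        {"format4", contains("<tool_call>")},
        {"format1", contains("<|tool▁calls▁begin|>")},
        {"format2", contains("function<")},
    };
}
```

In this sketch the order matters because the loose Format 2 check (`function<`) would also match Format 1 and Format 4 text, so it has to come last; whether the same reasoning drove the order in the real implementation is not stated in the commit.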
* Fix merge conflict in test-function-calls.cpp
- Removed incomplete merge conflict marker from line 3027
- Ensured all tests compile and pass successfully
- All DeepSeek-R1 formats (1-4) working correctly
- All streaming and content cleaning tests passing
committed by GitHub
parent d65d5fe29e
commit f4051d9c3e
@@ -77,9 +77,12 @@ functions.get_weather:0<|tool_call_argument_begin|>
 
 ### DeepSeek R1 Native Format
 
-**Detection Pattern:** `<|tool▁calls▁begin|>...<|tool▁calls▁end|>`
+**Detection Pattern:** Multiple formats supported with automatic fallback
 
-**Structure:**
+**⚠️ Critical Implementation Note:** DeepSeek R1 models generate different formats depending on context. The parser handles all variants automatically.
+
+#### Format 1: Full Native Format (Primary)
+**Pattern:** `<|tool▁calls▁begin|>...<|tool▁calls▁end|>`
 ```
 <|tool▁calls▁begin|>
 <|tool▁call▁begin|>
@@ -91,7 +94,61 @@ function<|tool▁sep|>{function_name}
 <|tool▁calls▁end|>
 ```
 
-**Example:**
+#### Format 2: Simplified Format (Fallback)
+**Pattern:** `function<{function_name}>`
+```
+function<get_weather>
+```json
+{"location": "Tokyo"}
+```
+```
+
+#### Format 3: Tools Array Format (New - July 2025)
+**Pattern:** `function\n```json\n{"tools": [...]}`
+```
+function
+```json
+{
+  "tools": [
+    {
+      "name": "get_weather",
+      "arguments": {
+        "location": "Tokyo"
+      }
+    },
+    {
+      "name": "Read",
+      "arguments": {
+        "file_path": "/path/to/file.java"
+      }
+    }
+  ]
+}
+```
+```
+
+#### Format 4: XML Wrapped Format (New - July 2025)
+**Pattern:** `<tool_call>function</think>{function_name}\n```json\n{...}\n```</tool_call>`
+```
+<tool_call>
+function</think>Read
+```json
+{
+  "file_path": "/path/to/example.txt"
+}
+```
+</tool_call>
+```
+
+**Notes:**
+- XML wrapper contains function name after `function</think>`
+- Single function call per XML block
+- JSON arguments within ```json``` code blocks
+- Handles reasoning text before function name
+
+**Examples:**
+
+Format 1 (Full):
 ```
 <|tool▁calls▁begin|>
 <|tool▁call▁begin|>
@@ -103,11 +160,57 @@ function<|tool▁sep|>get_weather
 <|tool▁calls▁end|>
 ```
 
-**Notes:**
-- Native DeepSeek R1 format ported from original llama.cpp
-- Supports reasoning with `<think>...</think>` tags (automatically extracted)
-- Multiple function calls supported with separate call blocks
-- JSON arguments are contained within markdown code blocks
+Format 2 (Simplified):
+```
+function<Read>
+```json
+{"file_path": "/path/to/file.txt"}
+```
+```
+
+Format 3 (Tools Array):
+```
+function
+```json
+{
+  "tools": [
+    {
+      "name": "Read",
+      "arguments": {
+        "file_path": "/path/to/example/SystemProcessor.java"
+      }
+    },
+    {
+      "name": "Edit",
+      "arguments": {
+        "file_path": "/path/to/file.java",
+        "old_string": "old code",
+        "new_string": "new code"
+      }
+    }
+  ]
+}
+```
+```
+
+Format 4 (XML Wrapped):
+```
+<tool_call>
+function</think>CompleteTask
+```json
+{
+  "status": "completed"
+}
+```
+</tool_call>
+```
+
+**Implementation Notes:**
+- **Reasoning Support**: All formats support `<think>...</think>` reasoning tags (automatically extracted)
+- **Multiple Tool Calls**: Format 1 & 2 use separate blocks, Format 3 uses array structure, Format 4 uses single XML block
+- **Automatic Detection**: Parser tries formats in order: Format 3 → Format 4 → Format 1 → Format 2
+- **Original llama.cpp Base**: Implementation follows original llama.cpp patterns exactly
+- **Status**: All formats ✅ Working (July 2025 update)
 
 ## OpenAI-Compatible Output
 
@@ -196,14 +299,87 @@ To enable function calling, include the `tools` parameter in your request:
 
 ## Testing
 
-Test files are provided to verify function calling:
-- `test-function-calls.cpp` - Unit tests for the native Kimi-K2 format
-  - Tests native token format parsing
-  - Tests multiple function calls
-  - Tests error handling and malformed input
+Comprehensive test suite for all supported formats:
+
+### Unit Tests
+- **File**: `tests/test-function-calls.cpp`
+- **Coverage**: All supported model formats (Kimi-K2, Qwen3, DeepSeek R1)
+- **Test Types**:
+  - Native format parsing for each model type
+  - Multiple function calls
+  - Error handling and malformed input
+  - Streaming and non-streaming responses
+  - Content extraction and cleaning
+  - OpenAI-compatible output generation
+
+### DeepSeek R1 Specific Tests
+- **Format 1 Tests**: Full native format with separators ✅
+- **Format 2 Tests**: Simplified format without separators ✅
+- **Format 3 Tests**: Tools array format ✅ (Fixed July 2025)
+- **Format 4 Tests**: XML wrapped format ✅ (Added July 2025)
+- **Integration Tests**: Server-to-parser call chain verification
+- **Regression Tests**: Ensure existing formats continue working
+
+### Running Tests
+```bash
+# Build tests
+cd build && make test-function-calls -j$(nproc)
+
+# Run all function call tests
+./bin/test-function-calls
+
+# Run DeepSeek R1 specific tests
+./bin/test-function-calls | grep -E "(DeepSeek|tool_calls_count)"
+
+# Check Format 3 specific issues
+./bin/test-function-calls | grep -A5 -B5 "Real failing format"
+```
+
+### Test Status
+- **Kimi-K2**: ✅ All tests passing
+- **Qwen3 XML**: ✅ All tests passing
+- **DeepSeek R1 Format 1 & 2**: ✅ All tests passing
+- **DeepSeek R1 Format 3**: ✅ All tests passing (Fixed July 2025)
+- **DeepSeek R1 Format 4**: ✅ All tests passing (Added July 2025)
 
 ## File Structure
 
-- `function_calls.hpp` - Parser implementation for native Kimi-K2 format
-- `utils.hpp` - Integration with server (includes function_calls.hpp)
-- `server.cpp` - Response formatting and content filtering
+### Server Integration
+- **`examples/server/server.cpp`** - Main server entry point, calls `parse_chat_message_incremental()`
+- **`examples/server/function_calls.hpp`** - Server-side parser creation and integration
+- **`examples/server/utils.hpp`** - Server utilities (includes function_calls.hpp)
+
+### Core Parsing Engine
+- **`common/chat-parser.cpp`** - Main parser routing, delegates to model-specific parsers
+- **`common/chat-parser.h`** - Parser interface and JSON parsing infrastructure
+- **`common/chat.cpp`** - Model-specific parsing implementations:
+  - `common_chat_parse_kimi_k2()` - Kimi-K2 native format
+  - `common_chat_parse_qwen3()` - Qwen3 XML format
+  - `common_chat_parse_deepseek_r1()` - DeepSeek R1 multiple formats
+  - `parse_deepseek_r1_tools_array()` - Format 3 tools array parser
+  - `parse_deepseek_r1_xml_wrapped()` - Format 4 XML wrapper parser
+- **`common/chat.h`** - Function declarations and model detection
+
+### Testing
+- **`tests/test-function-calls.cpp`** - Comprehensive unit tests for all formats
+- **`tests/get-model.cpp`** - Test utilities for model loading
+
+### Integration Flow
+```
+server.cpp:2832
+  ↓ parse_chat_message_incremental(generated_text, false, modelname)
+function_calls.hpp:94-95
+  ↓ common_chat_msg_parser.parse()
+chat-parser.cpp:140
+  ↓ model detection → specific parser
+chat.cpp
+  ↓ common_chat_parse_deepseek_r1() / kimi_k2() / qwen3()
+  ↓ Format detection → regex matching → JSON parsing → tool_calls array
+```
+
+### Key Implementation Files
+- **DeepSeek R1 Format 3**: `common/chat.cpp:291-332` (`parse_deepseek_r1_tools_array`)
+- **DeepSeek R1 Format 4**: `common/chat.cpp:335-374` (`parse_deepseek_r1_xml_wrapped`)
+- **Exception handling**: `common/chat.cpp:222-289` (Format 3 → 4 → 1 → 2 fallback chain)
+- **Model detection**: `common/chat.cpp` (`is_deepseek_r1_model`, `is_qwen3_model`, etc.)
+- **Comprehensive tests**: `tests/test-function-calls.cpp` (All formats with TDD coverage)