# Function Calling Support
This document describes the function calling formats supported by the ik_llama.cpp server implementation.
## Overview
The server supports multiple native function calling formats including Kimi-K2, Qwen3 (XML), and DeepSeek R1. All function calls are automatically detected and converted to OpenAI-compatible responses.
**⚠️ Model Requirements**: Function calling support is enabled for the following model types:
- **Kimi-K2 models**: Models containing "kimi-k2" or "kimi_k2" in the model name
- **Qwen3 models**: Models containing "qwen3", "qwen-3", or "qwen_3" in the model name
- **DeepSeek R1 models**: Models containing "deepseek-r1", "deepseek_r1", or similar patterns
Other models will not have tool injection or function call parsing enabled.
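The name-based gating above can be pictured as a substring lookup. The following is a minimal Python sketch, not the server's C++ code: the function name `detect_tool_format`, the returned keys, and the case-insensitive match are all assumptions made for illustration, and the "or similar patterns" clause for DeepSeek R1 is not modelled.

```python
from typing import Optional

# Substrings copied from the list above. Only the two DeepSeek R1 spellings the
# document names explicitly are included ("or similar patterns" is not modelled).
_FORMAT_PATTERNS = {
    "kimi_k2": ("kimi-k2", "kimi_k2"),
    "qwen3": ("qwen3", "qwen-3", "qwen_3"),
    "deepseek_r1": ("deepseek-r1", "deepseek_r1"),
}


def detect_tool_format(model_name: str) -> Optional[str]:
    """Return a hypothetical format key for the model name, or None if unsupported."""
    lowered = model_name.lower()  # assumption: matching ignores case
    for fmt, needles in _FORMAT_PATTERNS.items():
        if any(needle in lowered for needle in needles):
            return fmt
    return None  # other models: no tool injection, no function call parsing
```

A caller would skip tool injection and parsing whenever the function returns `None`.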
## Supported Formats
### Kimi-K2 Native Token Format
**Detection Pattern:** `<|tool_calls_section_begin|>...<|tool_calls_section_end|>`
**Structure:**
```
<|tool_calls_section_begin|>
<|tool_call_begin|>
functions.{name}:{index}<|tool_call_argument_begin|>
{JSON arguments}
<|tool_call_end|>
<|tool_calls_section_end|>
```
**Example:**
```
<|tool_calls_section_begin|>
<|tool_call_begin|>
functions.get_weather:0<|tool_call_argument_begin|>
{"location": "Tokyo"}
<|tool_call_end|>
<|tool_calls_section_end|>
```
**Notes:**
- Native Kimi-K2 token format
- Multiple function calls are supported, each with its own index
- Arguments are JSON objects
- Function names follow `functions.{name}:{index}` pattern
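To make the token layout concrete, here is a small Python sketch that extracts calls from the Kimi-K2 native format with regular expressions. It is an illustration only, not the parser in `common/chat.cpp`: the name `parse_kimi_k2_section`, the result shape, and the set of characters allowed in function names are assumptions, and error handling for invalid JSON is omitted.

```python
import json
import re

# The whole tool-call section, delimited by the section tokens.
SECTION_RE = re.compile(
    r"<\|tool_calls_section_begin\|>(.*?)<\|tool_calls_section_end\|>",
    re.DOTALL,
)
# One call: functions.{name}:{index}, then the JSON arguments.
CALL_RE = re.compile(
    r"<\|tool_call_begin\|>\s*functions\.([\w.-]+):(\d+)\s*"
    r"<\|tool_call_argument_begin\|>(.*?)<\|tool_call_end\|>",
    re.DOTALL,
)


def parse_kimi_k2_section(text):
    """Return a list of {"name", "index", "arguments"} dicts (hypothetical shape)."""
    section = SECTION_RE.search(text)
    if section is None:
        return []  # no section tokens: nothing to parse
    calls = []
    for name, index, raw_args in CALL_RE.findall(section.group(1)):
        # json.loads tolerates the newlines around the argument object.
        calls.append(
            {"name": name, "index": int(index), "arguments": json.loads(raw_args)}
        )
    return calls
```

Applied to the example above, the sketch yields one call named `get_weather` with index `0` and the argument object `{"location": "Tokyo"}`.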
### XML-Style Format (Fallback)
**Detection Pattern:** `............`
**Structure:**
```xml
{param_value}
{param_value}
```
**Example:**
```xml
/path/to/file.txt
File content here
```
**Notes:**
- Used as a fallback when the model emits XML-style calls instead of the native token format
- Parameters are extracted as key-value pairs
- Automatically converted to JSON arguments
### DeepSeek R1 Native Format
**Detection Pattern:** Multiple formats supported with automatic fallback
**⚠️ Critical Implementation Note:** DeepSeek R1 models generate different formats depending on context. The parser handles all variants automatically.
#### Format 1: Full Native Format (Primary)
**Pattern:** `<|tool▁calls▁begin|>...<|tool▁calls▁end|>`
````
<|tool▁calls▁begin|>
<|tool▁call▁begin|>
function<|tool▁sep|>{function_name}
```json
{JSON arguments}
```
<|tool▁call▁end|>
<|tool▁calls▁end|>
````
#### Format 2: Simplified Format (Fallback)
**Pattern:** `function<{function_name}>`
````
function<{function_name}>
```json
{"location": "Tokyo"}
```
````
#### Format 3: Tools Array Format (New - July 2025)
**Pattern:** `function\n```json\n{"tools": [...]}`
````
function
```json
{
  "tools": [
    {
      "name": "get_weather",
      "arguments": {
        "location": "Tokyo"
      }
    },
    {
      "name": "Read",
      "arguments": {
        "file_path": "/path/to/file.java"
      }
    }
  ]
}
```
````
#### Format 4: XML Wrapped Format (New - July 2025)
**Pattern:** `` function{function_name}\n```json\n{...}\n``` ``
````
functionRead
```json
{
  "file_path": "/path/to/example.txt"
}
```
````
**Notes:**
- XML wrapper contains function name after `function`
- Single function call per XML block
- JSON arguments within `` ```json `` fenced code blocks
- Handles reasoning text before function name
**Examples:**
Format 1 (Full):
````
<|tool▁calls▁begin|>
<|tool▁call▁begin|>
function<|tool▁sep|>get_weather
```json
{"location": "Tokyo"}
```
<|tool▁call▁end|>
<|tool▁calls▁end|>
````
Format 2 (Simplified):
````
function<{function_name}>
```json
{"file_path": "/path/to/file.txt"}
```
````
Format 3 (Tools Array):
````
function
```json
{
  "tools": [
    {
      "name": "Read",
      "arguments": {
        "file_path": "/path/to/example/SystemProcessor.java"
      }
    },
    {
      "name": "Edit",
      "arguments": {
        "file_path": "/path/to/file.java",
        "old_string": "old code",
        "new_string": "new code"
      }
    }
  ]
}
```
````
Format 4 (XML Wrapped):
````
functionCompleteTask
```json
{
  "status": "completed"
}
```
````
**Implementation Notes:**
- **Reasoning Support**: All formats support `<think>...</think>` reasoning tags (automatically extracted)
- **Multiple Tool Calls**: Format 1 & 2 use separate blocks, Format 3 uses array structure, Format 4 uses single XML block
- **Automatic Detection**: Parser tries formats in order: Format 3 → Format 4 → Format 1 → Format 2
- **Original llama.cpp Base**: Implementation follows original llama.cpp patterns exactly
- **Status**: All formats ✅ Working (July 2025 update)
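The detection order in the notes above amounts to a fallback chain: try one parser, and move to the next if it finds nothing. The Python sketch below illustrates that idea for Format 3 and Format 1 only (Formats 2 and 4 are left out to keep it short). It is not the code in `common/chat.cpp`; the function names, the result shape, and the choice to accept both an ASCII `|` and a fullwidth `｜` in the tokens are assumptions.

```python
import json
import re

# `{3} matches a literal triple-backtick fence without writing one here.
JSON_FENCE = r"`{3}json\s*(.*?)\s*`{3}"
BAR = "[|｜]"  # assumption: accept ASCII and fullwidth bars in special tokens

FORMAT3_RE = re.compile(rf"function\s*{JSON_FENCE}", re.DOTALL)
FORMAT1_RE = re.compile(
    rf"<{BAR}tool▁call▁begin{BAR}>\s*function<{BAR}tool▁sep{BAR}>\s*([^\s`]+)\s*"
    rf"{JSON_FENCE}\s*<{BAR}tool▁call▁end{BAR}>",
    re.DOTALL,
)


def parse_tools_array(text):
    """Format 3: a single JSON object with a "tools" array."""
    match = FORMAT3_RE.search(text)
    if match is None:
        return []
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return []  # malformed JSON: report no calls
    tools = payload.get("tools") if isinstance(payload, dict) else None
    if not isinstance(tools, list):
        return []
    return [
        {"name": tool["name"], "arguments": tool.get("arguments", {})}
        for tool in tools
        if isinstance(tool, dict) and "name" in tool
    ]


def parse_full_native(text):
    """Format 1: one token-delimited block per call."""
    calls = []
    for name, raw_args in FORMAT1_RE.findall(text):
        try:
            calls.append({"name": name, "arguments": json.loads(raw_args)})
        except json.JSONDecodeError:
            return []  # assumption: one bad block invalidates the result
    return calls


def parse_deepseek_r1_sketch(text):
    """Try each parser in order and return the first non-empty result."""
    for parser in (parse_tools_array, parse_full_native):
        calls = parser(text)
        if calls:
            return calls
    return []
```

Because each parser returns an empty list on failure, adding another format to the chain only requires appending one more function to the tuple.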
## OpenAI-Compatible Output
The native format is converted to the standard OpenAI function calling response:
```json
{
  "choices": [
    {
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "content": "filtered_content_without_function_calls",
        "tool_calls": [
          {
            "id": "functions.get_weather:0",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"Tokyo\"}"
            }
          }
        ]
      }
    }
  ]
}
```
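On the client side, a response shaped like the one above can be consumed by decoding each `arguments` string and dispatching to a local function. The sketch below is one way to do that in Python. The `get_weather` stand-in, the `TOOLS` registry, and the `run_tool_calls` helper are hypothetical, and the `tool` role reply follows the general OpenAI chat convention rather than anything this document specifies.

```python
import json


def get_weather(location):
    # Stand-in tool for illustration; a real one would query a weather service.
    return {"location": location, "forecast": "sunny"}


# Hypothetical registry mapping tool names to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(response):
    """Execute every tool call in the response and build `tool` role messages."""
    results = []
    for choice in response.get("choices", []):
        if choice.get("finish_reason") != "tool_calls":
            continue  # ordinary text completion, nothing to execute
        for call in choice["message"].get("tool_calls", []):
            function = call["function"]
            # `arguments` is a JSON-encoded string, not an object.
            arguments = json.loads(function["arguments"])
            output = TOOLS[function["name"]](**arguments)
            results.append(
                {
                    "role": "tool",
                    "tool_call_id": call["id"],
                    "content": json.dumps(output),
                }
            )
    return results
```

Note that `arguments` must be decoded with `json.loads` before use, since the response carries it as a string.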
## Implementation Details
### Content Filtering
When function calls are detected:
- Function call syntax is removed from content
- Tool calls are extracted into a separate `tool_calls` array
- Content is cleaned for display
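As a rough illustration of this filtering step for the Kimi-K2 token format, the Python sketch below separates display content from the raw tool-call section. The helper name `filter_content` and the whitespace trimming are assumptions; the server's actual cleaning rules may differ.

```python
import re

# Matches a complete Kimi-K2 tool-call section, tokens included.
SECTION_RE = re.compile(
    r"<\|tool_calls_section_begin\|>.*?<\|tool_calls_section_end\|>",
    re.DOTALL,
)


def filter_content(generated):
    """Return (cleaned display content, list of raw tool-call sections)."""
    sections = SECTION_RE.findall(generated)
    # Remove the function call syntax, then trim leftover whitespace
    # (the trimming is an assumption of this sketch).
    cleaned = SECTION_RE.sub("", generated).strip()
    return cleaned, sections
```

Text without a tool-call section passes through unchanged, with an empty section list.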
### Error Handling
- Missing tokens in the format return an empty array
- A malformed structure returns an empty array
- The parser gracefully handles invalid JSON in arguments
## Usage with Tools Parameter
To enable function calling, include the `tools` parameter in your request:
```json
{
  "model": "kimi-k2",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather in Tokyo?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather information for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
```
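For completeness, here is a hedged Python sketch of sending such a request using only the standard library. The base URL `http://localhost:8080` and the `/v1/chat/completions` path are assumptions about a typical local deployment and should be adjusted to the actual server. The helper names `build_request` and `post_chat_completion` are hypothetical.

```python
import json
import urllib.request

# Tool schema copied from the request above.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                }
            },
            "required": ["location"],
        },
    },
}


def build_request(model, user_message, tools):
    """Assemble the request body with a single user message and the tools list."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,
    }


def post_chat_completion(payload, base_url="http://localhost:8080"):
    """POST the payload; the URL and endpoint path are assumptions, not from this document."""
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `post_chat_completion(build_request("kimi-k2", "What's the weather in Tokyo?", [WEATHER_TOOL]))` would return the parsed JSON response.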
## Model Compatibility
- **Kimi-K2 models**: Native support with token format
- **Qwen3 models**: Native support with XML format (Hermes-style)
- **DeepSeek R1 models**: Native support with reasoning and function call format (ported from original llama.cpp)
- **Other models**: No function calling support
## Testing
Comprehensive test suite for all supported formats:
### Unit Tests
- **File**: `tests/test-function-calls.cpp`
- **Coverage**: All supported model formats (Kimi-K2, Qwen3, DeepSeek R1)
- **Test Types**:
  - Native format parsing for each model type
  - Multiple function calls
  - Error handling and malformed input
  - Streaming and non-streaming responses
  - Content extraction and cleaning
  - OpenAI-compatible output generation
### DeepSeek R1 Specific Tests
- **Format 1 Tests**: Full native format with separators ✅
- **Format 2 Tests**: Simplified format without separators ✅
- **Format 3 Tests**: Tools array format ✅ (Fixed July 2025)
- **Format 4 Tests**: XML wrapped format ✅ (Added July 2025)
- **Integration Tests**: Server-to-parser call chain verification
- **Regression Tests**: Ensure existing formats continue working
### Running Tests
```bash
# Build tests
cd build && make test-function-calls -j$(nproc)
# Run all function call tests
./bin/test-function-calls
# Run DeepSeek R1 specific tests
./bin/test-function-calls | grep -E "(DeepSeek|tool_calls_count)"
# Check Format 3 specific issues
./bin/test-function-calls | grep -A5 -B5 "Real failing format"
```
### Test Status
- **Kimi-K2**: ✅ All tests passing
- **Qwen3 XML**: ✅ All tests passing
- **DeepSeek R1 Format 1 & 2**: ✅ All tests passing
- **DeepSeek R1 Format 3**: ✅ All tests passing (Fixed July 2025)
- **DeepSeek R1 Format 4**: ✅ All tests passing (Added July 2025)
## File Structure
### Server Integration
- **`examples/server/server.cpp`** - Main server entry point, calls `parse_chat_message_incremental()`
- **`examples/server/function_calls.hpp`** - Server-side parser creation and integration
- **`examples/server/utils.hpp`** - Server utilities (includes function_calls.hpp)
### Core Parsing Engine
- **`common/chat-parser.cpp`** - Main parser routing, delegates to model-specific parsers
- **`common/chat-parser.h`** - Parser interface and JSON parsing infrastructure
- **`common/chat.cpp`** - Model-specific parsing implementations:
  - `common_chat_parse_kimi_k2()` - Kimi-K2 native format
  - `common_chat_parse_qwen3()` - Qwen3 XML format
  - `common_chat_parse_deepseek_r1()` - DeepSeek R1 multiple formats
  - `parse_deepseek_r1_tools_array()` - Format 3 tools array parser
  - `parse_deepseek_r1_xml_wrapped()` - Format 4 XML wrapper parser
- **`common/chat.h`** - Function declarations and model detection
### Testing
- **`tests/test-function-calls.cpp`** - Comprehensive unit tests for all formats
- **`tests/get-model.cpp`** - Test utilities for model loading
### Integration Flow
```
server.cpp:2832
↓ parse_chat_message_incremental(generated_text, false, modelname)
function_calls.hpp:94-95
↓ common_chat_msg_parser.parse()
chat-parser.cpp:140
↓ model detection → specific parser
chat.cpp
↓ common_chat_parse_deepseek_r1() / kimi_k2() / qwen3()
↓ Format detection → regex matching → JSON parsing → tool_calls array
```
### Key Implementation Files
- **DeepSeek R1 Format 3**: `common/chat.cpp:291-332` (`parse_deepseek_r1_tools_array`)
- **DeepSeek R1 Format 4**: `common/chat.cpp:335-374` (`parse_deepseek_r1_xml_wrapped`)
- **Exception handling**: `common/chat.cpp:222-289` (Format 3 → 4 → 1 → 2 fallback chain)
- **Model detection**: `common/chat.cpp` (`is_deepseek_r1_model`, `is_qwen3_model`, etc.)
- **Comprehensive tests**: `tests/test-function-calls.cpp` (All formats with TDD coverage)