mirror of
https://github.com/theroyallab/tabbyAPI.git
synced 2026-05-24 14:44:41 +00:00
58 lines
2.4 KiB
Markdown
58 lines
2.4 KiB
Markdown
# Tool Calling in TabbyAPI
|
|
|
|
Tool calling is available for supported models, and enabled by selecting a tool format in the model
|
|
config. This can also be specifed per model using `tabby_config.yml`.
|
|
|
|
Most tool-calling models are also reasoning models and it is recommended to enable reasoning as
|
|
well, with appropriate reasoning tags (these cannot currently be inferred from the model's template).
|
|
|
|
```yml
|
|
model:
|
|
reasoning: true
|
|
reasoning_start_token: "<think>"
|
|
reasoning_end_token: "</think>"
|
|
tool_format: qwen3_5
|
|
```
|
|
|
|
### Supported formats
|
|
Below are the currently recognized formats
|
|
|
|
| tool_format | Aliases | Model types
|
|
|---------------|-------------------|---------------------------
|
|
| qwen3_coder | qwen3_5, step3_5 | Qwen3-Coder<br> Qwen3-Next<br> Qwen3.5
|
|
| minimax_m2 | | Minimax-M2<br> Minimax-M2.1<br> Minimax-M2.5
|
|
| glm4_5 | glm4_6<br> glm4_7 | GLM4.5<br> GLM4.6<br> GLM4.7
|
|
| mistral_old ¹ | | (older Mistral-family models)
|
|
| mistral | | Codestral 2508+<br> Devstral-Small 2507+<br> Magistral-Medium 2506+<br> Magistral-Small 2506+<br> Ministral-3 2512+<br> Mistral-Medium-3.1 2508+<br> Mistral-Small-3.2 2506+<br>
|
|
| gemma4 | | Gemma 4-it
|
|
|
|
**¹** Older Mistral models tend to have unreliable tool calling support and even newer ones are
|
|
often released without official chat templates or with templates that omit any tool formatting.
|
|
Tokenization also changes frequently between model releases. YMMV
|
|
|
|
# Clients
|
|
|
|
TabbyAPI should support any software that uses the OAI tool calling API. But the standard is
|
|
evolving, no two clients can agree on *exactly* what it looks like and models are trained with
|
|
different assumptions as well. Below will be collected notes pertaining to various client software
|
|
and how it relates to TabbyAPI's tool calling support.
|
|
|
|
### OpenCode
|
|
|
|
- OpenCode by default forces categorical sampling, overriding TabbyAPI's defaults with top-P = 1.0.
|
|
This confuses some models, so if you're experiencing *occasional* random gibberish in your output,
|
|
check your OpenCode config to make sure sampling is configured there, e.g.:
|
|
|
|
```json
|
|
"agent": {
|
|
"build": {
|
|
"top_p": 0.8
|
|
},
|
|
"plan": {
|
|
"top_p": 0.8
|
|
}
|
|
}
|
|
```
|
|
|
|
- OpenCode doesn't explicitly enable reasoning in the request by default. For some models this doesn't
|
|
matter. For others (e.g. Gemma4) you can configure `force_enable_thinking: true` in TabbyAPI. |