Function Calling

chat.h (https://github.com/ggml-org/llama.cpp/pull/9639) adds support for OpenAI-style function calling and is used in:

  • llama-server when started w/ --jinja flag

Universal support w/ Native & Generic handlers

Function calling is supported for all models (see https://github.com/ggml-org/llama.cpp/pull/9639):

  • Native tool call formats supported:

    • Llama 3.1 / 3.3 (including builtin tool support: wolfram_alpha, web_search / brave_search, code_interpreter), Llama 3.2
    • Functionary v3.1 / v3.2
    • Hermes 2/3, Qwen 2.5
    • Qwen 2.5 Coder
    • Mistral Nemo
    • Firefunction v2
    • Command R7B
    • DeepSeek R1 (WIP / seems reluctant to call any tools?)
  • Generic tool call is supported when the template isn't recognized by native format handlers (you'll see Chat format: Generic in the logs).

    • Use --chat-template-file to override the template when appropriate (see examples below)
    • Generic support may consume more tokens and be less efficient than a model's native format.
Some common templates and the format handler they use:
Template Format
Almawave-Velvet-14B.jinja Hermes 2 Pro
AtlaAI-Selene-1-Mini-Llama-3.1-8B.jinja Llama 3.x
CohereForAI-aya-expanse-8b.jinja Generic
CohereForAI-c4ai-command-r-plus-default.jinja Generic
CohereForAI-c4ai-command-r-plus-rag.jinja Generic
CohereForAI-c4ai-command-r-plus-tool_use.jinja Generic
CohereForAI-c4ai-command-r7b-12-2024-default.jinja Command R7B (extract reasoning)
CohereForAI-c4ai-command-r7b-12-2024-rag.jinja Command R7B (extract reasoning)
CohereForAI-c4ai-command-r7b-12-2024-tool_use.jinja Command R7B (extract reasoning)
CohereForAI-c4ai-command-r7b-12-2024.jinja Generic
DavieLion-Llama-3.2-1B-SPIN-iter3.jinja Generic
Delta-Vector-Rei-12B.jinja Mistral Nemo
EpistemeAI-Mistral-Nemo-Instruct-12B-Philosophy-Math.jinja Mistral Nemo
FlofloB-83k_continued_pretraining_Qwen2.5-0.5B-Instruct_Unsloth_merged_16bit.jinja Hermes 2 Pro
FlofloB-test_continued_pretraining_Phi-3-mini-4k-instruct_Unsloth_merged_16bit.jinja Generic
HelpingAI-HAI-SER.jinja Generic
HuggingFaceTB-SmolLM2-1.7B-Instruct.jinja Generic
HuggingFaceTB-SmolLM2-135M-Instruct.jinja Generic
HuggingFaceTB-SmolLM2-360M-Instruct.jinja Generic
INSAIT-Institute-BgGPT-Gemma-2-27B-IT-v1.0.jinja Generic
Ihor-Text2Graph-R1-Qwen2.5-0.5b.jinja Hermes 2 Pro
Infinigence-Megrez-3B-Instruct.jinja Generic
Josephgflowers-TinyLlama_v1.1_math_code-world-test-1.jinja Generic
LGAI-EXAONE-EXAONE-3.5-2.4B-Instruct.jinja Generic
LGAI-EXAONE-EXAONE-3.5-7.8B-Instruct.jinja Generic
LatitudeGames-Wayfarer-12B.jinja Generic
Magpie-Align-Llama-3-8B-Magpie-Align-v0.1.jinja Generic
Magpie-Align-Llama-3.1-8B-Magpie-Align-v0.1.jinja Generic
MaziyarPanahi-calme-3.2-instruct-78b.jinja Generic
MiniMaxAI-MiniMax-Text-01.jinja Generic
MiniMaxAI-MiniMax-VL-01.jinja Generic
NaniDAO-deepseek-r1-qwen-2.5-32B-ablated.jinja DeepSeek R1 (extract reasoning)
NexaAIDev-Octopus-v2.jinja Generic
NousResearch-Hermes-2-Pro-Llama-3-8B-default.jinja Generic
NousResearch-Hermes-2-Pro-Llama-3-8B-tool_use.jinja Hermes 2 Pro
NousResearch-Hermes-2-Pro-Mistral-7B-default.jinja Generic
NousResearch-Hermes-2-Pro-Mistral-7B-tool_use.jinja Hermes 2 Pro
NousResearch-Hermes-3-Llama-3.1-70B-default.jinja Generic
NousResearch-Hermes-3-Llama-3.1-70B-tool_use.jinja Hermes 2 Pro
NovaSky-AI-Sky-T1-32B-Flash.jinja Hermes 2 Pro
NovaSky-AI-Sky-T1-32B-Preview.jinja Hermes 2 Pro
OnlyCheeini-greesychat-turbo.jinja Generic
Orenguteng-Llama-3.1-8B-Lexi-Uncensored-V2.jinja Llama 3.x
OrionStarAI-Orion-14B-Chat.jinja Generic
PowerInfer-SmallThinker-3B-Preview.jinja Generic
PrimeIntellect-INTELLECT-1-Instruct.jinja Generic
Qwen-QVQ-72B-Preview.jinja Generic
Qwen-QwQ-32B-Preview.jinja Hermes 2 Pro
Qwen-Qwen1.5-7B-Chat.jinja Generic
Qwen-Qwen2-7B-Instruct.jinja Generic
Qwen-Qwen2-VL-72B-Instruct.jinja Generic
Qwen-Qwen2-VL-7B-Instruct.jinja Generic
Qwen-Qwen2.5-0.5B.jinja Hermes 2 Pro
Qwen-Qwen2.5-1.5B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-14B-Instruct-1M.jinja Hermes 2 Pro
Qwen-Qwen2.5-14B.jinja Hermes 2 Pro
Qwen-Qwen2.5-32B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-32B.jinja Hermes 2 Pro
Qwen-Qwen2.5-3B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-72B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-7B-Instruct-1M.jinja Hermes 2 Pro
Qwen-Qwen2.5-7B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-7B.jinja Hermes 2 Pro
Qwen-Qwen2.5-Coder-32B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-Coder-7B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-Math-1.5B.jinja Hermes 2 Pro
Qwen-Qwen2.5-Math-7B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-VL-3B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-VL-72B-Instruct.jinja Hermes 2 Pro
Qwen-Qwen2.5-VL-7B-Instruct.jinja Hermes 2 Pro
RWKV-Red-Team-ARWKV-7B-Preview-0.1.jinja Hermes 2 Pro
SakanaAI-TinySwallow-1.5B-Instruct.jinja Hermes 2 Pro
SakanaAI-TinySwallow-1.5B.jinja Hermes 2 Pro
Sao10K-70B-L3.3-Cirrus-x1.jinja Llama 3.x
SentientAGI-Dobby-Mini-Leashed-Llama-3.1-8B.jinja Llama 3.x
SentientAGI-Dobby-Mini-Unhinged-Llama-3.1-8B.jinja Llama 3.x
Steelskull-L3.3-Damascus-R1.jinja Llama 3.x
Steelskull-L3.3-MS-Nevoria-70b.jinja Llama 3.x
Steelskull-L3.3-Nevoria-R1-70b.jinja Llama 3.x
THUDM-glm-4-9b-chat.jinja Generic
THUDM-glm-edge-1.5b-chat.jinja Generic
Tarek07-Progenitor-V1.1-LLaMa-70B.jinja Llama 3.x
TheBloke-FusionNet_34Bx2_MoE-AWQ.jinja Generic
TinyLlama-TinyLlama-1.1B-Chat-v1.0.jinja Generic
UCLA-AGI-Mistral7B-PairRM-SPPO-Iter3.jinja Generic
ValiantLabs-Llama3.1-8B-Enigma.jinja Llama 3.x
abacusai-Fewshot-Metamath-OrcaVicuna-Mistral.jinja Generic
ai21labs-AI21-Jamba-1.5-Large.jinja Generic
allenai-Llama-3.1-Tulu-3-405B-SFT.jinja Generic
allenai-Llama-3.1-Tulu-3-405B.jinja Generic
allenai-Llama-3.1-Tulu-3-8B.jinja Generic
arcee-ai-Virtuoso-Lite.jinja Hermes 2 Pro
arcee-ai-Virtuoso-Medium-v2.jinja Hermes 2 Pro
arcee-ai-Virtuoso-Small-v2.jinja Hermes 2 Pro
avemio-GRAG-NEMO-12B-ORPO-HESSIAN-AI.jinja Generic
bespokelabs-Bespoke-Stratos-7B.jinja Hermes 2 Pro
bfuzzy1-acheron-m1a-llama.jinja Generic
bofenghuang-vigogne-2-70b-chat.jinja Generic
bytedance-research-UI-TARS-72B-DPO.jinja Generic
bytedance-research-UI-TARS-7B-DPO.jinja Generic
bytedance-research-UI-TARS-7B-SFT.jinja Generic
carsenk-phi3.5_mini_exp_825_uncensored.jinja Generic
cyberagent-DeepSeek-R1-Distill-Qwen-14B-Japanese.jinja DeepSeek R1 (extract reasoning)
cyberagent-DeepSeek-R1-Distill-Qwen-32B-Japanese.jinja DeepSeek R1 (extract reasoning)
databricks-dbrx-instruct.jinja Generic
deepseek-ai-DeepSeek-Coder-V2-Instruct.jinja Generic
deepseek-ai-DeepSeek-Coder-V2-Lite-Base.jinja Generic
deepseek-ai-DeepSeek-Coder-V2-Lite-Instruct.jinja Generic
deepseek-ai-DeepSeek-R1-Distill-Llama-70B.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-DeepSeek-R1-Distill-Llama-8B.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-DeepSeek-R1-Distill-Qwen-14B.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-DeepSeek-R1-Distill-Qwen-32B.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-DeepSeek-R1-Distill-Qwen-7B.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-DeepSeek-R1-Zero.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-DeepSeek-R1.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-DeepSeek-V2-Lite.jinja Generic
deepseek-ai-DeepSeek-V2.5.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-DeepSeek-V3.jinja DeepSeek R1 (extract reasoning)
deepseek-ai-deepseek-coder-33b-instruct.jinja Generic
deepseek-ai-deepseek-coder-6.7b-instruct.jinja Generic
deepseek-ai-deepseek-coder-7b-instruct-v1.5.jinja Generic
deepseek-ai-deepseek-llm-67b-chat.jinja Generic
deepseek-ai-deepseek-llm-7b-chat.jinja Generic
dicta-il-dictalm2.0-instruct.jinja Generic
ehristoforu-Falcon3-8B-Franken-Basestruct.jinja Hermes 2 Pro
fireworks-ai-llama-3-firefunction-v2.jinja FireFunction v2
godlikehhd-alpaca_data_sampled_ifd_new_5200.jinja Hermes 2 Pro
godlikehhd-alpaca_data_score_max_0.7_2600.jinja Hermes 2 Pro
google-gemma-2-27b-it.jinja Generic
google-gemma-2-2b-it.jinja Generic
google-gemma-2-2b-jpn-it.jinja Generic
google-gemma-7b-it.jinja Generic
huihui-ai-DeepSeek-R1-Distill-Llama-70B-abliterated.jinja DeepSeek R1 (extract reasoning)
huihui-ai-DeepSeek-R1-Distill-Llama-8B-abliterated.jinja DeepSeek R1 (extract reasoning)
huihui-ai-DeepSeek-R1-Distill-Qwen-14B-abliterated-v2.jinja DeepSeek R1 (extract reasoning)
huihui-ai-DeepSeek-R1-Distill-Qwen-32B-abliterated.jinja DeepSeek R1 (extract reasoning)
huihui-ai-DeepSeek-R1-Distill-Qwen-7B-abliterated-v2.jinja DeepSeek R1 (extract reasoning)
huihui-ai-Qwen2.5-14B-Instruct-1M-abliterated.jinja Hermes 2 Pro
ibm-granite-granite-3.1-8b-instruct.jinja Generic
indischepartij-MiniCPM-3B-OpenHermes-2.5-v2.jinja Generic
inflatebot-MN-12B-Mag-Mell-R1.jinja Generic
jinaai-ReaderLM-v2.jinja Generic
kms7530-chemeng_qwen-math-7b_24_1_100_1_nonmath.jinja Hermes 2 Pro
knifeayumu-Cydonia-v1.3-Magnum-v4-22B.jinja Mistral Nemo
langgptai-qwen1.5-7b-chat-sa-v0.1.jinja Generic
lightblue-DeepSeek-R1-Distill-Qwen-7B-Japanese.jinja DeepSeek R1 (extract reasoning)
mattshumer-Reflection-Llama-3.1-70B.jinja Generic
meetkai-functionary-medium-v3.1.jinja Functionary v3.1 Llama 3.1
meetkai-functionary-medium-v3.2.jinja Functionary v3.2
meta-llama-Llama-2-7b-chat-hf.jinja Generic
meta-llama-Llama-3.1-8B-Instruct.jinja Llama 3.x
meta-llama-Llama-3.2-11B-Vision-Instruct.jinja Llama 3.x
meta-llama-Llama-3.2-1B-Instruct.jinja Llama 3.x
meta-llama-Llama-3.2-3B-Instruct.jinja Llama 3.x
meta-llama-Llama-3.3-70B-Instruct.jinja Llama 3.x
meta-llama-Meta-Llama-3-8B-Instruct.jinja Generic
meta-llama-Meta-Llama-3.1-8B-Instruct.jinja Llama 3.x
microsoft-Phi-3-medium-4k-instruct.jinja Generic
microsoft-Phi-3-mini-4k-instruct.jinja Generic
microsoft-Phi-3-small-8k-instruct.jinja Generic
microsoft-Phi-3.5-mini-instruct.jinja Generic
microsoft-Phi-3.5-vision-instruct.jinja Generic
microsoft-phi-4.jinja Generic
migtissera-Tess-3-Mistral-Nemo-12B.jinja Generic
ministral-Ministral-3b-instruct.jinja Generic
mistralai-Codestral-22B-v0.1.jinja Generic
mistralai-Mistral-7B-Instruct-v0.1.jinja Generic
mistralai-Mistral-7B-Instruct-v0.2.jinja Generic
mistralai-Mistral-7B-Instruct-v0.3.jinja Mistral Nemo
mistralai-Mistral-Large-Instruct-2407.jinja Mistral Nemo
mistralai-Mistral-Large-Instruct-2411.jinja Generic
mistralai-Mistral-Nemo-Instruct-2407.jinja Mistral Nemo
mistralai-Mistral-Small-24B-Instruct-2501.jinja Generic
mistralai-Mixtral-8x7B-Instruct-v0.1.jinja Generic
mkurman-Qwen2.5-14B-DeepSeek-R1-1M.jinja Hermes 2 Pro
mlabonne-AlphaMonarch-7B.jinja Generic
mlx-community-Josiefied-Qwen2.5-0.5B-Instruct-abliterated-v1-float32.jinja Hermes 2 Pro
mlx-community-Qwen2.5-VL-7B-Instruct-8bit.jinja Hermes 2 Pro
mobiuslabsgmbh-DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1.jinja DeepSeek R1 (extract reasoning)
netcat420-MFANNv0.20.jinja Generic
netcat420-MFANNv0.24.jinja Generic
netease-youdao-Confucius-o1-14B.jinja Hermes 2 Pro
nvidia-AceMath-7B-RM.jinja Hermes 2 Pro
nvidia-Eagle2-1B.jinja Hermes 2 Pro
nvidia-Eagle2-9B.jinja Hermes 2 Pro
nvidia-Llama-3.1-Nemotron-70B-Instruct-HF.jinja Llama 3.x
onnx-community-DeepSeek-R1-Distill-Qwen-1.5B-ONNX.jinja DeepSeek R1 (extract reasoning)
open-thoughts-OpenThinker-7B.jinja Hermes 2 Pro
openchat-openchat-3.5-0106.jinja Generic
pankajmathur-orca_mini_v6_8b.jinja Generic
princeton-nlp-Mistral-7B-Base-SFT-RDPO.jinja Generic
princeton-nlp-Mistral-7B-Instruct-DPO.jinja Generic
princeton-nlp-Mistral-7B-Instruct-RDPO.jinja Generic
prithivMLmods-Bellatrix-Tiny-1.5B-R1.jinja Hermes 2 Pro
prithivMLmods-Bellatrix-Tiny-1B-R1.jinja Llama 3.x
prithivMLmods-Bellatrix-Tiny-1B-v3.jinja Generic
prithivMLmods-Bellatrix-Tiny-3B-R1.jinja Llama 3.x
prithivMLmods-Blaze-14B-xElite.jinja Generic
prithivMLmods-Calcium-Opus-14B-Elite2-R1.jinja Hermes 2 Pro
prithivMLmods-Calme-Ties-78B.jinja Generic
prithivMLmods-Calme-Ties2-78B.jinja Generic
prithivMLmods-Calme-Ties3-78B.jinja Generic
prithivMLmods-ChemQwen2-vL.jinja Generic
prithivMLmods-GWQ2b.jinja Generic
prithivMLmods-LatexMind-2B-Codec.jinja Generic
prithivMLmods-Llama-3.2-6B-AlgoCode.jinja Llama 3.x
prithivMLmods-Megatron-Opus-14B-Exp.jinja Hermes 2 Pro
prithivMLmods-Megatron-Opus-14B-Stock.jinja Hermes 2 Pro
prithivMLmods-Megatron-Opus-7B-Exp.jinja Hermes 2 Pro
prithivMLmods-Omni-Reasoner-Merged.jinja Hermes 2 Pro
prithivMLmods-Omni-Reasoner4-Merged.jinja Hermes 2 Pro
prithivMLmods-Primal-Opus-14B-Optimus-v1.jinja Hermes 2 Pro
prithivMLmods-QwQ-Math-IO-500M.jinja Hermes 2 Pro
prithivMLmods-Qwen-7B-Distill-Reasoner.jinja DeepSeek R1 (extract reasoning)
prithivMLmods-Qwen2.5-1.5B-DeepSeek-R1-Instruct.jinja Hermes 2 Pro
prithivMLmods-Qwen2.5-14B-DeepSeek-R1-1M.jinja Hermes 2 Pro
prithivMLmods-Qwen2.5-32B-DeepSeek-R1-Instruct.jinja Hermes 2 Pro
prithivMLmods-Qwen2.5-7B-DeepSeek-R1-1M.jinja Hermes 2 Pro
prithivMLmods-Triangulum-v2-10B.jinja Hermes 2 Pro
qingy2024-Falcon3-2x10B-MoE-Instruct.jinja Hermes 2 Pro
rubenroy-Zurich-14B-GCv2-5m.jinja Hermes 2 Pro
rubenroy-Zurich-7B-GCv2-5m.jinja Hermes 2 Pro
silma-ai-SILMA-Kashif-2B-Instruct-v1.0.jinja Generic
simplescaling-s1-32B.jinja Hermes 2 Pro
sometimesanotion-Lamarck-14B-v0.7.jinja Hermes 2 Pro
sonthenguyen-zephyr-sft-bnb-4bit-DPO-mtbr-180steps.jinja Generic
sthenno-tempesthenno-icy-0130.jinja Generic
sumink-qwft.jinja Hermes 2 Pro
teknium-OpenHermes-2.5-Mistral-7B.jinja Generic
thirdeyeai-elevate360m.jinja Generic
tiiuae-Falcon3-10B-Instruct.jinja Hermes 2 Pro
unsloth-DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit.jinja DeepSeek R1 (extract reasoning)
unsloth-DeepSeek-R1-Distill-Llama-8B.jinja DeepSeek R1 (extract reasoning)
unsloth-DeepSeek-R1.jinja DeepSeek R1 (extract reasoning)
unsloth-Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit.jinja Generic
upstage-solar-pro-preview-instruct.jinja Generic
whyhow-ai-PatientSeek.jinja Generic
xwen-team-Xwen-72B-Chat.jinja Hermes 2 Pro
xwen-team-Xwen-7B-Chat.jinja Hermes 2 Pro

This table can be generated with:

./build/bin/test-chat ../minja/build/tests/*.jinja 2>/dev/null

Usage - need tool-aware Jinja template

First, start a server with any model, but make sure it has a tools-enabled template: you can verify this by inspecting the chat_template or chat_template_tool_use properties at http://localhost:8080/props.
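To check these properties programmatically, here is a minimal sketch. The `has_tool_template` helper and its heuristic are assumptions for illustration, not part of llama.cpp; in practice you would fetch the JSON from http://localhost:8080/props.

```python
def has_tool_template(props: dict) -> bool:
    # Hypothetical helper: a dedicated tool_use template variant is the
    # strongest signal; otherwise, check whether the main template
    # references the `tools` variable at all.
    if props.get("chat_template_tool_use"):
        return True
    return "tools" in (props.get("chat_template") or "")

# In practice, pass the parsed JSON returned by GET /props;
# here we check a sample payload.
sample = {"chat_template": "{% if tools %}...{% endif %}"}
print(has_tool_template(sample))  # True
```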

Here are some models known to work (w/ chat template override when needed):

# Native support:

llama-server --jinja -fa -hf bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M
llama-server --jinja -fa -hf bartowski/Mistral-Nemo-Instruct-2407-GGUF:Q6_K_L
llama-server --jinja -fa -hf bartowski/Llama-3.3-70B-Instruct-GGUF:Q4_K_M

# Native support for DeepSeek R1 works best w/ our template override (official template is buggy, although we do work around it)

llama-server --jinja -fa -hf bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF:Q6_K_L \
    --chat-template-file models/templates/llama-cpp-deepseek-r1.jinja

llama-server --jinja -fa -hf bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF:Q4_K_M \
    --chat-template-file models/templates/llama-cpp-deepseek-r1.jinja

# Native support requires the right template for these GGUFs:

llama-server --jinja -fa -hf bartowski/functionary-small-v3.2-GGUF:Q4_K_M \
    --chat-template-file models/templates/meetkai-functionary-medium-v3.2.jinja

llama-server --jinja -fa -hf bartowski/Hermes-2-Pro-Llama-3-8B-GGUF:Q4_K_M \
    --chat-template-file models/templates/NousResearch-Hermes-2-Pro-Llama-3-8B-tool_use.jinja

llama-server --jinja -fa -hf bartowski/Hermes-3-Llama-3.1-8B-GGUF:Q4_K_M \
    --chat-template-file models/templates/NousResearch-Hermes-3-Llama-3.1-8B-tool_use.jinja

llama-server --jinja -fa -hf bartowski/firefunction-v2-GGUF -hff firefunction-v2-IQ1_M.gguf \
    --chat-template-file models/templates/fireworks-ai-llama-3-firefunction-v2.jinja

llama-server --jinja -fa -hf bartowski/c4ai-command-r7b-12-2024-GGUF:Q6_K_L \
    --chat-template-file models/templates/CohereForAI-c4ai-command-r7b-12-2024-tool_use.jinja

# Generic format support
llama-server --jinja -fa -hf bartowski/phi-4-GGUF:Q4_0
llama-server --jinja -fa -hf bartowski/gemma-2-2b-it-GGUF:Q8_0
llama-server --jinja -fa -hf bartowski/c4ai-command-r-v01-GGUF:Q2_K

To get the official template from original HuggingFace repos, you can use scripts/get_chat_template.py (see example invocations in models/templates/README.md).

Tip

If there is no official tool_use Jinja template, you may want to set --chat-template chatml to use a default that works with many models (YMMV!), or write your own (e.g. we provide a custom llama-cpp-deepseek-r1.jinja for DeepSeek R1 distills)

Caution

Beware of extreme KV quantizations (e.g. -ctk q4_0), they can substantially degrade the model's tool calling performance.

Test in CLI (or with any library / software that can use OpenAI-compatible API backends):

curl http://localhost:8080/v1/chat/completions -d '{
    "model": "gpt-3.5-turbo",
    "tools": [
        {
        "type":"function",
        "function":{
            "name":"python",
            "description":"Runs code in an ipython interpreter and returns the result of the execution after 60 seconds.",
            "parameters":{
            "type":"object",
            "properties":{
                "code":{
                "type":"string",
                "description":"The code to run in the ipython interpreter."
                }
            },
            "required":["code"]
            }
        }
        }
    ],
    "messages": [
        {
        "role": "user",
        "content": "Print a hello world message with python."
        }
    ]
}'
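The payload above can also be built programmatically. A minimal sketch, assuming only the OpenAI tool schema shown in the example (`make_tool` is a hypothetical helper, not a llama.cpp API; per the OpenAI spec, `description` and `parameters` are optional):

```python
import json

def make_tool(name, description=None, parameters=None):
    # Hypothetical helper: `description` and `parameters` are optional
    # per the OpenAI spec, so they are omitted when not provided.
    fn = {"name": name}
    if description is not None:
        fn["description"] = description
    if parameters is not None:
        fn["parameters"] = parameters
    return {"type": "function", "function": fn}

payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "Print a hello world message with python."}
    ],
    "tools": [make_tool(
        "python",
        description="Runs code in an ipython interpreter and returns "
                    "the result of the execution after 60 seconds.",
        parameters={
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "The code to run in the ipython interpreter.",
                },
            },
            "required": ["code"],
        },
    )],
}
body = json.dumps(payload)  # POST this to /v1/chat/completions
```

POSTing `body` with any HTTP client is equivalent to the curl invocation above.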


curl http://localhost:8080/v1/chat/completions -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a chatbot that uses tools/functions. Dont overthink things."},
        {"role": "user", "content": "What is the weather in Istanbul?"}
    ],
    "tools": [{
        "type":"function",
        "function":{
            "name":"get_current_weather",
            "description":"Get the current weather in a given location",
            "parameters":{
                "type":"object",
                "properties":{
                    "location":{
                        "type":"string",
                        "description":"The city and country/state, e.g. `San Francisco, CA`, or `Paris, France`"
                    }
                },
                "required":["location"]
            }
        }
    }]
}'
Output:
{
  "choices": [
    {
      "finish_reason": "tool",
      "index": 0,
      "message": {
        "content": null,
        "tool_calls": [
          {
            "name": "python",
            "arguments": "{\"code\":\" \\nprint(\\\"Hello, World!\\\")\"}"
          }
        ],
        "role": "assistant"
      }
    }
  ],
  "created": 1727287211,
  "model": "gpt-3.5-turbo",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 16,
    "prompt_tokens": 44,
    "total_tokens": 60
  },
  "id": "chatcmpl-Htbgh9feMmGM0LEH2hmQvwsCxq3c6Ni8"
}
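Note that each tool_calls entry carries its arguments as a JSON-encoded string, so clients must decode it a second time before use. A minimal parsing sketch over the output shown above:

```python
import json

# Trimmed response from the example output above.
response = {
    "choices": [{
        "finish_reason": "tool",
        "message": {
            "content": None,
            "role": "assistant",
            "tool_calls": [{
                "name": "python",
                "arguments": "{\"code\":\" \\nprint(\\\"Hello, World!\\\")\"}",
            }],
        },
    }],
}

call = response["choices"][0]["message"]["tool_calls"][0]
args = json.loads(call["arguments"])  # second decode step
print(call["name"], args["code"])
```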