tabbyAPI

mirror of https://github.com/theroyallab/tabbyAPI.git synced 2026-05-11 16:30:16 +00:00

turboderp 179479199b Rework tool calls and OAI chat completions

- move tool config from template_vars to separate yml config
- new per-gen stream collector used for both streaming and non-streaming requests to ensure logic is consistent for both
- move responsibility for switching between phases to stream collector
- collect tool calls during streaming and parse at the end of each gen
- prevent streaming empty content spans (be nice to clients)
- correctly aggregate usage stats for n>1 requests; always emit them with the last chunk of the last gen to finish
- collect logprobs in model wrapper and correctly handle logprobs for multi-token chars etc.
- respect top_logprobs argument in request
- handle several edge cases, such as a <think> tag being part of the held string
- retain tool parsing and inference-abort fixes from #413; apply a similar fix to non-streaming requests as well
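
The held-string edge case and the "no empty content spans" rule above can be illustrated with a minimal sketch. This is not the code from this commit: `split_held` and `PhaseCollector` are hypothetical names, and the `<think>`/`</think>` tag pair and the two phase labels are assumptions made for the example. The idea is that a chunk ending in a partial tag (for example `<th`) is held back until the next chunk shows whether it completes the tag, and spans are only emitted when they contain text.

```python
def split_held(buffer: str, tag: str) -> tuple[str, str]:
    """Split buffer into (emit, held).

    held is the longest suffix of buffer that is a proper prefix of tag,
    i.e. text that might still turn into the tag once more chunks arrive.
    """
    for k in range(min(len(tag) - 1, len(buffer)), 0, -1):
        if buffer.endswith(tag[:k]):
            return buffer[:-k], buffer[-k:]
    return buffer, ""


class PhaseCollector:
    """Hypothetical collector that routes streamed text into phases."""

    def __init__(self, open_tag: str = "<think>", close_tag: str = "</think>"):
        self.open_tag = open_tag      # assumed tag that starts the reasoning phase
        self.close_tag = close_tag    # assumed tag that ends it
        self.phase = "content"
        self.held = ""

    def feed(self, chunk: str) -> list[tuple[str, str]]:
        """Consume one streamed chunk, return non-empty (phase, text) spans."""
        spans = []
        buf = self.held + chunk
        self.held = ""
        while buf:
            tag = self.open_tag if self.phase == "content" else self.close_tag
            idx = buf.find(tag)
            if idx >= 0:
                # Full tag found: emit the text before it (if any), switch phase.
                if idx > 0:
                    spans.append((self.phase, buf[:idx]))
                buf = buf[idx + len(tag):]
                self.phase = "reasoning" if self.phase == "content" else "content"
                continue
            # No full tag: emit what is safe, hold back a possible partial tag.
            emit, self.held = split_held(buf, tag)
            if emit:
                spans.append((self.phase, emit))
            break
        return spans

    def flush(self) -> list[tuple[str, str]]:
        """At end of generation, release any held text as ordinary output."""
        spans = [(self.phase, self.held)] if self.held else []
        self.held = ""
        return spans
```

Feeding the chunks `"Hi <th"`, `"ink>plan"`, `"</think>"` and `"done"` in order yields one content span, one reasoning span, nothing at all for the bare closing tag, and a final content span, so a client never receives an empty delta.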

Still TODO:
- testing and validation with more models and tool schemas (tested on Qwen so far)
- enable JSON constraint for JSON tool models
- possibly some pydantification
- documentation

2026-03-30 00:22:55 +02:00

alpaca.jinja                    Templates: Remove whitespace from metadata    2024-09-08 12:36:36 -04:00
chatml.jinja                    Templates: Remove whitespace from metadata    2024-09-08 12:36:36 -04:00
place_your_templates_here.txt   Templates: Update folder                      2023-12-18 23:53:47 -05:00