tabbyAPI

mirror of https://github.com/theroyallab/tabbyAPI.git synced 2026-03-14 15:57:27 +00:00

Files

kingbri 2096c9bad2 Model: Default max_seq_len to 4096

A common problem in TabbyAPI is that users who want to get up and
running with a model often hit OOMs caused by max_seq_len.
This is because model devs set max context values in the millions, which
require a lot of VRAM.

To idiot-proof first-time setup, make the fallback default 4096 so
users can run their models. A user who still wants to use the model's
max_seq_len can set it to -1.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>

2025-06-13 14:57:24 -04:00
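
The fallback rule described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in model.py: the function name resolve_max_seq_len, the constant names, and the treatment of an unset value as None are all assumptions made for the example. Only the 4096 fallback and the -1 sentinel come from the commit message.

```python
from typing import Optional

# Values taken from the commit message; the constant names are hypothetical.
FALLBACK_MAX_SEQ_LEN = 4096  # used when the user sets nothing
USE_MODEL_MAX = -1           # sentinel: use the model's own maximum


def resolve_max_seq_len(configured: Optional[int], model_max: int) -> int:
    """Pick the sequence length to load a model with (hypothetical helper).

    configured: the user's max_seq_len setting, or None if unset (assumption).
    model_max:  the maximum context length declared by the model.
    """
    if configured is None:
        # No user setting: fall back to a small, VRAM-friendly default.
        return FALLBACK_MAX_SEQ_LEN
    if configured == USE_MODEL_MAX:
        # Explicit opt-in to the model's full (possibly huge) context.
        return model_max
    if configured <= 0:
        raise ValueError(f"invalid max_seq_len: {configured}")
    return configured


print(resolve_max_seq_len(None, 1_000_000))
print(resolve_max_seq_len(-1, 1_000_000))
print(resolve_max_seq_len(8192, 1_000_000))
```

This sketch does not clamp the fallback to the model's own limit; whether the real implementation does so is not stated in the commit message.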

grammar.py    Tree: Format                                       2025-05-17 00:46:40 -04:00
model.py      Model: Default max_seq_len to 4096                 2025-06-13 14:57:24 -04:00
utils.py      Exl3: Add chunk size, cache size, and model info   2025-05-02 21:33:25 -04:00
vision.py     Dependencies: Fix OpenAPI generation               2024-11-22 17:59:20 -05:00