tabbyAPI

mirror of https://github.com/theroyallab/tabbyAPI.git synced 2026-05-11 16:30:16 +00:00

Author	SHA1	Message	Date
turboderp	553c4e7cbb	Docker: Serve on 0.0.0.0 by default	2026-05-09 23:22:56 +02:00
Josh	09f36f9c05	fix: prevent xformers from pulling cu130 wheels on cu128 hosts (#420) The default `pip install .[cu12,extras]` lets pip resolve xformers transitively (via infinity-emb / sentence-transformers in the extras group), which can pull a cu130-aligned wheel that requires libcudart.so.13. On hosts with NVIDIA driver 590.x (cu128-only), this fails at import time with: `ImportError: libcudart.so.13: cannot open shared object file`. Reproduced on K3s clusters running 12 exllamav2/exllamav3 deployment pods × 6 hosts; all crash-looped on the published `:latest` image which had transitively resolved xformers to a cu130 wheel. Fix: split the install into two pip invocations. Install the cu12 group first to lock torch + cu128 wheels for exllamav2 / exllamav3 / flash_attn, then install the extras group with --no-deps so pip cannot resolve xformers (or any other transitive dep) outside the cu128 lock. Also align the Windows py3.12 flash_attn wheel version to v0.7.13 to match the other Windows variants (py3.10, py3.11, py3.13). The py3.12 variant was pinned to v0.7.6 while the rest were on v0.7.13, leaving py3.12 Windows users on an older flash_attn release with no semantic reason for the divergence. Tested on Hydra K3s cluster (NVIDIA 590.48.01-open + cu128 base image nvidia/cuda:12.8.1-runtime-ubuntu24.04 + torch 2.9.0+cu128). All 12 exllamav2/v3 deployments now import cleanly and serve /v1/models. Co-authored-by: Josh Jones <scoobydont-666@users.noreply.github.com>	2026-05-09 21:21:17 +02:00
kingbri	30a3cd75cf	Start: Migrate options from cu121/118 to cu12 This encapsulates more cuda versions and makes install easier for new users. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-08-19 22:56:58 -04:00
kingbri	755f98a338	Docker: Move to venv for running Newer versions of Python don't allow system package installation unless --break-system-packages is specified. I'd like to avoid this if possible. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-04-27 00:38:07 -04:00
kingbri	f70eb11db3	Docker: Use python 3.12 Ubuntu 24.04 ships with 3.12 by default. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-04-27 00:24:32 -04:00
kingbri	09ddfa8ffb	Docker: Update to Cuda 12.8 and Ubuntu 24.04 Use more modern versions of dependencies for the containerized image. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-04-26 21:29:36 -04:00
kingbri	101ebd658a	Docker: Add extras to dockerfile Adds support for all features when pulling the image Signed-off-by: kingbri <bdashore3@proton.me>	2024-11-15 18:16:48 -05:00
TerminalMan	4b11cabbec	debloat docker build	2024-09-08 00:02:00 +01:00
kingbri	98768bfa30	Docker: Re-add build block If a user wants to build from source, let them. But the default should fetch from the package registry. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-04 23:39:06 -04:00
TerminalMan	48d7674316	make docker-compose use prebuilt images - Docker compose uses the prebuilt images produced by the GitHub action added in `872eeed581`	2024-08-29 00:50:01 +01:00
TerminalMan	80198ca056	API: Add /v1/health endpoint (#178) * Add healthcheck - localhost only /healthcheck endpoint - cURL healthcheck in docker compose file * Update Healthcheck Response - change endpoint to /health - remove localhost restriction - add docstring * move healthcheck definition to top of the file - make the healthcheck show up first in the openAPI spec * Tree: Format	2024-08-27 21:37:41 -04:00
kingbri	ecaddec48a	Docker-compose: Add models to bind mounts At least one bind mount is required in the volumes YAML block otherwise the docker build fails. Models should be fine to default since it always exists. Signed-off-by: kingbri <bdashore3@proton.me>	2024-08-19 22:07:53 -04:00
Amgad Hasan	dae394050e	Improve docker deployment configuration (#163)	2024-08-18 15:19:18 -04:00
Amgad Hasan	2e5cf0ea3f	Fix docker compose volume mount	2024-07-12 13:23:58 +00:00
kingbri	65871ebc0c	Docker: Add var to pull on build When building the Docker container, try pulling from the github repository to get the latest commit. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-19 21:06:34 -04:00
kingbri	209f0370b4	Docker: Switch image and copy config Automatically create a config.yml on build. Also use the cuda runtime image which is much lighter than the previous cuda devel image. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-15 18:01:56 -04:00
PΔBLØ ᄃΞ	8a5a82baec	Update Dockerfile remove unnecessary apt install to just use one	2024-04-01 22:27:11 -05:00
PΔBLØ ᄃΞ	85271e2b7d	fix: Dockerfile work on pyproject.toml	2024-04-01 19:32:42 -05:00
Martin Honermeyer	4afb4137f7	Remove explicit pytorch & exllamav2 in Dockerfile These packages are already installed via requirements.txt.	2024-02-25 18:03:01 +01:00
kingbri	78f920eeda	Tree: Refactor code organization Move common functions into their own folder and refactor the backends to use their own folder as well. Also clean up imports and alphabetize import statements themselves. Finally, move colab and docker into their own folders as well. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00

20 Commits