mirror of https://github.com/theroyallab/tabbyAPI.git synced 2026-07-14 11:07:10 +00:00

kingbri b0c295dd2f API: Add more methods to semaphore

The semaphore/queue model for Tabby is as follows:
- Any load requests go through the semaphore by default
- Any load request can include the skip_queue parameter to bypass the semaphore
- Any unload requests are immediately executed
- All completion requests are placed inside the semaphore by default

This model preserves the parallelism of single-user mode with extra
convenience methods for queues in multi-user. It also helps mitigate
problems that were previously present in the concurrency stack.
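
The queue model described above can be sketched with `asyncio.Semaphore`. This is only an illustration under stated assumptions, not the code from this commit: the `RequestQueue` class, its method names, and the `log` list are hypothetical; only the `skip_queue` parameter name comes from the commit message.

```python
import asyncio


class RequestQueue:
    """Hypothetical illustration of the semaphore/queue model (not TabbyAPI's code)."""

    def __init__(self, max_concurrent: int = 1):
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.log = []  # records how each request was routed, for demonstration

    async def _run(self, name, work, skip_queue=False):
        if skip_queue:
            # Bypass the semaphore entirely.
            self.log.append(f"{name}:bypass")
            return await work()
        async with self._semaphore:
            # Wait for a free slot, then run.
            self.log.append(f"{name}:queued")
            return await work()

    async def load(self, work, skip_queue=False):
        # Load requests go through the semaphore unless skip_queue is set.
        return await self._run("load", work, skip_queue)

    async def unload(self, work):
        # Unload requests are executed immediately, never queued.
        self.log.append("unload:immediate")
        return await work()

    async def completion(self, work):
        # Completion requests are placed inside the semaphore by default.
        return await self._run("completion", work)


async def demo_queue():
    queue = RequestQueue(max_concurrent=1)

    async def fake_work():
        await asyncio.sleep(0.01)  # stand-in for real model work
        return "done"

    results = await asyncio.gather(
        queue.completion(fake_work),
        queue.load(fake_work),
        queue.load(fake_work, skip_queue=True),
        queue.unload(fake_work),
    )
    return queue, results
```

With a single slot, the queued load waits for the completion to finish, while the `skip_queue` load and the unload start right away.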

Also change how the program's loop runs so it exits when the API thread
dies.
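
One common way to get this behavior is for the main thread to watch the API thread and return once it stops. The sketch below assumes that approach; `run_until_thread_exits` and `fake_api_server` are hypothetical names, and the actual change in `main.py` may be implemented differently.

```python
import threading
import time


def run_until_thread_exits(worker: threading.Thread, poll_interval: float = 0.05) -> None:
    """Block the calling thread until `worker` is no longer alive."""
    while worker.is_alive():
        # join() with a timeout returns early if the thread finishes,
        # and otherwise lets the loop re-check periodically.
        worker.join(timeout=poll_interval)


def fake_api_server(stop_after: float) -> None:
    # Stand-in for an API server thread that eventually dies.
    time.sleep(stop_after)


def demo_loop() -> bool:
    api_thread = threading.Thread(target=fake_api_server, args=(0.1,), daemon=True)
    api_thread.start()
    run_until_thread_exits(api_thread)
    # Reaching this point means the main loop ended because the thread stopped.
    return api_thread.is_alive()
```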

Signed-off-by: kingbri <bdashore3@proton.me>

2024-03-04 23:21:40 -05:00

.github

Create pull request template

2024-02-09 14:53:29 -05:00

backends/exllamav2

API: Add more methods to semaphore

2024-03-04 23:21:40 -05:00

colab

Tree: Refactor code organization

2024-01-25 00:15:40 -05:00

common

API: Add more methods to semaphore

2024-03-04 23:21:40 -05:00

docker

Tree: Refactor code organization

2024-01-25 00:15:40 -05:00

loras

Implement lora support (#24)

2023-12-08 23:38:08 -05:00

models

Tree: Update documentation and configs

2023-11-16 02:30:33 -05:00

OAI

API: Add more methods to semaphore

2024-03-04 23:21:40 -05:00

sampler_overrides

Neutralize samplers (#59)

2024-02-08 00:23:09 -05:00

templates

Templates: Update folder

2023-12-18 23:53:47 -05:00

tests

Tree: Refactor code organization

2024-01-25 00:15:40 -05:00

.gitignore

Tree: Unify sampler parameters and add override support

2024-01-25 00:15:40 -05:00

.ruff.toml

feat: workflows for formatting/linting (#35)

2023-12-22 16:20:35 +00:00

config_sample.yml

Additional clarification for override_base_seq_len

2024-03-02 09:29:50 -08:00

formatting.bat

feat: workflows for formatting/linting (#35)

2023-12-22 16:20:35 +00:00

formatting.sh

feat: workflows for formatting/linting (#35)

2023-12-22 16:20:35 +00:00

LICENSE

Create LICENSE

2023-11-16 17:43:23 -05:00

main.py

API: Add more methods to semaphore

2024-03-04 23:21:40 -05:00

README.md

Update README

2024-02-20 00:19:31 -05:00

requirements-amd.txt

Requirements: Bump ExllamaV2

2024-02-24 12:26:08 -05:00

requirements-cu118.txt

Requirements: Bump ExllamaV2

2024-02-24 12:26:08 -05:00

requirements-dev.txt

feat: workflows for formatting/linting (#35)

2023-12-22 16:20:35 +00:00

requirements-nowheel.txt

feat: logging (#39)

2023-12-23 04:33:31 +00:00

requirements.txt

Requirements: Bump ExllamaV2

2024-02-24 12:26:08 -05:00

start.bat

Tree: Format and cleanup start

2023-12-27 01:17:31 -05:00

start.py

Tree: Refactor code organization

2024-01-25 00:15:40 -05:00

start.sh

Start: Add shell script

2023-12-27 23:53:14 -05:00

README.md

TabbyAPI

Important

In addition to the README, please read the Wiki page for information about getting started!

Note

Need help? Join the Discord Server and get the Tabby role. Please be nice when asking questions.

A FastAPI-based application for generating text with an LLM (large language model) using the Exllamav2 backend.

Disclaimer

This API is considered a rolling release. There may be bugs and changes down the line, and you might need to reinstall dependencies.

Getting Started

Read the Wiki for more information. It contains user-facing documentation for installation, configuration, sampling, API usage, and so much more.

Supported Model Types

TabbyAPI uses Exllamav2 as a powerful and fast backend for model inference, loading, etc. Therefore, the following types of models are supported:

Exl2 (Highly recommended)
GPTQ
FP16 (using Exllamav2's loader)

Alternative Loaders/Backends

If you want to use a different model type than the ones listed above, here are some alternative backends with their own APIs:

Contributing

If you have issues with the project:

Describe the issues in detail
If you have a feature request, please indicate it as such.

If you have a Pull Request:

Describe the pull request in detail: what you are changing and why.

Developers and Permissions

Creators/Developers:
