TabbyAPI
Important
In addition to the README, please read the Wiki page for information about getting started!
Note
Need help? Join the Discord Server and get the Tabby role. Please be nice when asking questions.
A FastAPI-based application for generating text with an LLM (large language model), powered by the Exllamav2 backend
Disclaimer
This API is a rolling release: there may be bugs and breaking changes down the line. Please be aware that you may need to reinstall dependencies after updating.
Getting Started
Read the Wiki for more information. It contains user-facing documentation for installation, configuration, sampling, API usage, and more.
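Once the server is running, a quick way to verify your setup is to hit the OpenAI-compatible completion endpoint. The snippet below is a minimal sketch, not authoritative documentation: the default host/port (127.0.0.1:5000), the /v1/completions route, and the x-api-key header are assumptions; consult the Wiki for the exact values in your configuration.

```python
# Minimal sketch: query a running TabbyAPI instance via its
# OpenAI-compatible completions endpoint.
# Assumptions: default host/port (127.0.0.1:5000), the /v1/completions
# route, and an API key sent in the "x-api-key" header. Verify these
# against the Wiki for your version.
import requests

API_URL = "http://127.0.0.1:5000/v1/completions"  # assumed default
API_KEY = "your-api-key-here"  # placeholder; use your configured key

payload = {
    "prompt": "Once upon a time",
    "max_tokens": 64,
    "temperature": 0.8,
}

response = requests.post(
    API_URL,
    headers={"x-api-key": API_KEY},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["text"])
```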
Supported Model Types
TabbyAPI uses Exllamav2 as a powerful and fast backend for model loading and inference. Therefore, the following model types are supported (see the loading sketch after this list):
- Exl2 (Highly recommended)
- GPTQ
- FP16 (using Exllamav2's loader)
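Because models are loaded through the server, they can also be switched at runtime over HTTP. The following is an illustrative sketch only: the /v1/model/load route, the "name" payload field, and the x-admin-key header are assumptions, so check the Wiki for the authoritative endpoint and fields.

```python
# Illustrative sketch: ask a running TabbyAPI server to load a model.
# The route (/v1/model/load), payload field ("name"), and admin-key
# header ("x-admin-key") are assumptions; confirm them in the Wiki.
import requests

ADMIN_URL = "http://127.0.0.1:5000/v1/model/load"  # assumed route
ADMIN_KEY = "your-admin-key-here"  # placeholder admin key

resp = requests.post(
    ADMIN_URL,
    headers={"x-admin-key": ADMIN_KEY},
    json={"name": "MyModel-7B-exl2"},  # hypothetical model folder name
    timeout=300,  # model loads can take a while
)
resp.raise_for_status()
print(resp.json())
```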
Alternative Loaders/Backends
If you want to use a model type not listed above, here are some alternative backends with their own APIs:
- GGUF + GGML - KoboldCPP
- AWQ - Aphrodite Engine
Contributing
If you have issues with the project:
- Describe the issue in detail.
- If you have a feature request, please indicate it as such.
If you have a Pull Request:
- Describe the pull request in detail: what you are changing and why.
Developers and Permissions
Creators/Developers: