tabbyAPI

mirror of https://github.com/theroyallab/tabbyAPI.git synced 2026-05-11 16:30:16 +00:00

Author	SHA1	Message	Date
kingbri	f47919b1d3	API: Add draft model support Models can be loaded with a child object called "draft" in the POST request. Again, models need to be located within the draft model dir to get loaded. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-19 00:32:25 -05:00
kingbri	d627d14385	API: Fix exceptions and defaults Stop conditions was None, causing model to error out when trying to add the EOS token to a None value. Authentication failed when Bearer contained an empty string. To fix this, add a condition which checks array length. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-17 17:56:05 -05:00
kingbri	282b5b2931	API: Fix responses and some params Responses were not being properly sent as JSON. Only run pydantic's JSON function on stream responses. FastAPI does the rest with static responses. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 17:11:55 -05:00
kingbri	60eb076b43	Tree: Basic formatting and comments Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 11:48:40 -05:00
kingbri	2248705c4a	Requirements: Don't force fastchat installation Fastchat requires a lot of dependencies such as transformers, peft, and accelerate which are heavy. This is not useful unless a user wants to add a shim for the chat completion endpoint. Instead, try importing fastchat and notify the console of the error. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 01:26:46 -05:00
kingbri	5e8419ec0c	OAI: Add chat completions endpoint Chat completions is the endpoint that will be used by OAI in the future. Makes sense to support it even though the completions endpoint will be used more often. Also unify common parameters between the chat completion and completion requests since they're very similar. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 01:06:07 -05:00
kingbri	d0b6b11068	OAI: Make freq and presence pen floats Also rename the completions typing file. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	126afdfdc2	Model: Fix gpu split params GPU split auto is a bool and GPU split is an array of integers for GBs to allocate per GPU. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	ea91d17a11	Api: Add ban_eos_token and add_bos_token support Adds the ability for the client to specify whether to add the BOS token and ban the EOS token. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	8fea5391a8	Api: Add token endpoints Support for encoding and decoding with various parameters. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	4670a77c26	API: Don't use response_class This arg in routes caused many errors and isn't even needed for responses. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-14 22:09:26 -05:00
kingbri	b625bface9	OAI: Add API-based model loading/unloading and auth routes Models can be loaded and unloaded via the API. Also add authentication to use the API and for administrator tasks. Both types of authorization use different keys. Also fix the unload function to properly free all used vram. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-14 01:17:19 -05:00
kingbri	47343e2f1a	OAI: Add models support The models endpoint fetches all the models that OAI has to offer. However, since this is an OAI clone, just list the models inside the user's configured model directory instead. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-13 21:38:34 -05:00
kingbri	eee8b642bd	OAI: Implement completion API endpoint Add support for /v1/completions with the option to use streaming if needed. Also rewrite API endpoints to use async when possible since that improves request performance. Model container parameter names also needed rewrites, and fallback cases are now set to their disabled values. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-13 18:31:26 -05:00

14 Commits
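
Commit d627d14385 describes two guards: defaulting stop conditions so the EOS token is never added to a None value, and checking the length of the split Bearer header before reading the token. The sketch below illustrates both ideas in isolation. It is not taken from the tabbyAPI source; the function names `extract_bearer_token` and `with_eos`, and their parameters, are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Union

Stop = Union[str, int]


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from a 'Bearer <token>' header value, or None.

    Hypothetical helper: splits on whitespace and checks the number of
    parts before indexing, so 'Bearer' or 'Bearer ' yields None instead
    of raising or returning an empty token.
    """
    if not authorization:
        return None
    parts = authorization.split()
    # Length check: exactly a scheme and a non-empty token are expected.
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    return parts[1]


def with_eos(stop_conditions: Optional[Sequence[Stop]], eos: Stop) -> List[Stop]:
    """Return the stop conditions plus the EOS marker.

    Hypothetical helper: a missing (None) list falls back to an empty
    list before the EOS marker is appended.
    """
    stops = list(stop_conditions) if stop_conditions is not None else []
    stops.append(eos)
    return stops
```

Both helpers return a safe value rather than raising, which leaves the caller to decide how to respond (for example, rejecting the request when no token is found).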