tabbyAPI

mirror of https://github.com/theroyallab/tabbyAPI.git synced 2026-03-15 00:07:28 +00:00

Author	SHA1	Message	Date
kingbri	4cc0b59bdc	Requirements: Add sse-starlette Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-10 19:41:08 -04:00
kingbri	2025a1c857	Requirements: Unpin uvicorn. v0.28.0 works now and the underlying errors were fixed. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-10 17:48:43 -04:00
kingbri	e33971859b	Requirements: Pin uvicorn. Pin uvicorn due to issues with request disconnection in the latest version. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-10 01:23:36 -05:00
kingbri	228c227c1e	Logging: Switch to loguru. Loguru is a flexible logger that allows for easier hooking and imports into Rich with no problems. Also makes progress bars stick to the bottom of the terminal window. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-08 01:00:48 -05:00
kingbri	fe0ff240e7	Progress: Switch to Rich. Rich is a more mature library for displaying progress bars, logging, and console output. This should help properly align progress bars within the terminal. Side note: "We're Rich!" Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-08 01:00:48 -05:00
kingbri	39617adb65	Requirements: Update Exllamav2 v0.0.15 Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-06 22:29:55 -05:00
kingbri	ccd41d720d	Requirements: Bump ExllamaV2 v0.0.14 Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-24 12:26:08 -05:00
kingbri	ea00a6bd45	Requirements: Update Exllamav2. Update to v0.0.13.post2 Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-14 21:51:25 -05:00
kingbri	321c9a1ea9	Requirements: Fix FA2 version number. The URL wasn't edited correctly Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-07 21:37:30 -05:00
kingbri	d0027bce32	Requirements: Update flash attention 2 for Windows. Version 2.5.2 Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-07 20:44:23 -05:00
kingbri	543a9b68c8	Requirements: Update Exllamav2 to 0.0.13.post1 Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-04 21:25:57 -05:00
kingbri	6eeb62b82c	Requirements: Update exllamav2, torch, and FA2. Torch to 2.2, exllamav2 to 0.0.13, FA2 to 2.4.2 on Windows and 2.5.2 on Linux. Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-02 23:53:42 -05:00
kingbri	3605067898	Requirements: Don't use torch 2.2. Pytorch released 2.2 without letting the community know first. Pin the torch version to 2.1.2 until exllamav2 builds for torch 2.2 Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-29 23:30:10 -05:00
kingbri	ee99349a78	Requirements: Bump exllamav2 0.0.12 Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-22 21:13:31 -05:00
kingbri	162c13752a	Requirements: Update to Flash Attention 2.4.1 Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 14:40:08 -05:00
AlpinDale	6a5bbd217c	feat: logging (#39) * add logging * simplify the logger * formatting * final touches * fix format * Model: Add log to metrics Signed-off-by: kingbri <bdashore3@proton.me> --------- Authored-by: AlpinDale <52078762+AlpinDale@users.noreply.github.com>	2023-12-23 04:33:31 +00:00
kingbri	da69ad8cd3	Requirements: Pin versions for some dependencies. Pydantic and Jinja2 need pinned versions. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-19 21:48:04 -05:00
kingbri	51ca1ff396	Tree: Switch to Pydantic 2. Pydantic 2 has more modern methods and stability compared to Pydantic 1 Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-18 23:53:47 -05:00
kingbri	f631dd6ff7	Templates: Switch to Jinja2. Jinja2 is a lightweight template parser that's used in Transformers for parsing chat completions. It's much more efficient than Fastchat and can be imported as part of requirements. Also allows for unblocking Pydantic's version. Users now have to provide their own template if needed. A separate repo may be usable for common prompt template storage. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-18 23:53:47 -05:00
kingbri	f196f1177d	Requirements: Update exllamav2 to 0.0.11 Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-16 19:33:42 -05:00
kingbri	47176a2a1e	Requirements: Fix torch install. Use --extra-index-url to install pytorch. This should be secure enough since dependency confusion attacks aren't possible with just installing the torch package. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-06 19:04:35 -05:00
kingbri	b83e1b704e	Requirements: Split for configurations. Add self-contained requirements for cuda 11.8 and ROCm Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-06 00:00:30 -05:00
kingbri	621e11b940	Update documentation Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-05 00:33:43 -05:00
kingbri	e740b53478	Requirements: Update Flash Attention 2. Bump to 2.3.6 Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-03 01:56:29 -05:00
kingbri	ae69b18583	API: Use FastAPI streaming instead of sse_starlette. sse_starlette kept firing a ping response if it was taking too long to set an event. Rather than using a hacky workaround, switch to FastAPI's inbuilt streaming response and construct SSE requests with a utility function. This helps the API become more robust and removes an extra requirement. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-01 01:54:35 -05:00
kingbri	d25310e55d	Requirements: Update Flash Attention 2. Use 2.3.4 from tgw. However, keep the 2.3.3 wheels in requirements if the newer wheels don't work for now. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-21 22:12:55 -05:00
kingbri	a51889bdb8	Requirements: Update Flash Attention. Bump to version 2.3.3. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-18 22:28:24 -05:00
Splice86	feef782dbf	Update requirements.txt to include uvicorn	2023-11-16 22:50:27 +00:00
kingbri	b20e71dcd4	Requirements: Add Flash Attention 2 wheels. Update to 2.3.3 at some point. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 17:25:00 -05:00
kingbri	03f45cb0a3	Tree: Update documentation and configs Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 02:30:33 -05:00
kingbri	2248705c4a	Requirements: Don't force fastchat installation. Fastchat requires a lot of dependencies such as transformers, peft, and accelerate which are heavy. This is not useful unless a user wants to add a shim for the chat completion endpoint. Instead, try importing fastchat and notify the console of the error. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 01:26:46 -05:00
kingbri	1f444c8fb7	Requirements: Add fastchat and override pydantic. Use an older version of pydantic to stay compatible Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 01:00:08 -05:00
kingbri	eee8b642bd	OAI: Implement completion API endpoint. Add support for /v1/completions with the option to use streaming if needed. Also rewrite API endpoints to use async when possible since that improves request performance. Model container parameter names also needed rewrites as well and set fallback cases to their disabled values. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-13 18:31:26 -05:00
kingbri	a10c14d357	Config: Switch to YAML and add load progress. YAML is a more flexible format when it comes to configuration. Commandline arguments are difficult to remember and configure especially for an API with complicated commandline names. Rather than using half-baked textfiles, implement a proper config solution. Also add a progress bar when loading models in the commandline. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-12 00:21:16 -05:00
david	b967e2e604	Initial	2023-11-09 21:27:45 -06:00

35 Commits