tabbyAPI

mirror of https://github.com/theroyallab/tabbyAPI.git synced 2026-05-12 08:46:40 +00:00

Author	SHA1	Message	Date
Veden	f960fac8ff	Fix incorrect ratio calculation for draft model	2023-11-19 13:12:53 -08:00
kingbri	4cddd0400c	Model: Fix draft model loading Use draft_config to find the path instead of kwargs. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-19 02:04:02 -05:00
kingbri	698b0b1976	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-19 01:19:31 -05:00
kingbri	581e1fc219	Sample config: Remove unused value Draft models are specified in the draft sublock. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-19 01:16:03 -05:00
kingbri	e0e93c103b	Sample config: Uncomment all parameters This helps clarify things when users are configuring for the first time. For example, some users were putting the model name in the "model" block instead of the "model_name" field. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-19 01:12:07 -05:00
kingbri	63762654f0	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-19 01:05:49 -05:00
Brian Dashore	e46676cb08	Merge pull request #9 from city-unit/main Add basic docker support	2023-11-19 00:53:24 -05:00
kingbri	e4a8848445	Auth: Log API and admin key on startup Helpful for users who run headless or use Docker. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-19 00:52:39 -05:00
kingbri	31bc418795	Model: Add context in response output When printing to the console, give information about the context (ingestion token count). Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-19 00:49:32 -05:00
city_unit	80c69939ae	Remove unneeded stuffs	2023-11-19 00:34:54 -05:00
kingbri	f47919b1d3	API: Add draft model support Models can be loaded with a child object called "draft" in the POST request. Again, models need to be located within the draft model dir to get loaded. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-19 00:32:25 -05:00
city_unit	6b22dc0119	Rename, fschat support	2023-11-19 00:32:14 -05:00
city_unit	99cf0b6d7b	Add basic docker support	2023-11-19 00:01:17 -05:00
kingbri	6b9af58cc1	Tree: Fix extraneous bugs and update T/s print Model: Add extra information to print and fix the divide by zero error. Auth: Fix validation of API and admin keys to look for the entire key. References #7 and #6 Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-18 22:34:40 -05:00
kingbri	a51889bdb8	Requirements: Update Flash Attention Bump to version 2.3.3. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-18 22:28:24 -05:00
Brian Dashore	b2410a0436	Merge pull request #4 from waldfee/config_samples Adds draft model support to config.yml	2023-11-18 13:16:23 -05:00
kingbri	27ebec3b35	Model: Add speculative decoding support via config Speculative decoding makes use of draft models that ingest the prompt before forwarding it to the main model. Add options in the config to support this. API options will occur in a different commit. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-18 01:42:20 -05:00
kingbri	2ad79cb9ea	Model: Add tokens in responses Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-17 23:33:48 -05:00
kingbri	7f18ea1d7c	Tree: Remove SillyTavern shim docs Support has been added in SillyTavern's staging branch. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-17 22:03:46 -05:00
kingbri	6f2078cbe4	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-17 22:02:21 -05:00
kingbri	d627d14385	API: Fix exceptions and defaults Stop conditions was None, causing model to error out when trying to add the EOS token to a None value. Authentication failed when Bearer contained an empty string. To fix this, add a condition which checks array length. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-17 17:56:05 -05:00
waldfee	78a6587b95	add cache_mode and draft_model_dir to config_sample.yml	2023-11-17 22:08:31 +01:00
kingbri	4669e49ff0	API: Fix errors with token endpoint Handle None cases if the provided text/token lists are empty. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-17 01:39:06 -05:00
kingbri	9dfa580b1e	Model: Add tokens/second output Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-17 01:16:20 -05:00
kingbri	021981fce0	API: Re-add depends endpoints Mistakenly removed API key authentication for the models endpoints in testing. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-17 00:50:42 -05:00
kingbri	ac4e9c2277	API: Add CORS support Tell CORS to go fly a kite. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 22:19:47 -05:00
kingbri	08a183540b	Config: Add warning on exceptions and clarify parameters Due to how YAML works, double quotes are bad. Specify a linter in the top of the config_sample file. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 22:19:47 -05:00
Splice86	feef782dbf	Update requirements.txt to include uvicorn	2023-11-16 22:50:27 +00:00
Brian Dashore	d5374c2c1f	Create LICENSE Use AGPLv3 for this project Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 17:43:23 -05:00
kingbri	2cf93c092b	Add SillyTavern instructions Temporary until proper support is added in. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 17:33:23 -05:00
kingbri	b20e71dcd4	Requirements: Add Flash Attention 2 wheels Update to 2.3.3 at some point. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 17:25:00 -05:00
kingbri	d5551352bf	Model: Fix parsing of stop conditions Add the EOS token into stop strings after checking kwargs. If ban_eos_token is on, don't add the EOS token in for extra measure. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 17:15:33 -05:00
kingbri	282b5b2931	API: Fix responses and some params Responses were not being properly sent as JSON. Only run pydantic's JSON function on stream responses. FastAPI does the rest with static responses. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 17:11:55 -05:00
kingbri	d8d61fa19b	API: Add fallback if model isn't loaded Most endpoints require the model to be loaded, so add a depends. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 12:20:35 -05:00
kingbri	c0525c042e	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 12:06:37 -05:00
kingbri	60eb076b43	Tree: Basic formatting and comments Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 11:48:40 -05:00
kingbri	5defb1b0b4	Config: Fix errors when stuff doesn't exist Add safe fallbacks if any part of the config tree doesn't exist. This prevents random internal server errors from showing up. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 11:41:03 -05:00
kingbri	03f45cb0a3	Tree: Update documentation and configs Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 02:30:33 -05:00
kingbri	2248705c4a	Requirements: Don't force fastchat installation Fastchat requires a lot of dependencies such as transformers, peft, and accelerate which are heavy. This is not useful unless a user wants to add a shim for the chat completion endpoint. Instead, try importing fastchat and notify the console of the error. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 01:26:46 -05:00
kingbri	5e8419ec0c	OAI: Add chat completions endpoint Chat completions is the endpoint that will be used by OAI in the future. Makes sense to support it even though the completions endpoint will be used more often. Also unify common parameters between the chat completion and completion requests since they're very similar. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-16 01:06:07 -05:00
kingbri	593471a04d	Auth: Fix init from YAML dict A class can't have multiple constructors. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 23:00:12 -05:00
kingbri	1f444c8fb7	Requirements: Add fastchat and override pydantic Use an older version of pydantic to stay compatible Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 01:00:08 -05:00
kingbri	bbb59d0747	Auth: Fix methods for writing and validation These were not working properly. Make the YAML file get written to properly and the validator to return a 401 when the bearer token is invalid. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	cb8da7f092	Chore: Remove mistakenly committed file Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	d0b6b11068	OAI: Make freq and presence pen floats Also rename the completions typing file. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	126afdfdc2	Model: Fix gpu split params GPU split auto is a bool and GPU split is an array of integers for GBs to allocate per GPU. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	ea91d17a11	Api: Add ban_eos_token and add_bos_token support Adds the ability for the client to specify whether to add the BOS token and ban the EOS token. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	8fea5391a8	Api: Add token endpoints Support for encoding and decoding with various parameters. Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
kingbri	2d741653c3	Update .gitignore Signed-off-by: kingbri <bdashore3@proton.me>	2023-11-15 00:55:15 -05:00
Splice86	fc14046318	Updated readme	2023-11-14 21:17:03 -06:00

68 Commits