tabbyAPI

mirror of https://github.com/theroyallab/tabbyAPI.git synced 2026-03-15 00:07:28 +00:00

Author	SHA1	Message	Date
kingbri	3605067898	Requirements: Don't use torch 2.2 Pytorch released 2.2 without letting the community know first. Pin the torch version to 2.1.2 until exllamav2 builds for torch 2.2 Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-29 23:30:10 -05:00
kingbri	751627e571	OAI: Add fasttensors to model load endpoint Also fix logging when loading prompt templates. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 01:08:02 -05:00
kingbri	fc4570220c	API + Model: Add new parameters and clean up documentation The example JSON fields were changed because of the new sampler default strategy. Fix these by manually changing the values. Also add support for fasttensors and expose generate_window to the API. It's recommended to not adjust generate_window as it's dynamically scaled based on max_seq_len by default. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00
kingbri	90fb41a77a	Model: Fix prompt template initialization The previous commit iterated through multiple try conditions which made it so the user has to provide a dummy prompt template. Now, template loading is fallback based. Run through a loop of functions and return if one of them succeeds. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00
kingbri	740b0215dd	Model: Dynamically scale generate_window Allows for adjustment of reservation space at the end of the context before rolling it. This should be scaled as a model's max_seq_len goes up. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00
kingbri	b14c5443fd	API: Add sampler override switching Allow users to switch the currently overridden samplers via the API so a restart isn't required to switch the overrides. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00
kingbri	de0ba7214c	API: Add template switching and unload endpoints Templates can be switched and unloaded without reloading the entire model. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00
kingbri	6c30f24c83	Tree: Unify sampler parameters and add override support Unify API sampler params into a superclass which should make them easier to manage and inherit generic functions from. Not all frontends expose all sampling parameters due to connections with OAI (that handles sampling themselves with the exception of a few sliders). Add the ability for the user to customize fallback parameters from server-side. In addition, parameters can be forced to a certain value server-side in case the repo automatically sets other sampler values in the background that the user doesn't want. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00
kingbri	78f920eeda	Tree: Refactor code organization Move common functions into their own folder and refactor the backends to use their own folder as well. Also clean up imports and alphabetize import statements themselves. Finally, move colab and docker into their own folders as well. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00
kingbri	ee99349a78	Requirements: Bump exllamav2 0.0.12 Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-22 21:13:31 -05:00
kingbri	902e841c39	Main: Add logging for API routes Helps users get started with accessing the docs. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-10 23:50:11 -05:00
kingbri	7a29664f06	API: Add alias names to field descriptions Helps with understanding API aliases. These aliases should not be used but are helpful for developers who want frontend compat. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-08 23:00:33 -05:00
Brian Dashore	1dbebd48eb	Merge pull request #50 from djmaze/patch-1 Remove fschat from compose yaml	2024-01-06 00:10:20 -05:00
Martin Honermeyer	6ab02e1eeb	Remove fschat from compose yaml fschat has been removed from the Dockerfile a while ago.	2024-01-06 02:18:26 +01:00
kingbri	81b504e8c5	OAI: Fix typical alias AliasChoices takes strings, not an array. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-05 16:38:39 -05:00
kingbri	2c57dafc59	OAI: Add alias for typical sampling Typical can also be called typical_p Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-05 15:29:53 -05:00
kingbri	d4ed9f703d	Tree: Format Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-04 21:13:30 -05:00
kingbri	c1642076c2	API: Switch unload method to POST GET and POST can be used interchangeably in this case, but adhere to the HTTP spec. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-04 21:11:36 -05:00
kingbri	cd4bf99598	OAI: Fix autodoc examples for model loading Some values weren't defaulting to correct values. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-04 20:53:56 -05:00
kingbri	ceb388e8a0	Start: Override ROCm env variables These are used for supporting GPUs that are not on the "officially supported list". Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 21:01:18 -05:00
Brian Dashore	c980f35e1b	Merge pull request #47 from Baysul/patch-1 Only try to install one of the EXLv2 wheels	2024-01-02 20:58:59 -05:00
Basil	2460b2f8ef	Only try to install one of the EXLv2 wheels ...depending on Python version.	2024-01-02 16:56:39 -08:00
kingbri	451042aadf	Main: Don't load if model_name/loras is blank Previously, if model_name was commented out, a load would not occur. Add the case if model_name or loras is blank which returns None when parsing the YAML. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 13:56:25 -05:00
kingbri	6b04463051	API: Fix CFG reporting The model endpoint wasn't reporting if CFG is on. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 13:54:16 -05:00
kingbri	bbd4ee54ca	Model: Add fallback if negative prompt is empty Fallback to the BOS token since an empty string won't do anything. Ideally, an empty negative prompt should not be used, but it's not the end of the world. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 01:46:51 -05:00
kingbri	b378773d0a	Model: Add CFG support CFG, or classifier-free guidance helps push a model in different directions based on what the user provides. Currently, CFG is ignored if the negative prompt is blank (it shouldn't be used in that way anyways). Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 01:46:51 -05:00
kingbri	bb7a8e4614	Config: Add override argparser Add an argparser that casts over to dictionaries of subgroups to integrate with the config. This argparser doesn't contain everything in the config due to complexity issues with CLI args, but will eventually progress to parity. In addition, it's used to override the config.yml rather than replace it. A config arg is also provided if the user wants to fully override the config yaml with another file path. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-01 14:27:12 -05:00
kingbri	7176fa66f0	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 11:25:18 -05:00
kingbri	979a9d28a3	Tree: Format Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 11:22:18 -05:00
kingbri	528d20ca5b	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 11:21:13 -05:00
kingbri	72bc30343c	Model: Fix frequency penalty fallback The appropriate branches weren't firing when frequency penalty is 0.0. Also fix repetition penalty overriding. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 11:21:07 -05:00
kingbri	47744fe9f7	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 01:48:10 -05:00
kingbri	0dc12d82d5	Model: Add fallback for freq and presence pen Previous behavior aliased freq pen for rep pen. Keep this behavior when using the freq pen parameter with a legacy exllamav2 version rather than ignoring both entirely. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-30 00:24:15 -05:00
kingbri	79a57588d5	API: Add template list endpoint Fetches all template names that a user has in the templates directory for chat completions. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-29 22:58:55 -05:00
kingbri	dce8c74edc	API: Add clarification and cleanup autodocs It's possible to override parts of the example JSON to give proper examples of values. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-29 10:28:06 -05:00
kingbri	4136f19058	Config: Make the sample a drop-in solution With the new wiki, all parameters are fully documented along with comments in the YAML file itself. This should help new users who pull, copy the config, and can't start the API due to subsections being uncommented and read. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-29 01:36:21 -05:00
kingbri	ec929728d9	Model: Read scale_pos_emb from config In newer versions of exllamav2, this value is read from the model's config.json. This value will still default to 1.0 anyways. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 21:14:24 -05:00
city-unit	e70729b0c0	Update Docker Squash commit that merges #43, #44, and #45 Create .dockerignore Make compose marginally better Un-scuffed the Dockerfile	2023-12-28 18:26:04 -05:00
kingbri	5dc2df68be	Model: Repetition penalty range -> penalty range All penalties can have a sustain (range) applied to them in exl2, so clarify the parameter. However, the default behaviors change based on if freq OR pres pen is enabled. For the sanity of OAI users, have freq and pres pen only apply on the output tokens when range is -1 (default). But, repetition penalty still functions the same way where -1 means the range is the max seq len. Doing this prevents gibberish output when using the more modern freq and presence penalties similar to llamacpp. NOTE: This logic is still subject to change in the future, but I believe it hits the happy medium for users who want defaults and users who want to tinker around with the sampling knobs. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 18:16:10 -05:00
kingbri	c72d30918c	Config: Default None -> Empty in comments Empty makes more sense when talking about empty fields. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 00:32:29 -05:00
kingbri	f56221ff0c	Tree: Format Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 00:31:59 -05:00
kingbri	3622710582	API: Fix num_experts_per_token reporting This wasn't linked to the model config. This value can be 1 if a MoE model isn't loaded. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 00:31:14 -05:00
kingbri	c5bbfd97b2	Entrypoint: Load loras after model Prevents an error if the model isn't loaded on startup. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 23:55:02 -05:00
kingbri	ee84d892b8	Start: Add shell script Same as the batch file. Also edit the python script to work when a venv is clean. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 23:53:14 -05:00
kingbri	ac0d6f8869	Tree: Format and cleanup start Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 01:17:31 -05:00
kingbri	4d83d1aae4	Start: Switch to python script Direct python can be used for requirements checking. Remove the ps1 script and create a venv purely in batch. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 00:37:53 -05:00
kingbri	a71b96a20c	Main: Switch to entrypoint Allows for other modules to access the startup function. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 00:34:50 -05:00
kingbri	e92ef8f5c7	OAI: Fix rep pen range alias No need to unwrap because the Pydantic alias does that for us. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 15:37:11 -05:00
kingbri	7b74cb28e6	Model: Move unsupported sampler check Overbloated the generation function. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 15:29:51 -05:00
kingbri	e256ff8182	Samplers: Add frequency and presence penalty Un-alias repetition penalty from the frequency penalty parameter. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 15:27:32 -05:00


353 Commits