Commit Graph

30 Commits

Author SHA1 Message Date
kingbri
2642ef7156 OAI: Update logprobs type
Some logprobs cannot exist, so make the type optional

Signed-off-by: kingbri <bdashore3@proton.me>
2024-02-08 21:26:53 -05:00
kingbri
b827bcbb44 Sampling: Cleanup and update
Clean up override handling and class naming, and adopt exllamav2's
model class to enforce the latest stable version's methods rather than
adding multiple backwards-compatibility checks.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-02-02 23:36:17 -05:00
kingbri
d3781920b3 OAI: Split up utility functions
Just like types, put utility functions in their own separate module
based on the route.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-02-02 23:36:17 -05:00
kingbri
6c30f24c83 Tree: Unify sampler parameters and add override support
Unify API sampler params into a superclass which should make them
easier to manage and inherit generic functions from.

Not all frontends expose all sampling parameters because they are built
around OAI (which handles sampling itself, with the exception of
a few sliders).

Add the ability for the user to customize fallback parameters
server-side.

In addition, parameters can be forced to a certain value server-side
in case the repo automatically sets other sampler values in the
background that the user doesn't want.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-25 00:15:40 -05:00
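
The fallback and force behavior described in commit 6c30f24c83 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `OVERRIDES` layout, the `override`/`force` keys, and the `apply_override` helper are all hypothetical names chosen for the example.

```python
from typing import Any, Dict, Optional

# Hypothetical server-side configuration (layout assumed for illustration).
# "override" is the fallback value; "force" makes it win over the request.
OVERRIDES: Dict[str, Dict[str, Any]] = {
    "temperature": {"override": 0.7, "force": False},
    "top_k": {"override": 40, "force": True},
}


def apply_override(
    name: str, requested: Optional[Any], overrides: Dict[str, Dict[str, Any]]
) -> Optional[Any]:
    """Resolve one sampler parameter against the server-side overrides."""
    entry = overrides.get(name)
    if entry is None:
        # No server-side entry: keep whatever the request sent.
        return requested
    if entry.get("force") or requested is None:
        # Forced values always win; otherwise only fill in missing values.
        return entry["override"]
    return requested
```

With this sketch, a request that omits `temperature` receives 0.7, a request that sends `temperature=1.2` keeps 1.2, and any requested `top_k` is replaced by 40 because that entry is forced.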
kingbri
7a29664f06 API: Add alias names to field descriptions
Helps with understanding API aliases. These aliases should not be
used but are helpful for developers who want frontend compat.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-08 23:00:33 -05:00
kingbri
81b504e8c5 OAI: Fix typical alias
AliasChoices takes strings, not an array.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-05 16:38:39 -05:00
kingbri
2c57dafc59 OAI: Add alias for typical sampling
Typical can also be called typical_p

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-05 15:29:53 -05:00
kingbri
b378773d0a Model: Add CFG support
CFG, or classifier-free guidance, helps push a model in different
directions based on what the user provides.

Currently, CFG is ignored if the negative prompt is blank (it shouldn't
be used in that way anyways).

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-02 01:46:51 -05:00
kingbri
dce8c74edc API: Add clarification and cleanup autodocs
It's possible to override parts of the example JSON to give proper
examples of values.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-29 10:28:06 -05:00
kingbri
5dc2df68be Model: Repetition penalty range -> penalty range
All penalties can have a sustain (range) applied to them in exl2,
so clarify the parameter.

However, the default behavior changes depending on whether freq OR pres pen
is enabled. For the sanity of OAI users, have freq and pres pen
apply only to the output tokens when range is -1 (default).

But, repetition penalty still functions the same way where -1 means
the range is the max seq len.

Doing this prevents gibberish output when using the more modern freq
and presence penalties similar to llamacpp.

NOTE: This logic is still subject to change in the future, but I believe
it hits the happy medium for users who want defaults and users who want
to tinker around with the sampling knobs.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-28 18:16:10 -05:00
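
The range defaults described in commit 5dc2df68be could be expressed along these lines. This is a sketch of the rule as stated in the message, not the actual implementation: `resolve_penalty_range` and its parameter names are hypothetical.

```python
def resolve_penalty_range(
    penalty_range: int,
    max_seq_len: int,
    output_token_count: int,
    uses_freq_or_pres: bool,
) -> int:
    """Pick the number of trailing tokens that penalties apply to."""
    if penalty_range != -1:
        # An explicit range from the request is used as-is.
        return penalty_range
    if uses_freq_or_pres:
        # Default for freq/pres penalties: only the generated output tokens.
        return output_token_count
    # Default for repetition penalty alone: the whole context window.
    return max_seq_len
```

For example, with a 4096-token context and 10 generated tokens, a range of -1 resolves to 10 when a frequency or presence penalty is active and to 4096 otherwise.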
kingbri
e92ef8f5c7 OAI: Fix rep pen range alias
No need to unwrap because the Pydantic alias does that for us.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 15:37:11 -05:00
kingbri
e256ff8182 Samplers: Add frequency and presence penalty
Un-alias repetition penalty from the frequency penalty parameter.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 15:27:32 -05:00
kingbri
80ef379721 Sampling: Add top-a support
Currently in exllamav2 dev, but will be in the next release.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-22 23:50:24 -05:00
AlpinDale
fa47f51f85 feat: workflows for formatting/linting (#35)
* add github workflows for pylint and yapf

* yapf

* docstrings for auth

* fix auth.py

* fix generators.py

* fix gen_logging.py

* fix main.py

* fix model.py

* fix templating.py

* fix utils.py

* update formatting.sh to include subdirs for pylint

* fix model_test.py

* fix wheel_test.py

* rename utils to utils_oai

* fix OAI/utils_oai.py

* fix completion.py

* fix token.py

* fix lora.py

* fix common.py

* add pylintrc and fix model.py

* finish up pylint

* fix attribute error

* main.py formatting

* add formatting batch script

* Main: Remove unnecessary global

Linter suggestion.

Signed-off-by: kingbri <bdashore3@proton.me>

* switch to ruff

* Formatting + Linting: Add ruff.toml

Signed-off-by: kingbri <bdashore3@proton.me>

* Formatting + Linting: Switch scripts to use ruff

Also remove the file and recent file change functions from both
scripts.

Signed-off-by: kingbri <bdashore3@proton.me>

* Tree: Format and lint

Signed-off-by: kingbri <bdashore3@proton.me>

* Scripts + Workflows: Format

Signed-off-by: kingbri <bdashore3@proton.me>

* Tree: Remove pylint flags

We use ruff now

Signed-off-by: kingbri <bdashore3@proton.me>

* Tree: Format

Signed-off-by: kingbri <bdashore3@proton.me>

* Formatting: Line length is 88

Use the same value as Black.

Signed-off-by: kingbri <bdashore3@proton.me>

* Tree: Format

Update to new line length rules.

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Authored-by: AlpinDale <52078762+AlpinDale@users.noreply.github.com>
Co-authored-by: kingbri <bdashore3@proton.me>
2023-12-22 16:20:35 +00:00
kingbri
c3f7898967 OAI: Add logit bias support
Use exllamav2's token bias which is the functional equivalent of
OAI's logit bias parameter.

Strings are cast to integers on request, and an error is raised if an
invalid value is passed.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-18 23:53:47 -05:00
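
The string-to-integer conversion mentioned in commit c3f7898967 might look like the sketch below. It assumes the OAI-style request shape, where `logit_bias` is a JSON object whose keys are token IDs written as strings; `parse_logit_bias` is a hypothetical helper, not a function from the repository or from exllamav2.

```python
from typing import Dict


def parse_logit_bias(raw: Dict[str, float]) -> Dict[int, float]:
    """Convert string token IDs to ints, raising ValueError on bad entries."""
    parsed: Dict[int, float] = {}
    for token, bias in raw.items():
        try:
            parsed[int(token)] = float(bias)
        except (TypeError, ValueError) as exc:
            # Surface a clear error instead of silently dropping the entry.
            raise ValueError(f"Invalid logit bias entry: {token!r}") from exc
    return parsed
```

Failing loudly on a malformed key keeps a typo in the request from being mistaken for "no bias applied".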
kingbri
bc21f0bbc0 OAI: Add field aliasing
Repetition penalty range needs field aliases to support multiple
parameter calls.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-18 23:53:47 -05:00
kingbri
e895eaa4bd OAI: Clarify types in docs
Adding field descriptions shows which parameters are used solely for
OAI compliance and are not actually parsed in the model code.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-18 23:53:47 -05:00
kingbri
ed868fd262 OAI: Remove unused parameters
Seed and low_mem aren't used, so comment them out.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-15 14:56:43 -05:00
kingbri
5ae2a91c04 Tree: Use unwrap and coalesce for optional handling
Python doesn't have proper handling of optionals. The only way to
handle them is checking via an if statement if the value is None or
by using the "or" keyword to unwrap optionals.

Previously, I used the "or" method to unwrap, but this caused issues
because falsy values fell back to the default. This was especially
the case with booleans, where "False" changed to "True".

Instead, add two new functions: unwrap and coalesce. Both provide
a proper, functional way of "None" coalescing.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-09 21:52:17 -05:00
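
The problem described in commit 5ae2a91c04 is easy to reproduce. The sketch below shows one plausible shape for the two helpers; the exact signatures are assumptions and may differ from the repository's versions.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def unwrap(wrapped: Optional[T], default: Optional[T] = None) -> Optional[T]:
    """Return wrapped unless it is None, in which case return default."""
    return default if wrapped is None else wrapped


def coalesce(*args):
    """Return the first argument that is not None, or None if all are."""
    return next((arg for arg in args if arg is not None), None)


# The "or" idiom discards a legitimate False; unwrap keeps it.
print(False or True)         # True
print(unwrap(False, True))   # False
print(coalesce(None, 0, 5))  # 0
```

The key difference is the explicit `is None` test: falsy values such as `False`, `0`, and `""` pass through unchanged, and only a missing value triggers the default.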
kingbri
6a71890d45 Model: Fix sampler bugs
Lots of bugs were unearthed when switching to the new fallback changes.
Fix them and make sure samplers are being set properly.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-06 17:29:58 -05:00
kingbri
e703c716ee Merge branch 'main' of https://github.com/ziadloo/tabbyAPI into ziadloo-main
2023-11-30 01:01:48 -05:00
kingbri
3957316b79 Revert "API: Rename repetition_decay -> repetition_slope"
This reverts commit cad144126f.

Change this parameter back to repetition_decay. This is different than
rep_pen_slope used in other backends such as kobold and NAI.

Still keep the fallback condition though.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-29 22:03:45 -05:00
kingbri
cad144126f API: Rename repetition_decay -> repetition_slope
Also fix the fallback to use 0 for sanity checking and validation.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-29 01:13:05 -05:00
kingbri
5cbf7f13da OAI: Fix repetition range
Alias repetition_penalty_range to repetition_range since that's used
as an internal variable. Perhaps in the future, there should be a function
that iterates through request aliases and gives a default value.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-29 00:53:19 -05:00
Mehran Ziadloo
ead503c75b Adding token usage support
2023-11-27 20:05:05 -08:00
kingbri
71b9a53336 API: Add temperature_last support
Documented in previous commits. Also make sure that for version checking,
check the value of kwargs instead of if the key is present since requests
pass default values.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-21 21:20:59 -05:00
kingbri
d627d14385 API: Fix exceptions and defaults
Stop conditions were None, causing the model to error out when trying to
add the EOS token to a None value.

Authentication failed when Bearer contained an empty string. To fix
this, add a condition which checks array length.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-17 17:56:05 -05:00
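
Both fixes in commit d627d14385 come down to guarding against missing values. The sketch below illustrates the idea under stated assumptions: `build_stop_conditions` and `extract_bearer_token` are hypothetical helpers, and the header is assumed to follow the usual `Bearer <token>` form.

```python
from typing import List, Optional


def build_stop_conditions(stop: Optional[List[str]], eos_token_id: int) -> list:
    """Return the stop list with the EOS token appended, tolerating None."""
    conditions: list = list(stop) if stop is not None else []
    conditions.append(eos_token_id)
    return conditions


def extract_bearer_token(authorization: str) -> str:
    """Return the token from an Authorization header or raise ValueError."""
    # split() with no arguments drops empty pieces, so "Bearer " yields a
    # single element and is rejected by the length check.
    parts = authorization.split()
    if len(parts) != 2 or parts[0].lower() != "bearer":
        raise ValueError("Invalid or missing bearer token")
    return parts[1]
```

Checking the number of parts before indexing avoids both an out-of-range access and the acceptance of an empty token.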
kingbri
282b5b2931 API: Fix responses and some params
Responses were not being properly sent as JSON. Only run pydantic's
JSON function on stream responses. FastAPI does the rest with static
responses.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-16 17:11:55 -05:00
kingbri
5e8419ec0c OAI: Add chat completions endpoint
Chat completions is the endpoint that will be used by OAI in the
future. Makes sense to support it even though the completions
endpoint will be used more often.

Also unify common parameters between the chat completion and completion
requests since they're very similar.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-16 01:06:07 -05:00
kingbri
b625bface9 OAI: Add API-based model loading/unloading and auth routes
Models can be loaded and unloaded via the API. Also add authentication
to use the API and for administrator tasks.

Both types of authorization use different keys.

Also fix the unload function to properly free all used vram.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-14 01:17:19 -05:00