kingbri
ee84d892b8
Start: Add shell script
...
Same as the batch file. Also edit the Python script so it works when
the venv is clean.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-27 23:53:14 -05:00
kingbri
ac0d6f8869
Tree: Format and cleanup start
...
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-27 01:17:31 -05:00
kingbri
4d83d1aae4
Start: Switch to python script
...
Python can be used directly for requirements checking. Remove the ps1
script and create a venv purely in batch.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-27 00:37:53 -05:00
kingbri
a71b96a20c
Main: Switch to entrypoint
...
Allows for other modules to access the startup function.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-27 00:34:50 -05:00
kingbri
e92ef8f5c7
OAI: Fix rep pen range alias
...
No need to unwrap because the Pydantic alias does that for us.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 15:37:11 -05:00
kingbri
7b74cb28e6
Model: Move unsupported sampler check
...
The check was bloating the generation function.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 15:29:51 -05:00
kingbri
e256ff8182
Samplers: Add frequency and presence penalty
...
Un-alias repetition penalty from the frequency penalty parameter.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 15:27:32 -05:00
kingbri
442bb59f8f
Tests: Remove logger class
...
The logger module could not be found when calling the test. Re-add
the color logging at a later time.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 15:20:39 -05:00
kingbri
162c13752a
Requirements: Update to Flash Attention 2.4.1
...
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 14:40:08 -05:00
kingbri
5c08316d18
Start: Switch to Write-Host
...
Write-Output is equal to a return statement and breaks parts of
the script.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 11:59:58 -05:00
kingbri
670ccac19a
Start: Add option to not install wheels
...
Building from source is a common use case for many wheels, so add an
option to skip wheel upgrades/installation if the user uses the start script.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 11:49:56 -05:00
kingbri
09ae71aa91
OAI: Add finish to completions
...
OAI spec requires [DONE] to be sent over SSE to signal that a generation
is completed.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-25 11:25:38 -05:00
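The completion signal described in 09ae71aa91 can be sketched roughly as below. This is a minimal illustration, not the project's code: the `sse_events` helper and the chunk payload shape are hypothetical, and only the final `[DONE]` event comes from the commit message.

```python
import json
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[dict]) -> Iterator[str]:
    """Format each chunk as a server-sent event, then signal completion."""
    for chunk in chunks:
        # Each SSE message is a "data:" line followed by a blank line.
        yield f"data: {json.dumps(chunk)}\n\n"
    # Sentinel event telling the client that the generation has finished.
    yield "data: [DONE]\n\n"


# Hypothetical payloads, used only to show the shape of the stream.
events = list(sse_events([{"text": "Hello"}, {"text": " world"}]))
```

A client reading this stream can stop as soon as it sees the `[DONE]` payload instead of waiting for the connection to close.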
kingbri
cc3229c109
Scripts: Make Start.bat idiotproof
...
Start now creates a venv, installs the correct requirements, and
starts the API.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-24 20:50:24 -05:00
kingbri
060d422e03
Config: Resolve filepath
...
This resolves the absolute path when loading the config file, making
it safer to load and find the correct path.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-23 23:57:33 -05:00
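Resolving the config path, as 060d422e03 describes, might look something like the sketch below. The `resolve_config_path` helper and the `config.yml` default are assumptions made for illustration; the log does not show how the project actually does it.

```python
import pathlib


def resolve_config_path(config_path: str = "config.yml") -> pathlib.Path:
    """Turn a possibly relative config path into an absolute one."""
    # resolve() makes the path absolute and collapses ".." segments and symlinks.
    return pathlib.Path(config_path).resolve()


# Hypothetical file name; the file does not need to exist for resolve() to work.
resolved = resolve_config_path("config.yml")
```

Working with an absolute path means the same file is found regardless of which directory the process was started from.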
kingbri
703a114f63
Tree: Format
...
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-23 23:03:28 -05:00
kingbri
c9126c3145
Config: Isolate to a separate file
...
Reduce dependence on globals in main to simplify the code a bit.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-23 23:02:37 -05:00
kingbri
0d2e726e82
Main: Fix import formatting
...
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-23 21:33:15 -05:00
kingbri
3461f8294f
Logging: Clarify preferences
...
Preferences are preferences, not a config.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-23 21:08:10 -05:00
kingbri
98a7b951b9
Logging: Add newlines to Prompt and Response
...
Makes things clearer rather than adding an extra space.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-22 23:55:22 -05:00
kingbri
80ef379721
Sampling: Add top-a support
...
Currently in exllamav2 dev, but will be in the next release.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-22 23:50:24 -05:00
AlpinDale
6a5bbd217c
feat: logging (#39)
...
* add logging
* simplify the logger
* formatting
* final touches
* fix format
* Model: Add log to metrics
Signed-off-by: kingbri <bdashore3@proton.me>
---------
Authored-by: AlpinDale <52078762+AlpinDale@users.noreply.github.com>
2023-12-23 04:33:31 +00:00
Brian Dashore
f5314fcdad
Merge pull request #37 from DocShotgun/main
...
Colab: Expose new config arguments
2023-12-22 12:07:52 -05:00
kingbri
71f6a586f1
Templates: Add error handling for template errors
...
Similar to the transformers library, add an error handler when an
exception is fired. This relays the error to the user.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-22 11:59:47 -05:00
AlpinDale
fa47f51f85
feat: workflows for formatting/linting (#35)
...
* add github workflows for pylint and yapf
* yapf
* docstrings for auth
* fix auth.py
* fix generators.py
* fix gen_logging.py
* fix main.py
* fix model.py
* fix templating.py
* fix utils.py
* update formatting.sh to include subdirs for pylint
* fix model_test.py
* fix wheel_test.py
* rename utils to utils_oai
* fix OAI/utils_oai.py
* fix completion.py
* fix token.py
* fix lora.py
* fix common.py
* add pylintrc and fix model.py
* finish up pylint
* fix attribute error
* main.py formatting
* add formatting batch script
* Main: Remove unnecessary global
Linter suggestion.
Signed-off-by: kingbri <bdashore3@proton.me>
* switch to ruff
* Formatting + Linting: Add ruff.toml
Signed-off-by: kingbri <bdashore3@proton.me>
* Formatting + Linting: Switch scripts to use ruff
Also remove the file and recent file change functions from both
scripts.
Signed-off-by: kingbri <bdashore3@proton.me>
* Tree: Format and lint
Signed-off-by: kingbri <bdashore3@proton.me>
* Scripts + Workflows: Format
Signed-off-by: kingbri <bdashore3@proton.me>
* Tree: Remove pylint flags
We use ruff now
Signed-off-by: kingbri <bdashore3@proton.me>
* Tree: Format
Signed-off-by: kingbri <bdashore3@proton.me>
* Formatting: Line length is 88
Use the same value as Black.
Signed-off-by: kingbri <bdashore3@proton.me>
* Tree: Format
Update to new line length rules.
Signed-off-by: kingbri <bdashore3@proton.me>
---------
Authored-by: AlpinDale <52078762+AlpinDale@users.noreply.github.com>
Co-authored-by: kingbri <bdashore3@proton.me>
2023-12-22 16:20:35 +00:00
kingbri
a14abfe21c
Templates: Support bos_token and eos_token fields
...
These are commonly seen in huggingface provided chat templates and
aren't that difficult to add in.
For feature parity, honor the add_bos_token and ban_eos_token
parameters when constructing the prompt.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-22 10:33:11 -05:00
DocShotgun
7967607f12
Colab: Expose new config arguments
2023-12-22 01:53:13 -08:00
Brian Dashore
2bf8087de3
Merge pull request #36 from veden/dev
2023-12-22 00:34:19 -05:00
Veden
91e6823b24
fixed method invocation in get_template_from_model_json
2023-12-21 21:25:59 -08:00
kingbri
8fa764bfbe
Auth: Add option to disable authentication
...
This creates a massive security hole, but it's gated behind a flag
for users who only use localhost.
A warning will pop up when users disable authentication.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-21 23:40:16 -05:00
kingbri
99a798e117
API: Add auth enforcement to draft list
...
This didn't have an API key gate.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-21 23:14:04 -05:00
kingbri
5d80a049ae
Templates: Switch to common function for JSON loading
...
Fix redundancy in code when loading templates. However, loading
a template from config.json may be a mistake since tokenizer_config.json
is the main place where chat templates are stored.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-21 23:08:51 -05:00
kingbri
72e19dbc12
Config: Change default dirs in sample
...
Models and draft models default to the models directory while
loras default to the loras directory.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-21 22:35:03 -05:00
Brian Dashore
87a9dfc8c4
Merge pull request #34 from veden/dev
...
Templates: Added automatic detection of chat templates from tokenizer_config.json
2023-12-21 22:34:53 -05:00
kingbri
1a8afcb6ad
Generator: Fix semaphore scheduling
...
Non-streaming tasks were not regulated by the semaphore, causing these
tasks to interfere with streaming generations. Add helper functions
to take in both sync and async functions for callbacks and sequential
blocking with the semaphore.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-21 21:39:45 -05:00
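The scheduling fix in 1a8afcb6ad can be illustrated with a small asyncio sketch. It assumes a single-slot `asyncio.Semaphore` and a hypothetical helper named `call_with_semaphore`; the project's real helper names and signatures are not shown in the log.

```python
import asyncio
import inspect
from typing import Any, Callable


async def call_with_semaphore(
    semaphore: asyncio.Semaphore, callback: Callable[..., Any], *args: Any
) -> Any:
    """Run a sync or async callback while holding the semaphore."""
    async with semaphore:
        result = callback(*args)
        # An async callback returns an awaitable that still has to be awaited.
        if inspect.isawaitable(result):
            result = await result
        return result


async def demo() -> list:
    semaphore = asyncio.Semaphore(1)  # assumption: one generation at a time
    order = []

    def sync_task(name: str) -> str:
        # Stands in for a non-streaming generation.
        order.append(name)
        return name

    async def async_task(name: str) -> str:
        # Stands in for a streaming generation that yields control mid-way.
        order.append(f"{name}:start")
        await asyncio.sleep(0.01)
        order.append(f"{name}:end")
        return name

    await asyncio.gather(
        call_with_semaphore(semaphore, async_task, "stream"),
        call_with_semaphore(semaphore, sync_task, "blocking"),
    )
    return order


order = asyncio.run(demo())
```

Because both kinds of task go through the same semaphore, the non-streaming task cannot start while the streaming one is in progress. Note that a synchronous callback still blocks the event loop while it runs in this sketch.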
Aaron Veden
f53c98db94
Templates: Added automatic detection of chat templates from tokenizer_config.json
2023-12-20 22:45:55 -08:00
kingbri
bee758dae9
Config: Clarify rope parameters
...
Blank = automatic calculation of alpha value.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-20 21:15:06 -05:00
kingbri
5728b9fffb
Model: Don't error out if a generation is empty
...
When stream is false, the generation can be empty, which means
that there are no chunks present in the final generation array, causing
an error.
Instead, return a dummy value if the generation is falsy (empty array
or None).
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-20 00:51:33 -05:00
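The empty-generation guard from 5728b9fffb could be sketched as follows. The `final_generation` helper, the choice to return the last chunk, and the placeholder fields are all hypothetical; the log only states that a dummy value is returned when the generation is falsy.

```python
from typing import Optional


def final_generation(chunks: Optional[list]) -> dict:
    """Return the last chunk, or a placeholder when nothing was generated."""
    if not chunks:  # covers both None and an empty list
        # Hypothetical dummy value; the real fields are not given in the log.
        return {"text": "", "prompt_tokens": 0, "generated_tokens": 0}
    return chunks[-1]
```

Checking truthiness once handles both the `None` case and the empty-list case, so an unguarded `chunks[-1]` never raises `IndexError` or `TypeError`.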
kingbri
ab10b263fd
Model: Add override base seq len
...
Some models (such as mistral and mixtral) set their base sequence
length to 32k due to assumptions of support for sliding window
attention.
Therefore, add this parameter to override the base sequence length
of a model which helps with auto-calculation of rope alpha.
If auto-calculation of rope alpha isn't being used, the max_seq_len
parameter works fine as is.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-20 00:45:39 -05:00
Brian Dashore
5368ed7b64
Merge pull request #31 from veryamazinglystupid/main
...
cuda -> 12, pydantic error fixed.
2023-12-20 00:04:51 -05:00
kingbri
5fbb37405f
Colab: Remove the pydantic hotfix
...
Requirements.txt is now pinned to install pydantic >= 2.0.0
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-20 00:01:58 -05:00
kingbri
c9e43e51aa
API: Add route for draft model list
...
Does the same thing as model list except with draft models.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-19 23:45:53 -05:00
kingbri
ce2602df9a
Model: Fix max seq len handling
...
Previously, the max sequence length was overridden by the user's
config and never took the model's config.json into account.
Now, set the default to 4096, but include config.prepare when
selecting the max sequence length. The yaml and API request
now serve as overrides rather than parameters.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-19 23:37:52 -05:00
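The precedence described in ce2602df9a can be sketched as a simple fallback chain. The `pick_max_seq_len` helper is hypothetical, and the ordering between the API request and the yaml (request first) is an assumption; the log only says both act as overrides on top of the model's value and the 4096 default.

```python
from typing import Optional

DEFAULT_MAX_SEQ_LEN = 4096  # default named in the commit message


def pick_max_seq_len(
    model_value: Optional[int] = None,
    yaml_override: Optional[int] = None,
    request_override: Optional[int] = None,
) -> int:
    """Return the first value that is set, from most to least specific."""
    # Assumed order: API request, then yaml, then the model's own config.
    for candidate in (request_override, yaml_override, model_value):
        if candidate is not None:
            return candidate
    return DEFAULT_MAX_SEQ_LEN
```

With no overrides set, the model's own value wins, which is the behaviour the commit says was missing before.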
kingbri
d3246747c0
Templates: Attempt loading from model config
...
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-19 22:58:47 -05:00
kingbri
da69ad8cd3
Requirements: Pin versions for some dependencies
...
Pydantic and Jinja2 need pinned versions.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-19 21:48:04 -05:00
kingbri
1fd38c61de
API: Remove model check dependency for lora list
...
This isn't needed for listing stuff.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-19 21:35:29 -05:00
veryamazinglystupid
12bf7a0174
fix the colab, pydantic error
...
:3
2023-12-19 19:46:57 +05:30
kingbri
0a144688c6
Templates: Add clarity statements
...
Lets the user know if a file-not-found error (OSError) occurs and prints
the applied template on model load.
Also fix some remaining references to fastchat.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-19 08:13:04 -05:00
kingbri
0d76ed9b8b
Revert "Start: Add an argument parser to batch file"
...
This reverts commit 097c298c39.
2023-12-19 00:01:27 -05:00
kingbri
45e2987622
Start: Fix batch file condition
...
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-18 23:57:30 -05:00
kingbri
097c298c39
Start: Add an argument parser to batch file
...
Used for future arguments.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-12-18 23:53:47 -05:00