kingbri
9c4a0e650f
Sampling: Fix override for DRY sequence breakers
...
The common type should be an array of strings.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-07 21:38:50 -04:00
TerminalMan
4b11cabbec
debloat docker build
2024-09-08 00:02:00 +01:00
TerminalMan
d57a3b459c
fix function arguments for get_model_list
2024-09-07 18:27:10 +01:00
kingbri
4f5ca7a4c7
Sampling: Update overrides and params
...
Re-order to make more sense.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-07 12:48:59 -04:00
kingbri
ae37f3f332
Sampling: Update DRY
...
Switch to new parameters and remove dry_max_ngram as that's not supposed
to be changed.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-07 12:39:14 -04:00
kingbri
05c3f1194f
Sampling: Add rudimentary DRY support
...
Adds DRY support based on the current exl2 dev API. Only change for
optimization is dry_max_ngram instead of using a closed range.
Currently, DRY range is aliased to dry_max_ngram.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-07 00:48:42 -04:00
TerminalMan
420fd84f6b
add env var loading automation
...
- load config from env vars (eg. TABBY_NETWORK_HOST)
- remove print statements
- improve command line args automation
2024-09-06 15:05:48 +01:00
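A minimal sketch of the env var loading described in 420fd84f6b, not the actual implementation. It assumes variables follow a `TABBY_<SECTION>_<OPTION>` pattern, which is inferred only from the `TABBY_NETWORK_HOST` example; the function name `load_env_overrides` and the other variable names are hypothetical.

```python
from typing import Dict, Mapping


def load_env_overrides(
    environ: Mapping[str, str], prefix: str = "TABBY_"
) -> Dict[str, Dict[str, str]]:
    """Group prefixed environment variables into {section: {option: value}}.

    Assumption: the first word after the prefix is the config section and
    the rest is the option name. Values are kept as raw strings.
    """
    overrides: Dict[str, Dict[str, str]] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # Unrelated variable, e.g. PATH
        section, sep, option = name[len(prefix):].lower().partition("_")
        if not sep or not option:
            continue  # No option part after the section name
        overrides.setdefault(section, {})[option] = value
    return overrides


# Hypothetical environment; in practice os.environ would be passed in.
example_env = {
    "TABBY_NETWORK_HOST": "0.0.0.0",
    "TABBY_MODEL_MAX_SEQ_LEN": "4096",
    "PATH": "/usr/bin",
}
env_overrides = load_env_overrides(example_env)
```

Here `env_overrides` maps `network` to `{"host": "0.0.0.0"}` and `model` to `{"max_seq_len": "4096"}`. Type conversion is left out on purpose; a pydantic-backed config could validate the strings afterwards.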
TerminalMan
8e9344642e
patch pydantic config into old config
...
- convert pydantic to dict to avoid errors with current files
- fix formatting
2024-09-06 14:31:28 +01:00
Jake
36e991c16e
automate arg parse
...
- generate arg parser dynamically
- remove legacy parser code
2024-09-06 00:27:53 +01:00
kingbri
d34756dc98
Tree: Format
...
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-05 18:05:59 -04:00
kingbri
2f45e978c5
API: Fix merge overwrite
...
The completions utils did not take the new imports.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-05 18:04:53 -04:00
Brian Dashore
ec7f64d530
Merge pull request #185 from SecretiveShell/refactor-config-loading
...
Refactor config loading
2024-09-05 18:00:32 -04:00
kingbri
1c9991f79e
Config: Format and organize
...
Rename some methods and change comments.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-05 17:59:18 -04:00
Jake
362b8d5818
config is now backed by pydantic (WIP)
...
- add models for config options
- add function to regenerate config.yml
- replace references to config with pydantic compatible references
- remove unnecessary unwrap() statements
TODO:
- auto generate env vars
- auto generate argparse
- test loading a model
2024-09-05 18:04:56 +01:00
Jake
cb91670c7a
fix command line args
...
- move to a complete class singleton to avoid propagation errors
- remove legacy load config procedure
2024-09-05 15:33:00 +01:00
kingbri
98768bfa30
Docker: Re-add build block
...
If a user wants to build from source, let them. But the default
should fetch from the package registry.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-04 23:39:06 -04:00
kingbri
93872b34d7
Config: Migrate to global class instead of dicts
...
The config categories can have defined separation, but preserve
the dynamic nature of adding new config options by making all the
internal class vars as dictionaries.
This was necessary since storing global callbacks stored a state
of the previous global_config var that wasn't populated.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-04 23:18:47 -04:00
Brian Dashore
3bc9bd09a0
Merge pull request #180 from SecretiveShell/main
...
make docker-compose use prebuilt images
2024-09-04 21:48:18 -04:00
Brian Dashore
8524999284
Merge pull request #184 from SecretiveShell/Infinity-Embed-TODO
...
Complete conditional infinity import TODO
2024-09-04 21:47:49 -04:00
Brian Dashore
03ff472149
Merge pull request #130 from bartowski1182/main
...
WIP: Add 'model' argument to /v1/chat/completions to load a new model on the fly
2024-09-04 21:46:41 -04:00
kingbri
9c10789ca1
API: Error on invalid key permissions and cleanup format
...
If a user requesting a model change isn't admin, error.
Better to place the load function before the generate functions.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-04 21:44:14 -04:00
Jake
e772fa2981
Switch to internal dict merge implementation
...
- remove deepmerge dependency
- fix ruff formatting
2024-09-04 16:27:28 +01:00
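A rough sketch of what an internal dict merge replacing the deepmerge dependency (e772fa2981) could look like. This is an illustration under assumptions, not the project's code: the name `merge_dicts`, the "override wins" rule, and the sample keys are all hypothetical.

```python
from typing import Any, Dict


def merge_dicts(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge override into a copy of base.

    Nested dicts are merged key by key; any other value in override
    replaces the value in base. Neither input is mutated.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_dicts(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical config fragments
defaults = {"network": {"host": "127.0.0.1", "port": 5000}, "logging": {"prompt": False}}
user_config = {"network": {"host": "0.0.0.0"}}
merged_config = merge_dicts(defaults, user_config)
```

After the merge, `network.host` comes from `user_config` while `network.port` and the `logging` section keep their default values.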
Jake
42a42caf43
remove logging
...
- remove logging statements
- format code with ruff
2024-09-04 16:14:09 +01:00
Jake
ac4d9bba1c
refactor config functions
...
- improve DRY
2024-09-04 12:49:22 +01:00
Jake
fa6404a95a
refactor config loading
...
- improve DRY
- alter logging
- allow extensibility
- add foundation for environment variables as config
2024-09-04 12:22:49 +01:00
kingbri
21f14d4318
API: Update inline load
...
- Add a config flag
- Migrate support to /v1/completions
- Unify the load function
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-03 23:37:28 -04:00
kingbri
dd30d6592a
Merge branch 'main' of https://github.com/theroyallab/tabbyapi into inline
2024-09-03 18:03:17 -04:00
kingbri
8854269121
API: Fix current model list return
...
Check if the container actually exists in the match before returning
the value of the directory.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-01 10:54:01 -04:00
kingbri
4bf1a71d7b
Model: Fix model override application for draft args
...
These have to be merged beforehand and the updated version needs to be
re-fetched. It's possible to prevent the fetch of draft_args in the
beginning of init.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-31 22:59:56 -04:00
kingbri
4aebe8a2a5
Config: Use an explicit "auto" value for rope_alpha
...
Using "auto" for rope alpha removes ambiguity on how to explicitly
enable automatic rope calculation. The same behavior of None -> auto
calculate still exists, but can be overwritten if a model's tabby_config.yml
includes `rope_alpha`.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-31 22:59:56 -04:00
kingbri
a96fa5f138
API: Don't fallback to default values on model load request
...
It's best to pass them down the config stack.
API/User config.yml -> model config.yml -> model config.json -> fallback.
Doing this allows for seamless flow and yielding control to each
member in the stack.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-31 22:59:56 -04:00
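The config stack named in a96fa5f138 (API/user config.yml -> model config.yml -> model config.json -> fallback) can be pictured with a small lookup helper. This is a sketch only: `resolve_option`, the layer dicts, and the option values are hypothetical, and it assumes an unset option is represented as `None` or a missing key.

```python
from typing import Any, Mapping, Sequence


def resolve_option(
    key: str, layers: Sequence[Mapping[str, Any]], fallback: Any = None
) -> Any:
    """Return the first non-None value for key, searching layers in order."""
    for layer in layers:
        value = layer.get(key)
        if value is not None:
            return value  # A higher layer set the option, so stop here
    return fallback  # No layer set it, so use the hard-coded fallback


# Hypothetical layers, highest priority first
request_args = {"max_seq_len": None, "cache_mode": "Q4"}  # API / user config.yml
model_yml = {"max_seq_len": 8192}                          # model config.yml
model_json = {"max_seq_len": 4096, "rope_alpha": 1.0}      # model config.json
layers = [request_args, model_yml, model_json]

resolved = {
    key: resolve_option(key, layers, fallback="unset")
    for key in ("max_seq_len", "cache_mode", "rope_alpha", "rope_scale")
}
```

Because the request leaves `max_seq_len` as `None` instead of filling in a default, the value from the model's config.yml is used; filling in a default at the top layer would hide every layer below it.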
kingbri
4452d6f665
Model: Add support for overridable model config.yml
...
Like config.json in a model folder, providing a tabby_config.yml
will serve as a layer between user provided kwargs and the config.json
values.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-31 22:59:56 -04:00
kingbri
dd55b99af5
Model: Store directory paths
...
Storing a pathlib type makes it easier to manipulate the model
directory path in the long run without constantly fetching it
from the config.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-31 22:59:56 -04:00
kingbri
523709741c
Model: Reorder how configs are set up
...
Initialize the Exllama classes first then add user-specific params.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-31 22:59:56 -04:00
TerminalMan
43104e0d19
Complete conditional infinity import TODO
...
- add logging
- change declaration order
2024-08-31 21:48:43 +01:00
kingbri
21712578cf
API: Add allowed_tokens support
...
This is the opposite of banned tokens. Exllama specific implementation
of #181.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-29 21:44:42 -04:00
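To illustrate the relationship between allowed and banned tokens mentioned in 21712578cf, here is a generic logit-masking sketch in plain Python. It does not reflect how ExllamaV2 implements either feature; the function names and the toy logits are hypothetical.

```python
import math
from typing import Iterable, List


def apply_banned_tokens(logits: List[float], banned: Iterable[int]) -> List[float]:
    """Mask out the listed token ids; every other token stays sampleable."""
    banned_ids = set(banned)
    return [-math.inf if i in banned_ids else x for i, x in enumerate(logits)]


def apply_allowed_tokens(logits: List[float], allowed: Iterable[int]) -> List[float]:
    """Keep only the listed token ids; every other token is masked out."""
    allowed_ids = set(allowed)
    return [x if i in allowed_ids else -math.inf for i, x in enumerate(logits)]


# Toy logits over a four-token vocabulary
toy_logits = [0.5, 1.0, -0.2, 2.0]
only_allowed = apply_allowed_tokens(toy_logits, [1, 3])
without_banned = apply_banned_tokens(toy_logits, [1, 3])
```

With the same id list, the two masks are complements: `only_allowed` keeps ids 1 and 3, while `without_banned` keeps ids 0 and 2.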
kingbri
10d9419f90
Model: Add BOS token to prompt logs
...
If add_bos_token is enabled, the BOS token gets appended to the logged
prompt if logging is enabled.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-29 21:15:09 -04:00
TerminalMan
48d7674316
make docker-compose use prebuilt images
...
- Docker compose uses the prebuilt images produced by the GitHub action added in 872eeed581
2024-08-29 00:50:01 +01:00
kingbri
96fce34253
Dependencies: Update ExllamaV2
...
v0.2.0
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-28 18:34:00 -04:00
kingbri
a00d972054
Server: Remove unused comments
...
Leftovers from the new API server log system.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-27 21:45:51 -04:00
kingbri
4958c06813
Model: Remove and format comments
...
The comment in __init__ was outdated and all the kwargs are the
config options anyways.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-27 21:43:40 -04:00
TerminalMan
80198ca056
API: Add /v1/health endpoint (#178)
...
* Add healthcheck
- localhost only /healthcheck endpoint
- cURL healthcheck in docker compose file
* Update Healthcheck Response
- change endpoint to /health
- remove localhost restriction
- add docstring
* move healthcheck definition to top of the file
- make the healthcheck show up first in the openAPI spec
* Tree: Format
2024-08-27 21:37:41 -04:00
Amgad Hasan
872eeed581
Build and push docker image (#171)
...
* Create docker-image.yml
* Update docker-image.yml
2024-08-26 16:18:10 -04:00
Ben Gitter
045bc98333
Remove rogue print statements within chat_completion.py (#174)
...
* rogue prompt print
* remove print pt2
* Print Removal Final
2024-08-23 21:28:37 -04:00
turboderp
fe3253f3a9
Model: Account for tokenizer lazy init
2024-08-23 23:51:53 +02:00
turboderp
a676c4bf38
Model: Formatting
2024-08-23 11:15:30 +02:00
turboderp
a3733caeda
Model: Fix draft model cache initialization
2024-08-23 11:08:49 +02:00
kingbri
364032e39e
Config: Remove development flag from tensor parallel
...
Exists in stable ExllamaV2 version.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-22 14:15:19 -04:00
kingbri
565b0300d6
Dependencies: Update Exllamav2
...
v0.1.9
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-22 14:15:19 -04:00
kingbri
078fbf1080
Model: Add quantized cache support for tensor parallel
...
Newer versions of exl2 v1.9-dev have quantized cache implemented. Add
those APIs.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-22 14:15:19 -04:00