Brian Dashore
2d221832fb
Merge pull request #201 from theroyallab/lmfe_fix
...
Fix LMFE
2024-09-14 22:11:11 -04:00
turboderp
318c425d84
Bump exllamav2 to 0.2.2
2024-09-14 21:43:26 +02:00
turboderp
c66fe8e947
Grammar: Add custom ExLlamaV2TokenEnforcerFilter class
2024-09-14 21:42:53 +02:00
Brian Dashore
a2b4e3f21f
Merge pull request #192 from SecretiveShell/prune-docker-size
...
debloat docker build
2024-09-11 00:13:16 -04:00
kingbri
e00eb09ef3
OAI: Add cancellation with inline load
...
When the request is cancelled, cancel the load task. In addition,
when checking if a model container exists, also check if the model
is fully loaded.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-11 00:08:55 -04:00
kingbri
b9e5693c1b
API + Model: Apply config.yml defaults for all load paths
...
There are two ways to load a model:
1. Via the load endpoint
2. Inline with a completion
The defaults were not applying on the inline load, so rewrite to fix
that. However, while doing this, set up a defaults dictionary rather
than comparing it at runtime and remove the pydantic default lambda
on all the model load fields.
This makes the code cleaner and establishes a clear config tree for
loading models.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-10 23:35:35 -04:00
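The defaults-dictionary idea in the commit above can be sketched roughly as follows. This is only an illustration, not the project's code: `MODEL_DEFAULTS`, `apply_load_defaults`, and the keys shown are hypothetical stand-ins for the model section of config.yml.

```python
from typing import Any, Dict

# Hypothetical defaults, standing in for values read from config.yml.
MODEL_DEFAULTS: Dict[str, Any] = {"max_seq_len": 4096, "cache_mode": "FP16"}


def apply_load_defaults(request_args: Dict[str, Any]) -> Dict[str, Any]:
    """Fill missing or None load arguments from the defaults dictionary."""
    merged = dict(MODEL_DEFAULTS)
    # Only explicitly set request values override a default.
    merged.update({k: v for k, v in request_args.items() if v is not None})
    return merged


# Both load paths call the same helper, so neither can skip the defaults.
endpoint_args = apply_load_defaults({"name": "my-model", "max_seq_len": 8192})
inline_args = apply_load_defaults({"name": "my-model", "max_seq_len": None})
```

Resolving the defaults once in a shared helper, rather than per field through a pydantic default lambda, is what would keep the two load paths consistent in a design like this.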
kingbri
7baef05b49
Transformers Utils: Fix file read
...
Use asynchronous JSON reading
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-10 22:41:39 -04:00
kingbri
62beb2b1c8
Config: Fetch the correct dict for draft_model and lora
...
Fixed fetching from the merged config instead of the sub-config
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-10 21:30:53 -04:00
kingbri
aa832b8627
Tree: Format
...
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-10 20:57:13 -04:00
kingbri
5e8ff9a004
Tree: Fix classmethod usage
...
Instead of self, use cls which passes a type of the class.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-10 20:52:29 -04:00
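For readers unfamiliar with the distinction the commit above refers to, here is a minimal sketch of a classmethod receiving `cls`. The `Settings` and `DevSettings` classes are hypothetical and do not come from the repository.

```python
class Settings:
    """Hypothetical class used only for illustration."""

    defaults = {"port": 5000}

    @classmethod
    def from_overrides(cls, overrides):
        # cls is the class the method was called on (possibly a subclass),
        # not an instance, so it can be used to construct one.
        instance = cls()
        instance.values = {**cls.defaults, **overrides}
        return instance


class DevSettings(Settings):
    defaults = {"port": 8000}


dev = DevSettings.from_overrides({"debug": True})
```

Because `cls` is bound to `DevSettings` in the call above, the subclass defaults are picked up without any extra code.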
kingbri
2c3bc71afa
Tree: Switch to asynchronous file handling
...
Using aiofiles, there's no longer a possibility of blocking file operations
that can hang up the event loop. In addition, partially migrate
classes to use asynchronous init instead of the normal Python magic method.
The only exception is config, since that's handled in the synchronous
init before the event loop starts.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-10 16:45:14 -04:00
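The "asynchronous init" pattern mentioned in the commit above is commonly written as an async factory classmethod. The sketch below is an assumption-laden illustration: `TemplateFile` and `create` are hypothetical names, and `asyncio.to_thread` is swapped in for aiofiles so that the example needs only the standard library.

```python
import asyncio
import json
import tempfile
from pathlib import Path


class TemplateFile:
    """Hypothetical class built through an async factory, not __init__."""

    def __init__(self, data: dict):
        # __init__ stays synchronous and does no I/O.
        self.data = data

    @classmethod
    async def create(cls, path: Path) -> "TemplateFile":
        # Assumption: asyncio.to_thread replaces aiofiles here. The blocking
        # read runs in a worker thread, so the event loop is not held up.
        text = await asyncio.to_thread(path.read_text, encoding="utf-8")
        return cls(json.loads(text))


async def main() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.json"
        path.write_text('{"name": "demo"}', encoding="utf-8")
        loaded = await TemplateFile.create(path)
        return loaded.data


result = asyncio.run(main())
```

A factory like this is needed because `__init__` cannot be awaited, which is presumably also why config, loaded before the event loop starts, stays synchronous.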
kingbri
54bfb770af
API: Fix template switch endpoint
...
Forwards a Path instead of a string and adheres to the new pathfinding
system.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-10 12:22:07 -04:00
kingbri
810cd40016
Start: Broadcast start_options only on first-time run
...
Prevents the save from occurring multiple times for no reason.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-10 12:19:54 -04:00
Brian Dashore
0586fc17cc
Merge pull request #197 from atisharma/issue_196
...
Fix tabby_config.py _from_file
2024-09-10 09:14:44 -04:00
Ati Sharma
a370aeb15f
Fix tabby_config.py _from_file
...
Update tabby_config.py to fix issue #196
2024-09-09 09:19:12 +01:00
Brian Dashore
c11461e22f
Merge pull request #195 from Cohee1207/fix-config-name
...
Properly specify config "inline_model_loading" value in the error message
2024-09-08 22:52:42 -04:00
kingbri
cf97113868
Dependencies: Update Exllamav2
...
v0.2.1
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-08 21:12:31 -04:00
Cohee
63476041d1
Properly specify config value in the error message
2024-09-08 22:02:49 +03:00
kingbri
d6ad17097c
Templates: Remove whitespace from metadata
...
Apparently setting variables also adds extraneous whitespace before
the template itself.
Doing {%- set stop_strings = ["string1"] -%} fixes this issue.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-08 12:36:36 -04:00
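The whitespace behaviour described in the commit above can be reproduced with a small Jinja2 sketch. The template strings here are made up for illustration, and only the `stop_strings` variable name comes from the commit message.

```python
from jinja2 import Environment

env = Environment()

# A plain set block leaves the newline that follows it in the output.
plain = env.from_string('{% set stop_strings = ["string1"] %}\nHello')

# The "-" markers strip whitespace on both sides of the block.
trimmed = env.from_string('{%- set stop_strings = ["string1"] -%}\nHello')

plain_output = plain.render()      # starts with a stray newline
trimmed_output = trimmed.render()  # no leading whitespace
```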
kingbri
776bfd817d
Templates: Migrate tool calling templates to folder
...
Mirrors the llm-prompt-templates repo
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-08 12:28:38 -04:00
kingbri
df11890851
Templating: Add loopcontrols extension
...
Inbuilt jinja extension to allow for break and continue in loops.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-08 12:21:42 -04:00
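As a brief illustration of what the loopcontrols extension enables, the sketch below turns it on in a standalone Jinja2 environment. The template and the `numbers` variable are hypothetical, and how the project itself configures its environment is not shown here.

```python
from jinja2 import Environment

# jinja2.ext.loopcontrols ships with Jinja2 and adds break and continue.
env = Environment(extensions=["jinja2.ext.loopcontrols"])

template = env.from_string(
    "{% for n in numbers %}"
    "{% if n == 2 %}{% continue %}{% endif %}"
    "{% if n == 4 %}{% break %}{% endif %}"
    "{{ n }}"
    "{% endfor %}"
)

# 2 is skipped by continue and the loop stops at 4.
output = template.render(numbers=[1, 2, 3, 4, 5])
```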
kingbri
dffceab777
Sampling: Link dry_range
...
Was not linked in the gen params dict.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-08 01:55:52 -04:00
Brian Dashore
0c74cd80ea
Merge pull request #191 from SecretiveShell/list-draft-models
...
fix function arguments for get_model_list
2024-09-07 22:29:05 -04:00
kingbri
acd3eb1140
Model: Add model folder template support
...
Like tabby_config.yml in the model's folder, a custom template can
also be provided via tabby_template.yml in addition to the existing
templates folder. The config.yml always takes priority.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-07 22:20:38 -04:00
kingbri
b576a2f116
API: Bump sent koboldcpp version
...
Unlock DRY on lite UI.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-07 21:45:51 -04:00
kingbri
9c4a0e650f
Sampling: Fix override for DRY sequence breakers
...
The common type should be an array of strings.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-07 21:38:50 -04:00
TerminalMan
4b11cabbec
debloat docker build
2024-09-08 00:02:00 +01:00
TerminalMan
d57a3b459c
fix function arguments for get_model_list
2024-09-07 18:27:10 +01:00
kingbri
4f5ca7a4c7
Sampling: Update overrides and params
...
Re-order to make more sense.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-07 12:48:59 -04:00
kingbri
ae37f3f332
Sampling: Update DRY
...
Switch to new parameters and remove dry_max_ngram as that's not supposed
to be changed.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-07 12:39:14 -04:00
kingbri
05c3f1194f
Sampling: Add rudimentary DRY support
...
Adds DRY support based on the current exl2 dev API. Only change for
optimization is dry_max_ngram instead of using a closed range.
Currently, DRY range is aliased to dry_max_ngram.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-07 00:48:42 -04:00
kingbri
d34756dc98
Tree: Format
...
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-05 18:05:59 -04:00
kingbri
2f45e978c5
API: Fix merge overwrite
...
The completions utils did not take the new imports.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-05 18:04:53 -04:00
Brian Dashore
ec7f64d530
Merge pull request #185 from SecretiveShell/refactor-config-loading
...
Refactor config loading
2024-09-05 18:00:32 -04:00
kingbri
1c9991f79e
Config: Format and organize
...
Rename some methods and change comments.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-05 17:59:18 -04:00
Jake
cb91670c7a
fix command line args
...
- move to a complete class singleton to avoid propagation errors
- remove legacy load config procedure
2024-09-05 15:33:00 +01:00
kingbri
98768bfa30
Docker: Re-add build block
...
If a user wants to build from source, let them. But the default
should fetch from the package registry.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-04 23:39:06 -04:00
kingbri
93872b34d7
Config: Migrate to global class instead of dicts
...
The config categories can have defined separation, but preserve
the dynamic nature of adding new config options by making all the
internal class vars as dictionaries.
This was necessary since storing global callbacks stored a state
of the previous global_config var that wasn't populated.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-04 23:18:47 -04:00
Brian Dashore
3bc9bd09a0
Merge pull request #180 from SecretiveShell/main
...
make docker-compose use prebuilt images
2024-09-04 21:48:18 -04:00
Brian Dashore
8524999284
Merge pull request #184 from SecretiveShell/Infinity-Embed-TODO
...
Complete conditional infinity import TODO
2024-09-04 21:47:49 -04:00
Brian Dashore
03ff472149
Merge pull request #130 from bartowski1182/main
...
WIP: Add 'model' argument to /v1/chat/completions to load a new model on the fly
2024-09-04 21:46:41 -04:00
kingbri
9c10789ca1
API: Error on invalid key permissions and cleanup format
...
If a user requesting a model change isn't admin, error.
Better to place the load function before the generate functions.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-04 21:44:14 -04:00
Jake
e772fa2981
Switch to internal dict merge implementation
...
- remove deepmerge dependency
- fix ruff formatting
2024-09-04 16:27:28 +01:00
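An internal replacement for a deep-merge dependency, as described in the commit above, might look roughly like the sketch below. The function name `merge_dicts` and the sample config keys are hypothetical, and the project's actual implementation may differ.

```python
from typing import Any, Dict


def merge_dicts(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge override into a copy of base; override wins."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are dicts, so merge them key by key.
            result[key] = merge_dicts(result[key], value)
        else:
            # Scalars, lists, and mismatched types are replaced outright.
            result[key] = value
    return result


file_config = {"network": {"host": "127.0.0.1", "port": 5000}, "model": {}}
cli_config = {"network": {"port": 8080}}
merged = merge_dicts(file_config, cli_config)
```

This version returns a new dictionary and leaves both inputs untouched, which avoids surprises when the same base config is merged more than once.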
Jake
42a42caf43
remove logging
...
- remove logging statements
- format code with ruff
2024-09-04 16:14:09 +01:00
Jake
ac4d9bba1c
refactor config functions
...
- improve DRY
2024-09-04 12:49:22 +01:00
Jake
fa6404a95a
refactor config loading
...
- improve DRY
- alter logging
- allow extensibility
- add foundation for environment variables as config
2024-09-04 12:22:49 +01:00
kingbri
21f14d4318
API: Update inline load
...
- Add a config flag
- Migrate support to /v1/completions
- Unify the load function
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-03 23:37:28 -04:00
kingbri
dd30d6592a
Merge branch 'main' of https://github.com/theroyallab/tabbyapi into inline
2024-09-03 18:03:17 -04:00
kingbri
8854269121
API: Fix current model list return
...
Check if the container actually exists in the match before returning
the value of the directory.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-09-01 10:54:01 -04:00
kingbri
4bf1a71d7b
Model: Fix model override application for draft args
...
These have to be merged beforehand and the updated version needs to be
re-fetched. It's possible to prevent the fetch of draft_args in the
beginning of init.
Signed-off-by: kingbri <bdashore3@proton.me >
2024-08-31 22:59:56 -04:00