mirror of https://github.com/theroyallab/tabbyAPI.git (synced 2026-03-14 15:57:27 +00:00)
Model: Correct exl3 generation, add concurrency, and cleanup
Fixes application of sampler parameters by adding a new sampler builder interface. Also expose the generator class-wide and add wait_for_jobs. Finally, allow inline loading to specify the backend.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
@@ -163,8 +163,10 @@ class ModelConfig(BaseConfigModel):
             "Example: ['max_seq_len', 'cache_mode']."
         ),
     )
+
+    # Defaults to exllamav2 in common/model.py
     backend: Optional[str] = Field(
-        "exllamav2",
+        None,
         description=(
             "Backend to use for this model (default: exllamav2)\n"
             "Options: exllamav2, exllamav3",
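The hunk above changes the config default for `backend` to `None` and notes that the fallback to exllamav2 happens in common/model.py. A minimal sketch of that pattern follows, assuming an inline request takes precedence over the config value. The names `ModelConfigSketch`, `resolve_backend`, `DEFAULT_BACKEND`, and `SUPPORTED_BACKENDS` are hypothetical and not taken from tabbyAPI, and a stdlib dataclass stands in for the project's pydantic model.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names for illustration; not taken from tabbyAPI.
DEFAULT_BACKEND = "exllamav2"
SUPPORTED_BACKENDS = ("exllamav2", "exllamav3")


@dataclass
class ModelConfigSketch:
    # None means "not specified"; the loader applies the default later.
    backend: Optional[str] = None


def resolve_backend(config: ModelConfigSketch, inline: Optional[str] = None) -> str:
    """Pick a backend: inline request first, then config, then the default."""
    chosen = (inline or config.backend or DEFAULT_BACKEND).lower()
    if chosen not in SUPPORTED_BACKENDS:
        raise ValueError(f"Unknown backend: {chosen}")
    return chosen
```

Keeping the config default at `None` lets the loader tell "unset" apart from an explicit "exllamav2", which is one plausible reason to move the default out of the config model.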