mirror of
https://github.com/theroyallab/tabbyAPI.git
synced 2026-03-14 15:57:27 +00:00
Model: Remove num_experts_per_token
This shouldn't even be an exposed option since changing it always breaks inference with the model. Let the model's config.json handle it.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
@@ -194,11 +194,6 @@
 " # NOTE: Only works with chat completion message lists!\n",
 " prompt_template: {PromptTemplate}\n",
 "\n",
-" # Number of experts to use per token. Loads from the model's config.json if not specified (default: None)\n",
-" # WARNING: Don't set this unless you know what you're doing!\n",
-" # NOTE: For MoE models (ex. Mixtral) only!\n",
-" num_experts_per_token: {NumExpertsPerToken}\n",
-"\n",
 " # Options for draft models (speculative decoding). This will use more VRAM!\n",
 " draft:\n",
 " # Overrides the directory to look for draft (default: models)\n",
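For context on "Let the model's config.json handle it": the sketch below shows roughly how an experts-per-token value could be read from a model directory rather than from a user-facing option. It is not tabbyAPI's loader code. The helper name `read_experts_per_token` is hypothetical, and the key `num_experts_per_tok` is assumed from Mixtral-style Hugging Face configs; other MoE architectures may use a different key.

```python
import json
from pathlib import Path
from typing import Optional


def read_experts_per_token(model_dir: str) -> Optional[int]:
    """Return the experts-per-token value from config.json, or None if absent.

    Hypothetical helper for illustration only; not part of tabbyAPI.
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)

    # Assumption: Mixtral-style configs store the value under
    # "num_experts_per_tok". Dense (non-MoE) models do not define it,
    # so .get() returns None for them.
    return config.get("num_experts_per_tok")
```

Because the value ships with the model, there is nothing for the user to set, which matches the commit's reasoning for dropping the option from the generated config.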