Model: Fix max seq len handling

Previously, the max sequence length was overridden by the user's
config and never took the model's config.json into account.

Now, the default is 4096, but config.prepare is consulted when
selecting the max sequence length. The YAML and API request
now serve as overrides rather than required parameters.
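The precedence described above can be sketched as a small helper. This is a hypothetical illustration, not the actual code from this commit: `select_max_seq_len` and its parameter names are invented here to show the fallback order (user override, then the model's config.json value, then the 4096 default).

```python
# Hypothetical sketch of the precedence logic described in this commit.
DEFAULT_MAX_SEQ_LEN = 4096

def select_max_seq_len(model_config_value=None, user_override=None):
    """Pick the max sequence length.

    Precedence: an explicit override from the YAML config or API
    request wins; otherwise fall back to the value read from the
    model's config.json; otherwise use the 4096 default.
    """
    if user_override is not None:
        return user_override
    if model_config_value is not None:
        return model_config_value
    return DEFAULT_MAX_SEQ_LEN
```

With this ordering, a user who leaves `max_seq_len` unset in the YAML still gets the model's native context length instead of having it silently clamped.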

Signed-off-by: kingbri <bdashore3@proton.me>
Author: kingbri
Date: 2023-12-19 23:37:52 -05:00
parent d3246747c0
commit ce2602df9a
3 changed files with 17 additions and 6 deletions


@@ -37,8 +37,8 @@ model:
   # The below parameters apply only if model_name is set
-  # Maximum model context length (default: 4096)
-  max_seq_len: 4096
+  # Override maximum model context length (default: None)
+  max_seq_len:
   # Automatically allocate resources to GPUs (default: True)
   gpu_split_auto: True