Config: Update Q4 in comments

Wasn't present when the option was added.

Signed-off-by: kingbri <bdashore3@proton.me>
2026-03-14 15:57:27 +00:00 · 2024-03-17 01:04:12 -04:00
parent 14d8ec2007
commit 7abbac098a
1 changed file with 2 additions and 1 deletion
--- a/config_sample.yml
+++ b/config_sample.yml
@@ -103,7 +103,8 @@ model:
  # Disable Flash-attention 2. Set to True for GPUs lower than Nvidia's 3000 series. (default: False)
  #no_flash_attention: False

-  # Enable 8 bit cache mode for VRAM savings (slight performance hit). Possible values FP16, FP8. (default: FP16)
+  # Enable 8 bit cache mode for VRAM savings (slight performance hit).
+  # Possible values FP16, FP8, Q4. (default: FP16)
  #cache_mode: FP16

  # Set the prompt template for this model. If empty, chat completions will be disabled. (default: Empty)