Config: Expose auto GPU split reserve config

The GPU reserve is used as a VRAM buffer to prevent GPU overflow
when automatically deciding how to load a model on multiple GPUs.
Make this configurable.

Signed-off-by: kingbri <bdashore3@proton.me>
kingbri
2024-02-08 22:08:51 -05:00
parent 43bba526bf
commit 2f568ff573
3 changed files with 26 additions and 10 deletions


@@ -76,6 +76,10 @@ model:
# NOTE: Not parsed for single GPU users
#gpu_split_auto: True
# Reserve VRAM used for autosplit loading (default: 96 MB on GPU 0)
# This is represented as an array of MB per GPU used
#autosplit_reserve: [96]
# An array of GBs of VRAM to split between GPUs (default: [])
# NOTE: Not parsed for single GPU users
#gpu_split: [20.6, 24]
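
As a rough illustration of how a per-GPU reserve given in MB might be consumed, the sketch below converts the `autosplit_reserve` list into byte counts and falls back to 96 MB on GPU 0 when the key is unset. The helper name `reserve_bytes`, the plain-dict config shape, and the use of 1024**2 bytes per MB are assumptions made for this example, not code taken from the commit.

```python
# Assumption: "MB" in the config comment means 1024**2 bytes.
BYTES_PER_MB = 1024 ** 2


def reserve_bytes(model_config, default_mb=(96,)):
    """Hypothetical helper: turn autosplit_reserve (MB per GPU) into bytes per GPU.

    model_config is assumed to be the parsed `model:` block as a plain dict.
    A missing, None, or empty value falls back to default_mb.
    """
    reserve_mb = model_config.get("autosplit_reserve") or list(default_mb)
    return [int(mb * BYTES_PER_MB) for mb in reserve_mb]


if __name__ == "__main__":
    # Key left commented out in the config: only the default for GPU 0 applies.
    print(reserve_bytes({}))
    # Explicit values: reserve 96 MB on GPU 0 and nothing on GPU 1.
    print(reserve_bytes({"autosplit_reserve": [96, 0]}))
```

Keeping the value as a list mirrors the config comment ("an array of MB per GPU used"), so each GPU can hold back a different amount.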