tabbyAPI/backends
kingbri 116cf56c87 Model: Auto-round cache size on init
Cache size must be a multiple of 256 to work properly in ExllamaV2.
Take the config value and, if it isn't already a multiple of 256,
round it up to the next multiple: drop the remainder of the cache
size divided by 256, then add one multiple of 256.

The value is rounded up rather than down because cache size can never
be lower than max_seq_len. Even if max_seq_len isn't a multiple of 256,
rounding up will never yield a number that's lower than max_seq_len,
since max_seq_len is no longer the source of truth for the cache size.
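
The rounding described above can be sketched roughly as follows. This is
a minimal illustration, not the code from this commit: the helper name
round_cache_size and its multiple parameter are hypothetical, and the
sample max_seq_len value is chosen only to show a non-multiple of 256.

```python
def round_cache_size(cache_size: int, multiple: int = 256) -> int:
    """Round cache_size up to the next multiple (hypothetical helper)."""
    remainder = cache_size % multiple
    if remainder == 0:
        # Already aligned, so leave the configured value alone.
        return cache_size
    # Drop the remainder, then add one full multiple (rounds up).
    return cache_size - remainder + multiple


# Example value that is not a multiple of 256 (assumption for illustration).
max_seq_len = 4000
cache_size = round_cache_size(max_seq_len)

# Rounding up never produces a value below the input.
print(cache_size, cache_size >= max_seq_len)  # 4096 True
```

Because the result is always greater than or equal to the input, a
configured cache size that starts at or above max_seq_len stays at or
above it after rounding.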

Signed-off-by: kingbri <bdashore3@proton.me>
2024-05-26 21:24:54 -04:00