tabbyAPI/backends/exllamav2
kingbri 9a007c4707 Model: Add support for Q4 cache
Add a Q4 cache option alongside the existing 8-bit and 16-bit caches.
Passing "Q4" in the cache_mode request parameter selects it on model load.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-06 00:59:28 -05:00
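
As a rough illustration of the commit message above, the following sketch builds a JSON body for a model-load request that selects the Q4 cache. Only the `cache_mode` parameter and the value "Q4" come from the commit message. The helper `build_load_request`, the `name` field, and the "FP16" and "FP8" labels for the 16-bit and 8-bit caches are assumptions made for the example, not confirmed parts of the tabbyAPI API.

```python
import json

# Assumed labels: only "Q4" is confirmed by the commit message.
# "FP16" and "FP8" are hypothetical names for the 16-bit and 8-bit caches.
SUPPORTED_CACHE_MODES = ("FP16", "FP8", "Q4")


def build_load_request(model_name: str, cache_mode: str = "FP16") -> str:
    """Return a JSON body for a hypothetical model-load request.

    The "name" field is an assumption; "cache_mode" is the request
    parameter named in the commit message.
    """
    if cache_mode not in SUPPORTED_CACHE_MODES:
        raise ValueError(f"Unsupported cache_mode: {cache_mode!r}")
    return json.dumps({"name": model_name, "cache_mode": cache_mode})


if __name__ == "__main__":
    # Request the Q4 cache when loading a placeholder model name.
    print(build_load_request("my-model", cache_mode="Q4"))
```

Validating the value on the client side before sending is a design choice for this sketch; the server may accept other values or apply its own checks.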