mirror of https://github.com/turboderp-org/exllamav2.git, synced 2026-04-30 03:01:23 +00:00
Add Q6 and Q8 cache options to eval scripts
@@ -58,6 +58,10 @@ prefix for the response.
performance.
- **-cq4 / --cache_q4**: Use Q4 cache
- **-cq6 / --cache_q6**: Use Q6 cache
- **-cq8 / --cache_q8**: Use Q8 cache
## MMLU
@@ -83,3 +87,7 @@ the full list of subjects.
performance.
- **-cq4 / --cache_q4**: Use Q4 cache
- **-cq6 / --cache_q6**: Use Q6 cache
- **-cq8 / --cache_q8**: Use Q8 cache
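
The three cache flags listed above could be wired into a script roughly as follows. This is a minimal sketch, not the eval scripts' actual code: `build_parser` and `select_cache_mode` are hypothetical helpers, the returned mode strings are placeholders for whatever cache class a script would instantiate, and both the FP16 default and the Q4-over-Q6-over-Q8 precedence are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser that declares only the three cache flags from the list above.
    parser = argparse.ArgumentParser(description="Cache option sketch")
    parser.add_argument("-cq4", "--cache_q4", action="store_true", help="Use Q4 cache")
    parser.add_argument("-cq6", "--cache_q6", action="store_true", help="Use Q6 cache")
    parser.add_argument("-cq8", "--cache_q8", action="store_true", help="Use Q8 cache")
    return parser


def select_cache_mode(args: argparse.Namespace) -> str:
    # Assumption: if several flags are passed, the lowest bit width wins.
    if args.cache_q4:
        return "Q4"
    if args.cache_q6:
        return "Q6"
    if args.cache_q8:
        return "Q8"
    # Assumption: with no flag set, an unquantized FP16 cache is used.
    return "FP16"
```

Calling `select_cache_mode(build_parser().parse_args(["-cq6"]))` selects the Q6 mode; a real script would map that result to a cache object rather than a string.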