Add Q6 and Q8 cache options to eval scripts

turboderp
2024-06-09 02:13:06 +02:00
parent f3596fc0d9
commit 675450d845
3 changed files with 18 additions and 2 deletions

@@ -58,6 +58,10 @@ prefix for the response.
performance.
- **-cq4 / --cache_q4**: Use Q4 cache
- **-cq6 / --cache_q6**: Use Q6 cache
- **-cq8 / --cache_q8**: Use Q8 cache
## MMLU
@@ -83,3 +87,7 @@ the full list of subjects.
performance.
- **-cq4 / --cache_q4**: Use Q4 cache
- **-cq6 / --cache_q6**: Use Q6 cache
- **-cq8 / --cache_q8**: Use Q8 cache
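For illustration only, the three cache flags documented above could be wired into an eval script roughly as follows. This is a minimal sketch using only `argparse`, not the code from this commit: the helper names `build_parser` and `select_cache_mode`, the precedence order when several flags are passed, and the `"FP16"` fallback label are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the flag names and help strings come from the docs above.
    parser = argparse.ArgumentParser(description="Sketch of eval-script cache options")
    parser.add_argument("-cq4", "--cache_q4", action="store_true", help="Use Q4 cache")
    parser.add_argument("-cq6", "--cache_q6", action="store_true", help="Use Q6 cache")
    parser.add_argument("-cq8", "--cache_q8", action="store_true", help="Use Q8 cache")
    return parser


def select_cache_mode(args: argparse.Namespace) -> str:
    # Assumption: the first flag set wins, checked in Q4, Q6, Q8 order.
    if args.cache_q4:
        return "Q4"
    if args.cache_q6:
        return "Q6"
    if args.cache_q8:
        return "Q8"
    # Assumption: an unquantized cache is used when no flag is given.
    return "FP16"


if __name__ == "__main__":
    cli_args = build_parser().parse_args()
    print(select_cache_mode(cli_args))
```

In a real script the returned label would be replaced by whichever cache class the library provides for that quantization level; the string is used here only to keep the sketch self-contained.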