ik_llama.cpp/ggml
Iwan Kawrakow e323a5bbb6 iq5_ks
180 t/s -> 359 t/s. iq5_ks_r4 is 210 t/s.

PPL is actually lower - 7.4160 vs 7.4494 for LlaMA-3.1-8B-Instruct
2025-06-17 10:44:07 +03:00