ik_llama.cpp/ggml
Iwan Kawrakow a16f961033 Repack q4_0 and q8_0 to q8_0_R8
q8_0 is fine, but I observe a very significant perplexity (PPL)
increase for q4_0. Best guess: precision loss in the 32-bit <-> 16-bit
scale conversions.
2025-06-18 08:46:47 +03:00
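
The commit message suspects that converting block scales between 32-bit and 16-bit floats loses precision. The sketch below is not the repacking code from this commit; it only illustrates, with Python's standard `struct` module, how much a scale can change when it is rounded to IEEE 754 half precision and back. The helper names and the sample scale values are hypothetical and chosen purely for illustration.

```python
import struct

def to_fp32(x: float) -> float:
    """Round a Python float (64-bit) to IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fp16_round_trip(x: float) -> float:
    """Round x to IEEE 754 half precision and convert it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def relative_error(scale: float) -> float:
    """Relative change of a 32-bit scale after a 16-bit round trip."""
    s32 = to_fp32(scale)
    return abs(fp16_round_trip(s32) - s32) / abs(s32)

# Hypothetical block scales (assumed values, not taken from any model).
scales = [0.0123456, 0.1, 1.0 / 3.0, 0.00071]
errors = [relative_error(s) for s in scales]
worst = max(errors)

for s, e in zip(scales, errors):
    print(f"scale={s:.7f}  fp16 round trip={fp16_round_trip(to_fp32(s)):.7f}  rel. error={e:.2e}")
```

For values in the normal half-precision range, each round trip changes a scale by at most a relative 2^-11 (about 0.05%). Whether an error of that size is enough to explain the q4_0 perplexity increase is exactly the open question in the commit message; the sketch does not settle it.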