ik_llama.cpp/examples/quantize-stats
Iwan Kawrakow 36e9c922b8 iq2_kt - this is better
Using blocks of 32 and 16 bits per group of 8 weights,
it beats iq2_xxs in terms of PPL by a significant margin.
It is 0.0625 bpw larger, but even if we go to 15 bits per
group of 8 (so 0.0625 bpw less than iq2_xxs), PPL is still
lower.
2024-11-21 08:16:41 +02:00
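The 0.0625 bpw deltas quoted in the commit message can be sanity-checked with a quick sketch. The 4 bits of per-block-of-32 metadata assumed below is not stated in the commit; it is a hypothetical overhead inferred from the quoted deltas and the known 2.0625 bpw size of iq2_xxs.

```python
# Hypothetical bpw accounting for the iq2_kt scheme described above.
# Assumption: besides the 16 (or 15) payload bits per group of 8 weights,
# each block of 32 weights carries 4 bits of scale metadata. That value is
# inferred from the stated 0.0625 bpw deltas, not from the source.

def bpw(bits_per_group: int, group_size: int = 8,
        block_size: int = 32, block_meta_bits: int = 4) -> float:
    """Bits per weight = group payload plus amortized per-block metadata."""
    return bits_per_group / group_size + block_meta_bits / block_size

IQ2_XXS_BPW = 2.0625  # size of iq2_xxs

print(bpw(16))  # 2.125 -> 0.0625 bpw larger than iq2_xxs
print(bpw(15))  # 2.0   -> 0.0625 bpw smaller than iq2_xxs
```

Under this assumption the arithmetic matches the message: 16 bits per group gives 2.125 bpw (0.0625 above iq2_xxs), and dropping to 15 bits gives exactly 2.0 bpw (0.0625 below).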