ik_llama.cpp/examples/quantize-stats
Iwan Kawrakow 36e9c922b8 iq2_kt - this is better
Using blocks of 32 and 16 bits per group of 8 weights,
it beats iq2_xxs in terms of PPL by a significant margin.
It is 0.0625 bpw larger, but even if we go to 15 bits per
group of 8 (so 0.0625 bpw less than iq2_xxs), PPL is still
lower.
2024-11-21 08:16:41 +02:00
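The 0.0625 bpw deltas quoted in the commit message can be sanity-checked with a quick sketch. The 4 bits of per-block-of-32 metadata assumed below is not stated in the commit; it is a hypothetical overhead inferred from the quoted deltas and the known 2.0625 bpw size of iq2_xxs.

```python
# Hypothetical bpw accounting for the iq2_kt scheme described above.
# Assumption: besides the 16 (or 15) payload bits per group of 8 weights,
# each block of 32 weights carries 4 bits of scale metadata. That value is
# inferred from the stated 0.0625 bpw deltas, not from the source.

def bpw(bits_per_group: int, group_size: int = 8,
        block_size: int = 32, block_meta_bits: int = 4) -> float:
    """Bits per weight = group payload plus amortized per-block metadata."""
    return bits_per_group / group_size + block_meta_bits / block_size

IQ2_XXS_BPW = 2.0625  # size of iq2_xxs

print(bpw(16))  # 2.125 -> 0.0625 bpw larger than iq2_xxs
print(bpw(15))  # 2.0   -> 0.0625 bpw smaller than iq2_xxs
```

Under this assumption the arithmetic matches the message: 16 bits per group gives 2.125 bpw (0.0625 above iq2_xxs), and dropping to 15 bits gives exactly 2.0 bpw (0.0625 below).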