ik_llama.cpp/include
Iwan Kawrakow 3da565c9c9 iq1_kt: CUDA dequantize
Testing with LLaMA-3.1-8B-Instruct, we get almost the same PPL
as iq2_xxs, so about 0.2 fewer bits per weight for the same quality.
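As a rough sanity check of what 0.2 fewer bits per weight means in practice, here is a small sketch of the arithmetic. The 8B parameter count comes from LLaMA-3.1-8B-Instruct named above; the calculation ignores quantization block overhead and non-quantized tensors, so it is an approximation, not a measurement from the repository.

```python
# Approximate file-size savings from shaving 0.2 bits per weight (bpw)
# on an ~8B-parameter model. Ignores block/scale overhead and any
# tensors kept at higher precision, so treat this as a ballpark figure.
params = 8e9        # LLaMA-3.1-8B-Instruct parameter count (approx.)
bpw_saved = 0.2     # savings of iq1_kt vs iq2_xxs per the commit message
bytes_saved = params * bpw_saved / 8  # 8 bits per byte
print(f"~{bytes_saved / 1e9:.1f} GB smaller at the same quality")
```

So the claim amounts to roughly 0.2 GB off an 8B model's quantized size for comparable perplexity.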
2025-07-16 16:55:20 +03:00