Files
ik_llama.cpp/src
Iwan Kawrakow 3da565c9c9 iq1_kt: CUDA dequantize
Testing with LLaMA-3.1-8B-Instruct, we get almost the same PPL
as iq2_xxs while using about 0.2 bpw fewer, i.e. the same quality at a lower bit budget.
2025-07-16 16:55:20 +03:00