ik_llama.cpp/include
Iwan Kawrakow 3da565c9c9 iq1_kt: CUDA dequantize
Testing with LLaMA-3.1-8B-Instruct, we get almost the same PPL
as iq2_xxs, so about 0.2 fewer bits per weight for the same quality.
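As a rough sanity check of what 0.2 fewer bits per weight means in practice, here is a small sketch of the arithmetic. The 8B parameter count comes from LLaMA-3.1-8B-Instruct named above; the calculation ignores quantization block overhead and non-quantized tensors, so it is an approximation, not a measurement from the repository.

```python
# Approximate file-size savings from shaving 0.2 bits per weight (bpw)
# on an ~8B-parameter model. Ignores block/scale overhead and any
# tensors kept at higher precision, so treat this as a ballpark figure.
params = 8e9        # LLaMA-3.1-8B-Instruct parameter count (approx.)
bpw_saved = 0.2     # savings of iq1_kt vs iq2_xxs per the commit message
bytes_saved = params * bpw_saved / 8  # 8 bits per byte
print(f"~{bytes_saved / 1e9:.1f} GB smaller at the same quality")
```

So the claim amounts to roughly 0.2 GB off an 8B model's quantized size for comparable perplexity.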
2025-07-16 16:55:20 +03:00