ik_llama.cpp/ggml
Kawrakow 1b05210904 IQ4_KSS improvements (#642)
* iq4_kss: slightly better quantization

* iq4_kss: CUDA MMQ

* iq4_kss: repack/convert to q8_k_r8 (AVX2)

* iq4_kss: repack/convert to q8_k_r8 (NEON)

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-07-23 20:50:57 +02:00