ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-05-26 07:58:42 +00:00

Files

Kawrakow 90e53a0b8b Adding IQ5_KS - 5.25 bpw quants (#422)

* iq5_ks: basics

* iq5_ks: quantize

* iq5_ks: CUDA dequantize works

* iq5_ks: dot product works on CUDA

* iq5_ks: MMQ works

* iq5_ks: Zen4

* iq5_ks: AVX2

But it is not quite right, just like iq4_k, iq5_k, iq6_k, iq4_ks.
All these need fixing on AVX2.

* iq5_ks: NEON

* iq5_ks: Metal dequantize

* iq5_ks: Metal dot product

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2025-05-15 16:02:39 +03:00

llama.h

Adding IQ5_KS - 5.25 bpw quants (#422)

2025-05-15 16:02:39 +03:00
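
The "5.25 bpw" in the commit title is a storage rate in bits per weight. As a rough sketch of what that rate implies for tensor size, the helper below converts a weight count into bytes. The name quant_bytes and the fixed-point parameter bpw_x100 are hypothetical and not part of llama.h, and any per-row or per-tensor metadata is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to store n_weights at a given bits-per-weight rate.
 * The rate is passed in hundredths of a bit (5.25 bpw -> 525) so the
 * calculation stays in integer arithmetic. The result is rounded up
 * to a whole byte. Hypothetical helper, for illustration only. */
static uint64_t quant_bytes(uint64_t n_weights, uint64_t bpw_x100) {
    assert(bpw_x100 > 0);
    /* total bits = n_weights * bpw_x100 / 100; bytes = bits / 8 */
    return (n_weights * bpw_x100 + 799) / 800;
}
```

Under these assumptions, quant_bytes(256, 525) gives 168 bytes for a group of 256 weights, and quant_bytes(7000000000ULL, 525) gives 4593750000 bytes (about 4.6 GB) for a hypothetical model with 7 billion quantized weights.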