ik_llama.cpp/ggml
Kawrakow 97c34f4056 Faster prompt processing for IQ2_KS, IQ2_K, IQ2_K_R4 (#593)
* cuda: faster MMQ for iq2_ks, iq2_k, iq2_k_r4

* Lookup is still better for MMQ if we get 4 values at once

* Minor

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-07-08 19:44:48 +02:00