ik_llama.cpp/ggml/src
Kawrakow 6a56d5075d Faster prompt processing for IQ2_KS, IQ2_K, IQ2_K_R4 (#593)
* cuda: faster MMQ for iq2_ks, iq2_k, iq2_k_r4

* Lookup is still better for MMQ if we get 4 values at once

* Minor

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-07-08 19:44:48 +02:00