ik_llama.cpp/ggml/src/iqk
Kawrakow b5f2f00106 Much faster prompt processing for IQ1_S and IQ1_M on ARM_NEON (#553)
* iq1_s

66.3 t/s -> 168.8 t/s.

* iq1_m

19 t/s -> 163 t/s.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-06-24 14:21:37 +02:00