ik_llama.cpp/ggml
Kawrakow eae584dc98 Faster R4 quants on Zen4 (#139)
* q3_k_r4: faster Zen4

256.2 -> 272.7 t/s for PP-512

* q6_k_r4: faster Zen4

243.2 -> 261.3 t/s for PP-512

* q4_k_r4: slightly faster Zen4

262.4 t/s -> 268.1 t/s

* q5_k_r4: slightly faster Zen4

248.3 t/s -> 256.7 t/s

* iq4_xs_r4: slightly faster Zen4

256.8 t/s -> 272.0 t/s

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2024-12-13 15:47:59 +01:00