ik_llama.cpp/ggml
Iwan Kawrakow 4d730ebfd9 iq2_bn_r4: use AVX2 implementation on Zen4 for matrix x vector
It is faster - we get 29.6 t/s at 1 thread vs 25.9 t/s for iq2_bn.
2024-12-06 07:12:16 +02:00