ik_llama.cpp/ggml
Iwan Kawrakow 1469b22035 iq4_kss: AVX2
Bad, but better than I expected.
PP-512(LLaMA-3.1-8B) = 167 t/s on the Ryzen-5950X.
I.e., with 32 AVX2 threads we get the performance of
16 Zen4 threads.
2024-10-16 14:14:00 +03:00