ik_llama.cpp/iqk_mul_mat.cpp
Iwan Kawrakow 309e32405f iqk_mul_mat: AVX2 implementation for iq2_xs
We get a 2.19X speedup for PP-512 (118.9 t/s). TG is mostly OK
(slightly better @ 4 threads, slightly worse @ 16 threads).
2024-06-22 12:02:49 +03:00

141 KiB