ik_llama.cpp

Mirror of https://github.com/ikawrakow/ik_llama.cpp.git, synced 2026-04-26 09:29:27 +00:00

Files

Iwan Kawrakow 78411343cc New iq4_kt: AVX2 dot product finally works

We get 13.6 t/s vs 8.4 t/s with the f16 trellis and f32 arithmetic.
Still somewhat slower than other quants, but no longer pathetic.

2025-06-18 15:34:27 +03:00

.gitignore      2024-07-27 07:55:01 +02:00
CMakeLists.txt  2025-06-12 19:25:11 +03:00