ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-02-25 07:34:10 +00:00

Files

Iwan Kawrakow a4ffe2e69e q8_KV: AVX2 gemm/gemv

We get 254 t/s for L3-8B vs 194 t/s for q8_0 without rtr.

2025-02-19 10:03:15 +02:00

.gitignore

2024-07-27 07:55:01 +02:00

CMakeLists.txt

2025-02-09 18:59:33 +02:00