ik_llama.cpp/ggml
Iwan Kawrakow 0d7885f081 q8_KV: Better Zen4 gemm
We get 225.7 t/s for L3-8B. In comparison, q8_0 without
run-time repacking is at 169 t/s.
2025-02-19 10:03:15 +02:00