ik_llama.cpp/ggml
Iwan Kawrakow 07d3b4caec iq6_k: CUDA dot product
90.2 t/s for LLaMA-3.1-8B. Q6_K gives 91.2 t/s, so we are good.
2024-08-07 19:24:09 +03:00