ik_llama.cpp/ggml
Iwan Kawrakow 366441bd75 iq3_k: faster CUDA dot product
138 t/s for LLaMA-3.1-8B, which is almost on par with iq3_s.
2024-07-30 17:18:31 +03:00