ik_llama.cpp/ggml/src
Iwan Kawrakow 050bdfa101 iq6_k: CUDA dot product
90.2 t/s for LLaMA-3.1-8B. Q6_K gives 91.2 t/s, so we are good.
2024-08-09 16:00:31 +02:00