ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-05-11 08:30:19 +00:00

Iwan Kawrakow 07d3b4caec iq6_k: CUDA dot product

90.2 t/s for LLaMA-3.1-8B. Q6_K gives 91.2 t/s, so we are good.

2024-08-07 19:24:09 +03:00

iq6_k: CUDA dot product

2024-08-07 19:24:09 +03:00

.gitignore  2024-07-27 07:55:01 +02:00

CMakeLists.txt  2024-07-27 07:55:01 +02:00