ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-03-03 18:40:14 +00:00

Files

Iwan Kawrakow 050bdfa101 iq6_k: CUDA dot product

90.2 t/s for LLaMA-3.1-8B. Q6_K gives 91.2 t/s, so we are good.

2024-08-09 16:00:31 +02:00

.gitignore        2024-07-27 07:55:01 +02:00
CMakeLists.txt    2024-07-27 07:55:01 +02:00