ik_llama.cpp/include/llama.h
Iwan Kawrakow 93a85c62bb q8_k_r8: fastest matrix multiplication known to human kind
We get PP-512(LLaMA-3.1-8B) = 370 t/s on a Ryzen-7950X!
2024-12-13 18:21:08 +02:00

59 KiB