ik_llama.cpp/examples/quantize/quantize.cpp
Iwan Kawrakow 93a85c62bb q8_k_r8: fastest matrix multiplication known to human kind
We get PP-512(LLaMA-3.1-8B) = 370 t/s on a Ryzen-7950X!
2024-12-13 18:21:08 +02:00

24 KiB