ik_llama.cpp/include
Iwan Kawrakow fad847d753 Adding q5_0_r4
We get PP-512(LLaMA-3.1-8B) = 256.7 t/s on a Ryzen-7950X.
We even get a TG-128 improvement, from 11.1 t/s to 11.7 t/s.
2024-12-03 11:29:57 +02:00