mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-03-07 12:30:08 +00:00
For LLaMA-3.1-8B we get PP-512 = 106.2 t/s and TG-128 = 36.02 t/s; the TG-128 figure is ~10% higher than for q2_K_S.
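
To make the "~10% higher" comparison concrete, here is a minimal sketch of the arithmetic. The helper names `relative_gain` and `implied_baseline` are hypothetical and not part of the repository, and because "~10%" is a rounded figure, the q2_K_S throughput derived below is only an estimate, not a measured result.

```python
def relative_gain(new_tps: float, baseline_tps: float) -> float:
    """Fractional throughput gain of new over baseline (0.10 means 10% higher)."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return new_tps / baseline_tps - 1.0


def implied_baseline(new_tps: float, gain: float) -> float:
    """Baseline throughput implied by a measured value and a fractional gain."""
    if gain <= -1.0:
        raise ValueError("gain must be greater than -1")
    return new_tps / (1.0 + gain)


tg128_tps = 36.02    # TG-128 figure quoted in the text
approx_gain = 0.10   # "~10%" from the text; rounded, so the result is approximate

baseline = implied_baseline(tg128_tps, approx_gain)
print(f"implied q2_K_S TG-128: ~{baseline:.1f} t/s")
```

Under these assumptions the implied q2_K_S TG-128 rate is roughly 32.7 t/s; an exact comparison would need the actual q2_K_S measurement taken on the same hardware.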