mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-25 15:44:10 +00:00
PP-512(LLaMA-3.1-8B) = 78 t/s vs 57.6 t/s for q6_K. TG-128 is slightly lower rthan q6_K for low number of threads, becomes very slightly better at 8 threads.