mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-04-24 16:39:45 +00:00
On Zen4, PP-512 for L3-8B goes up from ~260 t/s to 288 t/s. TG-128 reaches maximum performance at 2 threads and is slightly higher than with 4 interleaved rows (14.48 t/s, vs. 13.11 t/s @ 2 threads and 14.28 t/s @ 4 threads for 4 interleaved rows).
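
To put the quoted figures in relative terms, here is a minimal sketch that converts them into percentage gains. The helper `speedup_percent` is hypothetical and not part of the repository. The "~260 t/s" baseline is taken as exactly 260, and the 4-thread figure for 4 interleaved rows is taken as 14.28 t/s; both are assumptions.

```python
def speedup_percent(new_tps: float, old_tps: float) -> float:
    """Relative throughput gain of new_tps over old_tps, in percent."""
    return (new_tps / old_tps - 1.0) * 100.0


# Throughput figures in tokens per second (Zen4, L3-8B), as quoted above.
# PP-512: the "~260 t/s" baseline is assumed to be exactly 260.
pp512_gain = speedup_percent(288.0, 260.0)

# TG-128: 14.48 t/s at 2 threads vs. 4 interleaved rows at 2 threads.
tg128_gain_2t = speedup_percent(14.48, 13.11)

# TG-128: same result vs. 4 interleaved rows at 4 threads
# (14.28 t/s is an assumed reading of the quoted figure).
tg128_gain_4t = speedup_percent(14.48, 14.28)

print(f"PP-512 gain:                 {pp512_gain:.1f}%")
print(f"TG-128 gain vs. 2 threads:   {tg128_gain_2t:.1f}%")
print(f"TG-128 gain vs. 4 threads:   {tg128_gain_4t:.1f}%")
```

Under these assumptions, the prompt-processing gain and the 2-thread token-generation gain are both on the order of 10%, while the margin over the 4-thread result for 4 interleaved rows is much smaller.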