mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-03-11 14:30:02 +00:00
PP-512 goes to 533 t/s up from 455. TG-128 @ 2 threads goes to 16.6 t/s up from 14.2. However, we seem to have a bottleneck somewhere as TG saturates at 8 threads.