mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-25 15:44:10 +00:00
PP-512 is now 296 t/s. TG-128 is ~20% faster than iq2_bn for 1 thread, but saturates to about the same 93 t/s at 8 threads.
PP-512 is now 296 t/s. TG-128 is ~20% faster than iq2_bn for 1 thread, but saturates to about the same 93 t/s at 8 threads.