mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-04-28 02:11:50 +00:00
We get PP-512 = 9600 t/s, TG-128 = 234 t/s (but we need to use 8 CPU threads, else results are lower, so clearly there is something being computed on the CPU). PP-512 is very close to PP-512(fp16) = 9800 t/s
119 KiB
119 KiB