Mirror of https://github.com/ikawrakow/ik_llama.cpp.git, synced 2026-03-09 13:30:17 +00:00
We get PP-512 = 167 t/s for L3-8B without interleaving the quants ahead of time! The interleaving is done on the fly, so I wonder whether the same could be done for other quants as well.
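For readers unfamiliar with the idea, here is a minimal sketch of what interleaving on the fly can look like. It is not the kernel from this commit. The block layout (`Block`, `Block4`, `kBlockSize`), the group size of 4 rows, and all function names are hypothetical and chosen only for illustration. The weights stay in their ordinary row-major layout, and each group of 4 row blocks is interleaved into a small temporary just before it is used, so the inner loop reads one contiguous buffer for all 4 rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy block: one float scale plus 8 signed 8-bit quants.
// Illustrative only; this is not a real ggml block layout.
constexpr int kBlockSize = 8;
struct Block {
    float  d;              // per-block scale
    int8_t q[kBlockSize];  // quantized values
};

// One block from each of 4 consecutive rows, with the quants interleaved
// so that q[4*j + r] holds quant j of row r.
struct Block4 {
    float  d[4];
    int8_t q[4 * kBlockSize];
};

// Interleave block column `ib` of rows row0..row0+3 at the time of use.
inline Block4 interleave4(const std::vector<Block>& w, size_t blocks_per_row,
                          size_t row0, size_t ib) {
    Block4 out;
    for (int r = 0; r < 4; ++r) {
        const Block& b = w[(row0 + r) * blocks_per_row + ib];
        out.d[r] = b.d;
        for (int j = 0; j < kBlockSize; ++j) out.q[4 * j + r] = b.q[j];
    }
    return out;
}

// y = W * x, where W is stored row-major and NOT pre-interleaved.
// Four rows are processed together using the temporary interleaved block.
std::vector<float> matvec_interleaved(const std::vector<Block>& w, size_t n_rows,
                                      size_t blocks_per_row,
                                      const std::vector<float>& x) {
    assert(n_rows % 4 == 0);
    assert(w.size() == n_rows * blocks_per_row);
    assert(x.size() == blocks_per_row * kBlockSize);
    std::vector<float> y(n_rows, 0.0f);
    for (size_t row0 = 0; row0 < n_rows; row0 += 4) {
        for (size_t ib = 0; ib < blocks_per_row; ++ib) {
            const Block4 b4 = interleave4(w, blocks_per_row, row0, ib);
            float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
            for (int j = 0; j < kBlockSize; ++j) {
                const float xv = x[ib * kBlockSize + j];
                // The 4 quants that multiply xv are adjacent in memory.
                for (int r = 0; r < 4; ++r) acc[r] += xv * b4.q[4 * j + r];
            }
            for (int r = 0; r < 4; ++r) y[row0 + r] += b4.d[r] * acc[r];
        }
    }
    return y;
}

// Plain row-by-row reference, used only to check the result.
std::vector<float> matvec_reference(const std::vector<Block>& w, size_t n_rows,
                                    size_t blocks_per_row,
                                    const std::vector<float>& x) {
    std::vector<float> y(n_rows, 0.0f);
    for (size_t row = 0; row < n_rows; ++row) {
        for (size_t ib = 0; ib < blocks_per_row; ++ib) {
            const Block& b = w[row * blocks_per_row + ib];
            float acc = 0.0f;
            for (int j = 0; j < kBlockSize; ++j) acc += x[ib * kBlockSize + j] * b.q[j];
            y[row] += b.d * acc;
        }
    }
    return y;
}
```

In a real kernel the temporary would feed SIMD instructions, and the cost of building it would be amortized over many activation columns during prompt processing. Whether that trade-off pays off for other quant types is exactly the open question above.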