mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-21 21:54:10 +00:00
About 4% slower than Q6_K for PP-512, but 10% faster for TG-128. Has someone screwed up Q6_K TG performance on Metal? With the continuous "improvements" in ggml I wouldn't be surprised. Need to look into it later.