mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-25 15:44:10 +00:00
Pretty good performance - on M2-Max we get PP-512(LLaMA-3.1-8B) = 89.5 t/s, TG-128(LLaMA-3.1-8B) = 27.65 t/s