mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-04-25 00:49:34 +00:00
This version is finally faster up to 32k tokens. At 32k tokens it beats no-FA by 23%, at 16k by 20%, at 8k by 10%.