Files
ik_llama.cpp/ggml
Iwan Kawrakow 0e8cfb3d78 FA: slightly faster V*softmax(K*Q) on Zen4
We now get 130.9 t/s for a context of 32k tokens.
2025-01-18 08:35:01 +02:00