ik_llama.cpp/ggml
Iwan Kawrakow 2c2f728afc Adding BF16 support for AVX2
Prompt processing (PP) performance is the same as fp16 (~153 t/s on Ryzen-5975WX),
but token generation (TG) is quite a bit lower (3.65 t/s vs 4.72 t/s at 8 threads).
Why?
2025-01-22 08:37:48 +02:00
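For readers unfamiliar with the format the commit refers to: BF16 keeps the upper 16 bits of an IEEE-754 binary32 value, so widening to fp32 is a 16-bit left shift. The following is a minimal scalar sketch of that relationship, not the code from this commit. The helper names (`bf16_to_fp32`, `fp32_to_bf16_truncate`, `dot_bf16_fp32`) are hypothetical, and the narrowing helper truncates rather than rounds, which is a simplification.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not taken from the repository. */

/* Widen BF16 to fp32: the 16 stored bits become the high half of the float. */
static inline float bf16_to_fp32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);   /* bit copy that avoids aliasing issues */
    return f;
}

/* Narrow fp32 to BF16 by dropping the low 16 mantissa bits (no rounding). */
static inline uint16_t fp32_to_bf16_truncate(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (uint16_t)(bits >> 16);
}

/* Dot product of BF16 weights with fp32 activations, accumulated in fp32. */
static float dot_bf16_fp32(const uint16_t *w, const float *x, int n) {
    float sum = 0.0f;
    for (int i = 0; i < n; ++i) {
        sum += bf16_to_fp32(w[i]) * x[i];
    }
    return sum;
}
```

A vectorized AVX2 version would typically do the same widening on eight values at a time (zero-extend the 16-bit lanes to 32 bits, then shift left by 16); whether the commit does exactly this is not stated in the message.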