ik_llama.cpp/ggml
Iwan Kawrakow a7fb0fc3cc Experimenting with dequant + f32 GEMM
For iq4_kt this roughly triples prompt-processing (PP) speed,
from PP512 = ~42 t/s to PP512 = 128 t/s.
2025-05-31 11:08:27 +03:00
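
The commit message names the approach (dequantize, then run an f32 GEMM) but the listing does not show the code. The sketch below is only a minimal illustration of that general idea, not the commit's implementation. It assumes a hypothetical 4-bit block format (`BlockQ4`, 32 weights per block with one f32 scale), which is not the real iq4_kt layout, and all function names (`quantize_block`, `dequantize_row`, `dequant_gemm`) are invented for the example. The point it shows is that the weights are expanded to f32 once, so the dequantization cost is shared across all tokens in the batch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical block format for illustration only (NOT the iq4_kt layout):
// 32 weights per block, one f32 scale, two 4-bit values packed per byte.
constexpr int kBlockSize = 32;

struct BlockQ4 {
    float   scale;                    // per-block scale
    uint8_t nibbles[kBlockSize / 2];  // packed 4-bit values, low nibble first
};

// Quantize 32 floats into one block (helper so the example is self-contained).
BlockQ4 quantize_block(const float* x) {
    float amax = 0.0f;
    for (int i = 0; i < kBlockSize; ++i) amax = std::max(amax, std::fabs(x[i]));
    BlockQ4 b{};
    b.scale = amax / 7.0f;
    const float inv = b.scale > 0.0f ? 1.0f / b.scale : 0.0f;
    for (int i = 0; i < kBlockSize; i += 2) {
        // Rounded values lie in [-7, 7]; the +8 offset maps them to [1, 15].
        const int q0 = static_cast<int>(std::lround(x[i] * inv)) + 8;
        const int q1 = static_cast<int>(std::lround(x[i + 1] * inv)) + 8;
        b.nibbles[i / 2] = static_cast<uint8_t>(q0 | (q1 << 4));
    }
    return b;
}

// Step 1: expand one quantized row back to f32.
void dequantize_row(const BlockQ4* blocks, int n_blocks, float* out) {
    for (int b = 0; b < n_blocks; ++b) {
        for (int i = 0; i < kBlockSize / 2; ++i) {
            const uint8_t byte = blocks[b].nibbles[i];
            out[b * kBlockSize + 2 * i]     = blocks[b].scale * ((byte & 0x0F) - 8);
            out[b * kBlockSize + 2 * i + 1] = blocks[b].scale * ((byte >> 4) - 8);
        }
    }
}

// Step 2: plain f32 GEMM on the dequantized weights.
// W: M rows of K quantized weights; X: N tokens of K activations (row-major).
// Returns C with C[n * M + m] = dot(W[m], X[n]).
std::vector<float> dequant_gemm(const std::vector<BlockQ4>& W, int M, int K,
                                const std::vector<float>& X, int N) {
    assert(K % kBlockSize == 0);
    const int blocks_per_row = K / kBlockSize;

    // Dequantize every weight row once, up front.
    std::vector<float> Wf(static_cast<size_t>(M) * K);
    for (int m = 0; m < M; ++m) {
        dequantize_row(&W[static_cast<size_t>(m) * blocks_per_row],
                       blocks_per_row, &Wf[static_cast<size_t>(m) * K]);
    }

    // Naive triple loop; a real implementation would use a tuned f32 GEMM here.
    std::vector<float> C(static_cast<size_t>(N) * M, 0.0f);
    for (int n = 0; n < N; ++n) {
        for (int m = 0; m < M; ++m) {
            float sum = 0.0f;
            for (int k = 0; k < K; ++k) {
                sum += Wf[static_cast<size_t>(m) * K + k] * X[static_cast<size_t>(n) * K + k];
            }
            C[static_cast<size_t>(n) * M + m] = sum;
        }
    }
    return C;
}
```

With a batch of N tokens, each weight is dequantized once and reused N times, which is one plausible reason such a scheme helps prompt processing (large N) more than single-token generation. A production version would likely dequantize tile by tile rather than materializing the whole f32 matrix.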