ik_llama.cpp/ggml
Iwan Kawrakow dd0b08d1d8 iq2_tn: AVX512
Just reusing the k-quants template gets us to PP-512 = 376 t/s,
TG-128 = 47.6 t/s for TriLM-3.9B.
2024-08-05 14:53:49 +03:00