ik_llama.cpp/ggml
Iwan Kawrakow 32f8a33f5e iq2_bn_r4: 1st shot at NEON
PP-512 is already faster than iq2_bn (284 t/s vs 246 t/s
for Bitnet-1.58b-3B). TG-128 is ~5% slower.
2024-12-05 15:21:39 +01:00