ik_llama.cpp/include
Iwan Kawrakow 0137264c6f Adding iq2_bn_r4
This Zen4-only implementation achieves PP-512 = 826 t/s (!!!)
for Bitnet-1.58b-3B, up from 620 t/s for iq2_bn.
2024-12-05 15:18:33 +02:00