ik_llama.cpp/iqk-quantize.cpp at f200d36a7fee961e9f1a0693d4a7a10f42de225d

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-03-11 22:40:01 +00:00

Kawrakow 39982764d7 Bitnet: 2.25 bpw version

Just scalar and AVX2 for now.
PP-512 is even faster (325 t/s on the Ryzen-7950X, 404 t/s on
Ryzen-5975WX). We lose ~6-7% for TG due to being memory bound and
the model being 10% larger.

2024-06-22 12:02:52 +03:00

14 KiB
