Commit Graph

4 Commits

Author SHA1 Message Date
Iwan Kawrakow
f6863cfa1b bitnet: add 2 bpw quantization
The scalar dot product already achieves 37 t/s for TG!
2024-06-22 12:02:51 +03:00
Iwan Kawrakow
765622ff8f Move Q8_K64 quantization to iqk-quantize.cpp and add copyright notice 2024-06-22 12:02:51 +03:00
Iwan Kawrakow
d1c40ff7e2 bitnet: fix scalar dot product
I had forgotten to adjust for the change to q8_K64.
On the M2 I'm getting 10.8 t/s with the scalar version!
2024-06-22 12:02:51 +03:00
Iwan Kawrakow
f20b28558b bitnet: python + llama 2024-06-22 12:02:51 +03:00