ik_llama.cpp/iqk_mul_mat.cpp at e05cca9ef652eee7b42927485a3821b14e3c565f

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-28 10:21:48 +00:00

Iwan Kawrakow e05cca9ef6 bitnet(scale in a separate tensor): CPU improvements

Arrange Q8 quants in blocks of 128 and adapt iqk_mul_mat
to deal with that. This improves PP speed by a few percent.
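
A minimal sketch of what arranging Q8 quants in blocks of 128 could look like, for illustration only. The names `BlockQ8x128`, `pack_q8_blocks`, and `dot_q8_blocks` are hypothetical and do not come from iqk_mul_mat.cpp, and the single scale applied after the integer sum is an assumption suggested by the "scale in a separate tensor" wording, not the actual kernel.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical block size taken from the commit message.
constexpr int kBlockSize = 128;

// Hypothetical layout: 128 int8 quants per block, with no per-block scale
// (the scale is assumed to live elsewhere).
struct BlockQ8x128 {
    int8_t qs[kBlockSize];
};

// Pack a row of int8 quants into blocks of 128, zero-padding the last block.
std::vector<BlockQ8x128> pack_q8_blocks(const std::vector<int8_t>& row) {
    const size_t nblocks = (row.size() + kBlockSize - 1) / kBlockSize;
    std::vector<BlockQ8x128> blocks(nblocks);  // value-initialized, so all zeros
    for (size_t i = 0; i < row.size(); ++i) {
        blocks[i / kBlockSize].qs[i % kBlockSize] = row[i];
    }
    return blocks;
}

// Integer dot product, accumulated one block at a time.
// The float scale is applied once, after all blocks are summed.
float dot_q8_blocks(const std::vector<BlockQ8x128>& a,
                    const std::vector<BlockQ8x128>& b,
                    float scale) {
    int32_t acc = 0;
    for (size_t ib = 0; ib < a.size() && ib < b.size(); ++ib) {
        int32_t block_sum = 0;
        for (int j = 0; j < kBlockSize; ++j) {
            block_sum += int32_t(a[ib].qs[j]) * int32_t(b[ib].qs[j]);
        }
        acc += block_sum;
    }
    return scale * float(acc);
}
```

A fixed block of 128 int8 values covers a whole number of 256-bit or 512-bit SIMD registers, which is one plausible reason such a layout helps on CPU; the scalar loop above only shows the data arrangement, not the vectorized code.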

2024-06-22 12:02:52 +03:00
