ik_llama.cpp/ggml
Kawrakow a9eeef53f3 Fix q8_0 repacking issues on AVX2 (#708)
Q8_0 needs Q8_0_X4 activations, but Q8_0_R8 needs Q8_2_X4.
So, if we decide to repack a Q8_0 MoE tensor to Q8_0_R8,
iqk_moe_fused_mul_unary fails because the activations were
prepared as Q8_0_X4, but we now need Q8_2_X4.

For now a simple fix: just take the slow path, do not repack.

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-08-19 19:49:58 +03:00
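
The mismatch described in the commit message can be illustrated with a small, self-contained sketch. It is not the code from the repository: the enum, `required_activation_type`, `can_repack`, and `choose_weight_type` are hypothetical names invented for illustration, and the X4 activation format paired with Q8_0 is written as Q8_0_X4 on the assumption that this is the type the message refers to. The sketch only shows the idea of refusing a repack when it would change the activation format that was already prepared.

```cpp
#include <cassert>

// Hypothetical stand-ins for the quantization types named in the commit
// message; these are NOT the real ggml enum values.
enum class QType { Q8_0, Q8_0_R8, Q8_0_X4, Q8_2_X4 };

// Activation format each weight type expects, as stated in the commit
// message (Q8_0 -> Q8_0_X4 is an assumed reading, Q8_0_R8 -> Q8_2_X4).
constexpr QType required_activation_type(QType weight) {
    switch (weight) {
        case QType::Q8_0:    return QType::Q8_0_X4;
        case QType::Q8_0_R8: return QType::Q8_2_X4;
        default:             return weight;  // not covered by this sketch
    }
}

// Repacking is only safe if activations prepared for the original type
// are still in the format the repacked type expects.
constexpr bool can_repack(QType from, QType to) {
    return required_activation_type(from) == required_activation_type(to);
}

// The "simple fix": keep the original type (slow path) whenever the
// repack would invalidate the already prepared activations.
constexpr QType choose_weight_type(QType current, QType repacked) {
    return can_repack(current, repacked) ? repacked : current;
}

// Compile-time checks of the behaviour described above.
static_assert(!can_repack(QType::Q8_0, QType::Q8_0_R8),
              "Q8_0 and Q8_0_R8 expect different activation formats");
static_assert(choose_weight_type(QType::Q8_0, QType::Q8_0_R8) == QType::Q8_0,
              "fall back to the slow path, do not repack");
```

A more complete fix would presumably re-quantize the activations to the format the repacked tensor needs; the commit explicitly chooses the simpler route of not repacking for now.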