ik_llama.cpp/ggml
Iwan Kawrakow cf87ad6923 Fix q8_0 repacking issues on AVX2
Q8_0 needs Q8_0_X4, but Q8_0_R8 needs Q8_2_X4.
So, if we decide to repack a Q8_0 MoE tensor to Q8_0_R8,
iqk_moe_fused_mul_unary fails because the activations were
prepared as Q8_0_X4, but we now need Q8_2_X4.

For now a simple fix: just take the slow path, do not repack.
2025-08-19 18:43:36 +03:00