ik_llama.cpp/ggml/src/iqk
Kawrakow 2572d16399 Fix q8_0 repacking issues on AVX2 (#708)
Q8_0 needs Q8_0_X4, but Q8_0_R8 needs Q8_2_X4.
So, if we decide to repack a Q8_0 MoE tensor to Q8_0_R8,
iqk_moe_fused_mul_unary fails because the activations were
prepared as Q8_0_X4, but we now need Q8_2_X4.

For now a simple fix: just take the slow path, do not repack.

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-08-19 19:49:58 +03:00
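
The mismatch described in the commit message can be illustrated with a small, self-contained sketch. This is not the code from the repository: the enums, `activation_for`, `kernel_can_consume`, and `choose_layout` are hypothetical names invented for illustration, and `PlainX4` is a stand-in for the X4 activation format that plain Q8_0 consumes. The `fused_moe` flag is likewise an assumption about how a "do not repack" rule might be scoped.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two weight layouts named in the commit message.
enum class Weight { Q8_0, Q8_0_R8 };

// Hypothetical stand-ins for the activation formats:
// PlainX4 = the X4 format plain Q8_0 expects, Q8_2_X4 = what Q8_0_R8 expects.
enum class Activation { PlainX4, Q8_2_X4 };

// Activation format each weight layout requires (per the commit message).
constexpr Activation activation_for(Weight w) {
    return w == Weight::Q8_0 ? Activation::PlainX4 : Activation::Q8_2_X4;
}

// True if activations prepared for `planned` are usable once the tensor
// actually has layout `actual`.
constexpr bool kernel_can_consume(Weight planned, Weight actual) {
    return activation_for(planned) == activation_for(actual);
}

// Assumed shape of the "simple fix": leave Q8_0 alone on the fused MoE path
// (slow path, no repack); otherwise repacking to Q8_0_R8 is allowed.
constexpr Weight choose_layout(Weight w, bool fused_moe) {
    return (w == Weight::Q8_0 && !fused_moe) ? Weight::Q8_0_R8 : w;
}

// The failure mode: activations prepared for Q8_0, tensor repacked to Q8_0_R8.
static_assert(!kernel_can_consume(Weight::Q8_0, Weight::Q8_0_R8),
              "repacking after preparing activations breaks the kernel");

// With the fix, the fused MoE path keeps the layout the activations match.
static_assert(kernel_can_consume(Weight::Q8_0,
                                 choose_layout(Weight::Q8_0, /*fused_moe=*/true)),
              "no repack keeps activations and weights consistent");
```

The sketch only models the bookkeeping: the activation format is derived from the weight layout, so any layout change made after the activations are prepared has to be either forbidden or followed by re-quantizing the activations. The commit takes the first option.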