ik_llama.cpp/ggml
Iwan Kawrakow e558992f0c New iq4_kt trellis
The new trellis generates int8_t values via
sum_as_uint8_t[(ka * idx + kb) & 0x3f3f3f3f] - 126.
CUDA dequantize works.
AVX2 case Ny > 32 works, and we get 273 t/s for L3-8B.
PPL is on par with, or even slightly lower than, the original QTIP trellis.
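
The generator formula in the commit message can be sketched in plain C as follows. This is only an illustration, not the code from the commit: the function name trellis_value is hypothetical, ka and kb are passed as parameters because the message does not state their values, and the mask is assumed to be 0x3f3f3f3f (6 bits kept in each of the 4 bytes), which is the reading under which the result always fits in an int8_t.

```c
#include <stdint.h>

/*
 * Sketch of the value generator described in the commit message.
 * Assumptions (not taken from the source): the function name, passing
 * ka/kb as arguments, and the 32-bit mask 0x3f3f3f3f.
 */
static int8_t trellis_value(uint32_t ka, uint32_t kb, uint32_t idx) {
    /* Affine step in 32-bit arithmetic (wraps modulo 2^32),
       then keep the low 6 bits of every byte. */
    uint32_t x = (ka * idx + kb) & 0x3f3f3f3fu;

    /* "sum_as_uint8_t": treat x as four uint8_t values and add them up.
       Each byte is at most 63, so the sum lies in [0, 252]. */
    uint32_t sum = (x & 0xffu) + ((x >> 8) & 0xffu)
                 + ((x >> 16) & 0xffu) + (x >> 24);

    /* Re-center around zero: the result lies in [-126, 126]. */
    return (int8_t)((int)sum - 126);
}
```

Since each masked byte is at most 63, the sum of four bytes is at most 252, and subtracting 126 gives a symmetric range of -126 to 126. For example, trellis_value(1, 0, 0x01020304) adds 1 + 2 + 3 + 4 and returns -116.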
2025-06-18 15:34:25 +03:00