mirror of https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-03-13 15:30:03 +00:00
* This seems slightly faster for IQ2_KT, IQ3_KT TG
* This looks better for iq4_kt TG
* WIP
* Cleanup
* With fancy simd also set func16
* Enable next_128() also on AVX2

  Despite having just 16 vector registers it is still faster.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>