mirror of https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-21 05:34:08 +00:00
* AVX2 Flash Attention: add ability to use Q8_0 for kv-cache

* AVX2 Flash Attention: add ability to use Q4_0 for kv-cache

* AVX2 Flash Attention: add ability to use Q4_1 for kv-cache

* Fix Zen4

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>