ik_llama.cpp/ggml
Kawrakow 7874e4425f AVX2 Flash Attention 2 (#50)
* AVX2 Flash Attention: add ability to use Q8_0 for kv-cache

* AVX2 Flash Attention: add ability to use Q4_0 for kv-cache

* AVX2 Flash Attention: add ability to use Q4_1 for kv-cache

* Fix Zen4

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2024-09-11 19:55:42 +03:00