Mirror of https://github.com/ikawrakow/ik_llama.cpp.git (synced 2026-03-02 10:00:07 +00:00)
🔀 #50 - AVX2 Flash Attention 2
| Author | ikawrakow |
|---|---|
| State | ❌ Closed |
| Created | 2024-09-11 |
| Updated | 2024-09-11 |
Description
This PR adds the ability to use Q4_0, Q4_1 and Q8_0 for the kv-cache.
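
For readers unfamiliar with these cache types, here is a rough sketch of what Q8_0 storage means for one block of a K or V row. It is not the code from this PR: it is a plain Python illustration that assumes the usual ggml Q8_0 layout (blocks of 32 values, one scale per block, signed 8-bit codes), and it keeps the scale as a Python float where ggml stores fp16. The function names are hypothetical.

```python
QK8_0 = 32  # values per block, as in ggml's Q8_0 layout (assumed here)


def quantize_q8_0_block(values):
    """Quantize one block of 32 floats to (scale, list of int8-range codes).

    Illustration only: the scale stays a Python float (ggml stores fp16),
    and Python's round() is used, which rounds exact halves to even.
    """
    if len(values) != QK8_0:
        raise ValueError(f"expected {QK8_0} values, got {len(values)}")
    amax = max(abs(v) for v in values)      # largest magnitude in the block
    d = amax / 127.0                        # per-block scale
    inv_d = 1.0 / d if d > 0 else 0.0       # an all-zero block maps to all-zero codes
    # Clamp to guard against floating-point overshoot at the block maximum.
    codes = [max(-127, min(127, int(round(v * inv_d)))) for v in values]
    return d, codes


def dequantize_q8_0_block(d, codes):
    """Recover approximate floats from a quantized block."""
    return [d * q for q in codes]


if __name__ == "__main__":
    row = [(i - 16) / 4.0 for i in range(QK8_0)]   # toy stand-in for cache values
    d, codes = quantize_q8_0_block(row)
    restored = dequantize_q8_0_block(d, codes)
    worst = max(abs(a - b) for a, b in zip(row, restored))
    print(f"scale={d:.6f} worst abs error={worst:.6f}")
```

The rounding error per value is at most half the block scale, which is why Q8_0 is generally the least lossy of the three types; Q4_0 and Q4_1 follow the same per-block idea with 4-bit codes (Q4_1 additionally stores a per-block minimum).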