Mirror of https://github.com/ikawrakow/ik_llama.cpp.git, synced 2026-02-25 23:54:10 +00:00
* Zen4 flash attention: move useful parts from the kq_fused_softmax branch
* Add flash attention with soft-cap and fix the D = 256 case
* Flash attention refinements
* Update FlashAttn comment

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>