mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-03-08 04:50:13 +00:00
* Zen4 flash attention: moving useful parts from the kq_fused_softmax branch
* Add flash attention with soft-cap and fix the D = 256 case
* Flash attention refinements
* Update FlashAttn comment

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>