Mirror of https://github.com/ikawrakow/ik_llama.cpp.git (synced 2026-02-26 08:04:09 +00:00)
OK, if we take into account that the mask is diagonal (causal) and skip the remaining computations for a row once we encounter -INFINITY, we can speed it up and bring it on par with no-FA. Better than nothing, but still no luck.
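
A minimal sketch of the early-exit idea, not the code from this commit. It assumes a causal mask, so that once a row contains -INFINITY every later entry in that row is masked too. The helper name `masked_softmax_row` and its parameters are hypothetical and chosen only for illustration; the real kernel works on blocks of K/V rather than on a single row of scores.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from the repository.
// Computes softmax(scores + mask) for one query row.
// With early_exit == true the row is cut at the first -INFINITY mask entry,
// which is only valid under the assumption that all later entries are
// masked as well (causal mask). Returns the number of columns processed.
std::size_t masked_softmax_row(const std::vector<float>& scores,
                               const std::vector<float>& mask,
                               bool early_exit,
                               std::vector<float>& probs) {
    assert(scores.size() == mask.size());
    const std::size_t n = scores.size();
    probs.assign(n, 0.0f);

    // Decide how many columns we actually have to look at.
    std::size_t n_valid = n;
    if (early_exit) {
        n_valid = 0;
        while (n_valid < n && mask[n_valid] != -INFINITY) ++n_valid;
    }

    // Row maximum, for a numerically stable softmax.
    float max_val = -INFINITY;
    for (std::size_t j = 0; j < n_valid; ++j) {
        max_val = std::fmax(max_val, scores[j] + mask[j]);
    }
    if (max_val == -INFINITY) return n_valid;  // fully masked row: all zeros

    // Exponentiate and normalize; masked entries contribute exp(-inf) = 0.
    float sum = 0.0f;
    for (std::size_t j = 0; j < n_valid; ++j) {
        probs[j] = std::exp(scores[j] + mask[j] - max_val);
        sum += probs[j];
    }
    for (std::size_t j = 0; j < n_valid; ++j) probs[j] /= sum;
    return n_valid;
}
```

Both modes give the same probabilities for a causal mask; the early-exit mode simply never touches the masked tail, which is where the saving comes from for long rows.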