mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-24 07:04:11 +00:00
* Fuse add+add+fused_rms * Try this * Macro to easily enable/disable fusion * Various: * Check that all tensors involved are on the same device before applying fusion * Fuse sigmoid+scale+sum_rows+div * Fix the fused bailingmoe2 experts selection The issue there was that the bias was not per row, but per expert group, so only the first n_per_group biases were used for al experts. --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>