ik_llama.cpp/ggml/src
Kawrakow ed4e1a6588 Fuse add+add+fused_rms (#853)
* Fuse add+add+fused_rms

* Try this

* Macro to easily enable/disable fusion

* Various:

  * Check that all tensors involved are on the same device before applying fusion
  * Fuse sigmoid+scale+sum_rows+div
  * Fix the fused bailingmoe2 experts selection

The issue there was that the bias was not per row, but per
expert group, so only the first n_per_group biases were used
for all experts.
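
The indexing problem described above can be illustrated with a minimal sketch. This is not the code from the commit: the function names, the flat `bias` layout (one value per expert, `n_groups * n_per_group` in total), and the plain CPU loops are all assumptions made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layout (an assumption, not taken from the repository):
// experts are split into n_groups groups of n_per_group experts each, and
// `bias` holds one value per expert, i.e. n_groups * n_per_group values.

// Buggy variant: the group offset is dropped when indexing the bias, so
// every group reads bias[0 .. n_per_group-1].
std::vector<float> add_bias_buggy(const std::vector<float>& logits,
                                  const std::vector<float>& bias,
                                  std::size_t n_groups,
                                  std::size_t n_per_group) {
    std::vector<float> out(logits.size());
    for (std::size_t g = 0; g < n_groups; ++g) {
        for (std::size_t j = 0; j < n_per_group; ++j) {
            out[g * n_per_group + j] = logits[g * n_per_group + j] + bias[j];
        }
    }
    return out;
}

// Fixed variant: the bias is indexed with the same per-group offset as the
// logits, so each expert gets its own bias value.
std::vector<float> add_bias_fixed(const std::vector<float>& logits,
                                  const std::vector<float>& bias,
                                  std::size_t n_groups,
                                  std::size_t n_per_group) {
    std::vector<float> out(logits.size());
    for (std::size_t g = 0; g < n_groups; ++g) {
        for (std::size_t j = 0; j < n_per_group; ++j) {
            const std::size_t e = g * n_per_group + j;  // global expert index
            out[e] = logits[e] + bias[e];
        }
    }
    return out;
}
```

With two groups of two experts, zero logits, and biases {1, 2, 3, 4}, the buggy variant yields {1, 2, 1, 2}, while the fixed variant yields {1, 2, 3, 4}.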

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-10-22 16:18:11 +03:00