Fused mul + multi_add op (#858)

* Adding fused mul+multi_add + CPU implementation

* fused mul+multi_add: CUDA

* fused mul+multi_add: command line argument to disable it

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Kawrakow
2025-10-24 07:40:35 +03:00
committed by GitHub
parent 483cea527d
commit db3ba4999f
15 changed files with 211 additions and 38 deletions

@@ -33,6 +33,7 @@ struct llama_cparams {
bool fused_moe_up_gate;
bool grouped_expert_routing;
bool fused_up_gate;
bool fused_mmad;
int min_experts;
float thresh_experts;