Bitnet: use the fused mul-silu in the FFN

I had forgotten that build_bitnet() does not use the standard
llm_build_ffn function, so the fused mul-silu was not applied
to Bitnet when I added it to llm_build_ffn.

This gives us another ~1% speedup for TG-128.
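
For readers unfamiliar with the fusion, the following is a minimal standalone sketch of the idea, not ggml's actual kernel. The function names silu_then_mul and fused_mul_silu are hypothetical and exist only for this illustration. The unfused version mirrors a ggml_silu followed by a ggml_mul (two passes and an intermediate buffer), while the fused version computes silu(gate) * up in a single pass, which is presumably where the saving comes from.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// SiLU activation: x * sigmoid(x) = x / (1 + exp(-x)).
static float silu(float x) {
    return x / (1.0f + std::exp(-x));
}

// Unfused (hypothetical helper): one pass for SiLU into a temporary
// buffer, then a second pass for the element-wise multiply.
// Assumes gate and up have the same length.
std::vector<float> silu_then_mul(const std::vector<float>& gate,
                                 const std::vector<float>& up) {
    std::vector<float> tmp(gate.size());
    for (std::size_t i = 0; i < gate.size(); ++i) {
        tmp[i] = silu(gate[i]);
    }
    std::vector<float> out(gate.size());
    for (std::size_t i = 0; i < gate.size(); ++i) {
        out[i] = tmp[i] * up[i];
    }
    return out;
}

// Fused (hypothetical helper): a single pass with no intermediate
// buffer, so each element is read and written only once.
// Assumes gate and up have the same length.
std::vector<float> fused_mul_silu(const std::vector<float>& gate,
                                  const std::vector<float>& up) {
    std::vector<float> out(gate.size());
    for (std::size_t i = 0; i < gate.size(); ++i) {
        out[i] = silu(gate[i]) * up[i];
    }
    return out;
}
```

Both helpers produce the same values; only the number of passes over memory differs.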
Iwan Kawrakow
2024-10-26 18:33:42 +03:00
parent bd309cb782
commit ee9b052414


@@ -13399,12 +13399,7 @@ struct llm_build_context {
 cb(cur, "ffn_gate", il);
-// combine this with the above scale into ggml_scaled_silu
-cur = ggml_silu(ctx0, cur);
-cb(cur, "ffn_silu", il);
-cur = ggml_mul(ctx0, cur, tmp);
+cur = ggml_fused_mul_unary(ctx0, cur, tmp, GGML_UNARY_OP_SILU);
 cb(cur, "ffn_gate_par", il);
 cur = llm_build_norm(ctx0, cur, hparams,