mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-02 04:29:53 +00:00
Bitnet: use the fused mul-silu in the FFN network (#110)
I had forgotten that build_bitnet() does not use the standard llm_build_ffn function, so the fused mul-silu was not used for Bitnet when I added it to llm_build_ffn. Using it in build_bitnet() gives us another ~1% speedup for TG-128.

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
@@ -13399,12 +13399,7 @@ struct llm_build_context {
 cb(cur, "ffn_gate", il);

-// combine this with the above scale into ggml_scaled_silu
-cur = ggml_silu(ctx0, cur);
-cb(cur, "ffn_silu", il);
-
-cur = ggml_mul(ctx0, cur, tmp);
+cur = ggml_fused_mul_unary(ctx0, cur, tmp, GGML_UNARY_OP_SILU);
 cb(cur, "ffn_gate_par", il);

 cur = llm_build_norm(ctx0, cur, hparams,
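
For readers unfamiliar with the fused op, the sketch below illustrates the element-wise computation the diff suggests: the removed lines apply SiLU to cur and then multiply by tmp, and the added line does both in one call. This is a standalone illustration, not ggml code. The function names silu_then_mul and fused_mul_silu are hypothetical, and the plain std::vector buffers stand in for ggml tensors.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// SiLU activation: x * sigmoid(x), written as x / (1 + exp(-x)).
inline float silu(float x) { return x / (1.0f + std::exp(-x)); }

// Unfused path (hypothetical helper): one pass writes silu(cur) to a
// temporary buffer, a second pass multiplies that buffer by tmp.
std::vector<float> silu_then_mul(const std::vector<float>& cur,
                                 const std::vector<float>& tmp) {
    assert(cur.size() == tmp.size());
    std::vector<float> act(cur.size());
    for (std::size_t i = 0; i < cur.size(); ++i) act[i] = silu(cur[i]);
    std::vector<float> out(cur.size());
    for (std::size_t i = 0; i < cur.size(); ++i) out[i] = act[i] * tmp[i];
    return out;
}

// Fused path (hypothetical helper): a single pass computes
// silu(cur[i]) * tmp[i] with no intermediate buffer.
std::vector<float> fused_mul_silu(const std::vector<float>& cur,
                                  const std::vector<float>& tmp) {
    assert(cur.size() == tmp.size());
    std::vector<float> out(cur.size());
    for (std::size_t i = 0; i < cur.size(); ++i) out[i] = silu(cur[i]) * tmp[i];
    return out;
}
```

Both helpers return the same values. The fused version saves the intermediate buffer and one pass over the data, which is presumably where a small speedup of the kind reported above would come from.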