ik_llama.cpp/ggml.c at a4bbd36905b9ac2c2a5f1ded6d1e29fbd5cf4020

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-28 02:11:50 +00:00


Iwan Kawrakow 8c936e3d65 bitnet: replace ggml_mul with ggml_scale to apply the scales

Also save one scale operation in the ffn network by adjusting
rms_eps. We gain up to 3% in performance by doing this, but it
is a bit of a hack (we store the tensor scales in op_params
while loading the model).

2024-06-22 12:02:52 +03:00

729 KiB
