ik_llama.cpp/ggml
Iwan Kawrakow d50f4f9439 Simdified gelu
Gives ~1% speedup for Gemma2-9b prompt processing on AVX512/AVX2.
It looks like the gelu operation is memory bound on my CPUs
after SIMD-ifying it. By not using the 128 kB gelu lookup table
we gain a small advantage.
On the M2-Max the lookup table is slightly faster than the SIMD
version, so left the lookup table for ARM_NEON.
2024-08-19 17:40:01 +03:00