ik_llama.cpp/ggml.c at 4742bda9a2200433b419a43e8a2f2c1cc2e310be

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-27 18:01:45 +00:00


snadampal 1ca08650a3 ggml : update softmax n_task calculation (#5126)

Updated the n_task calculation to use the maximum number of
threads possible. This improved prompt eval performance
by around 5% for DOT kernels and by around 10% for MMLA
kernels on AWS Graviton3.
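
To illustrate the kind of change the commit message describes, here is a minimal sketch in C. It is not the code from ggml.c: the function names, the `cap` parameter, and the assumption that softmax work is split by rows are all hypothetical and chosen only for illustration. The first helper limits the task count to a fixed cap; the second lets it grow up to the full thread count, bounded only by the number of rows.

```c
#include <stdint.h>

/* Hypothetical "before" variant: the task count is limited by a fixed
   cap, by the thread count, and by the number of rows to process. */
int softmax_n_tasks_capped(int n_threads, int64_t n_rows, int cap) {
    int n = n_threads < cap ? n_threads : cap;
    return (int)(n_rows < n ? n_rows : n);
}

/* Hypothetical "after" variant: use as many threads as are available,
   but never more tasks than there are rows. */
int softmax_n_tasks(int n_threads, int64_t n_rows) {
    return (int)(n_rows < n_threads ? n_rows : n_threads);
}
```

Under these assumptions, a 64-thread machine processing 512 rows would get 64 tasks from the second helper, where a cap of 4 would have left most cores idle during the softmax step.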

2024-01-26 19:17:59 +02:00

654 KiB
