ik_llama.cpp/ggml
Iwan Kawrakow 686e75650e Skip barriers of noops
GGML_OP_RESHAPE, GGML_OP_VIEW, GGML_OP_PERMUTE, GGML_OP_TRANSPOSE,
along with GGML_OP_NONE, are all noops. I.e., nothing happens.
But ggml still has a barrier after them, which wastes time.
The waste is not too bad for large models where computations are
long compared to the time taken for thread synchronization.
But for small models skipping those unnecessary waits makes
a significant difference. E.g., for the 99M TriLM model,
TG-500 goes up to 1426 t/s from 1240 t/s.
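
A minimal sketch of the idea, not the actual ggml code: the enum, the
helper names (sketch_is_noop, sketch_count_barriers) and the loop shape
are all hypothetical, and only the op names come from the message above.
It counts how many synchronization points a worker thread would reach
while walking a graph, with and without skipping the noop nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the ggml op enum. Only the five noop names
 * mirror the ops listed in the commit message; the rest are examples. */
enum sketch_op {
    SKETCH_OP_NONE,
    SKETCH_OP_RESHAPE,
    SKETCH_OP_VIEW,
    SKETCH_OP_PERMUTE,
    SKETCH_OP_TRANSPOSE,
    SKETCH_OP_MUL_MAT, /* example of an op that does real work */
    SKETCH_OP_ADD      /* another op that does real work */
};

/* Returns true for ops that only change tensor metadata and compute nothing. */
static bool sketch_is_noop(enum sketch_op op) {
    switch (op) {
        case SKETCH_OP_NONE:
        case SKETCH_OP_RESHAPE:
        case SKETCH_OP_VIEW:
        case SKETCH_OP_PERMUTE:
        case SKETCH_OP_TRANSPOSE:
            return true;
        default:
            return false;
    }
}

/* Walks the node list the way a worker thread might and counts the
 * barriers it would wait on. With skip_noops set, noop nodes are passed
 * over without any synchronization (assumed behaviour of the change). */
static size_t sketch_count_barriers(const enum sketch_op *ops, size_t n,
                                    bool skip_noops) {
    assert(ops != NULL || n == 0);
    size_t barriers = 0;
    for (size_t i = 0; i < n; ++i) {
        if (skip_noops && sketch_is_noop(ops[i])) {
            continue; /* nothing was computed, so nothing to wait for */
        }
        /* the node's compute step would run here */
        ++barriers;   /* one thread barrier after the node */
    }
    return barriers;
}
```

For a graph such as MUL_MAT, RESHAPE, PERMUTE, ADD, VIEW the sketch
reports five barriers without the skip and two with it. The fewer
real-work nodes there are per barrier, the larger the share of time
spent waiting, which is consistent with small models benefiting most.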
2024-08-14 09:49:12 +03:00