ik_llama.cpp/ggml
Iwan Kawrakow 001abccf73 Fusing MoE up * unary(gate): CUDA
We get ~13% speedup for PP-512 and ~2% for TG-128 for DeepSeek-Lite.
2025-02-23 11:37:31 +02:00
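
The commit title refers to fusing the element-wise product `up * unary(gate)` used in MoE feed-forward blocks. The commit's CUDA kernels are not shown in this listing, so the following is only a CPU-side sketch of the general idea: computing the product in one pass rather than first materializing `unary(gate)` in a temporary buffer. The function names `mul_unary_unfused` and `mul_unary_fused` are hypothetical, and SiLU is assumed as the unary op purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// SiLU, x * sigmoid(x). Assumed here as one example of the unary op
// applied to the gate; the real op depends on the model.
static float silu(float x) { return x / (1.0f + std::exp(-x)); }

// Unfused reference (hypothetical name): first write act = unary(gate)
// to a temporary buffer, then multiply it by up in a second pass.
std::vector<float> mul_unary_unfused(const std::vector<float>& up,
                                     const std::vector<float>& gate) {
    assert(up.size() == gate.size());
    std::vector<float> act(gate.size());
    for (std::size_t i = 0; i < gate.size(); ++i) act[i] = silu(gate[i]);
    std::vector<float> out(up.size());
    for (std::size_t i = 0; i < up.size(); ++i) out[i] = up[i] * act[i];
    return out;
}

// Fused sketch (hypothetical name): a single pass over the data with no
// intermediate buffer, so each element of gate and up is read once.
std::vector<float> mul_unary_fused(const std::vector<float>& up,
                                   const std::vector<float>& gate) {
    assert(up.size() == gate.size());
    std::vector<float> out(up.size());
    for (std::size_t i = 0; i < up.size(); ++i) out[i] = up[i] * silu(gate[i]);
    return out;
}
```

Both functions return the same values; the fused form simply avoids one write and one read of the intermediate activation. On a GPU the analogous change would also save a kernel launch, though how much of the reported speedup comes from that is not stated in the commit message.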