ik_llama.cpp/ggml
Kawrakow cafeef484c More Qwen3-Next optimizations (#1277)
* Optimizing q3next TG

* Fused add -> softplus -> mul on CUDA

* Remove forgotten debug log

* Increase ggml context size

Required for Qwen3-Next with batch/u-batch size of 4096

* WIP

* Avoid some contiguous ops

* Avoid some repeats

* Avoid some more repeats
2026-02-17 16:03:51 +01:00
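
The "Fused add -> softplus -> mul" item in the commit message above can be illustrated with a small CPU reference. This is only a sketch under stated assumptions, not the CUDA kernel from the commit: it assumes the fused pattern computes out[i] = softplus(a[i] + b[i]) * c[i] element-wise on same-sized inputs with no broadcasting, and the function names (softplus, add_softplus_mul_unfused, add_softplus_mul_fused) are hypothetical. It shows why fusing helps: the unfused version makes three passes and materializes two temporaries, while the fused version reads each input once and writes the result once.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softplus: log(1 + exp(v)).
// For large v, softplus(v) is approximately v, so v is returned directly
// to avoid overflow in exp(). The threshold 20 is an assumption of this sketch.
inline float softplus(float v) {
    return v > 20.0f ? v : std::log1p(std::exp(v));
}

// Unfused reference: three separate passes with two temporary buffers.
// (Hypothetical helper, not part of ggml.)
std::vector<float> add_softplus_mul_unfused(const std::vector<float>& a,
                                            const std::vector<float>& b,
                                            const std::vector<float>& c) {
    assert(a.size() == b.size() && a.size() == c.size());
    const std::size_t n = a.size();
    std::vector<float> sum(n), sp(n), out(n);
    for (std::size_t i = 0; i < n; ++i) sum[i] = a[i] + b[i];       // add
    for (std::size_t i = 0; i < n; ++i) sp[i]  = softplus(sum[i]);  // softplus
    for (std::size_t i = 0; i < n; ++i) out[i] = sp[i] * c[i];      // mul
    return out;
}

// Fused version: one pass, no temporaries.
// (Hypothetical helper; a GPU kernel would assign one element per thread.)
std::vector<float> add_softplus_mul_fused(const std::vector<float>& a,
                                          const std::vector<float>& b,
                                          const std::vector<float>& c) {
    assert(a.size() == b.size() && a.size() == c.size());
    const std::size_t n = a.size();
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = softplus(a[i] + b[i]) * c[i];
    }
    return out;
}
```

Both functions return the same values. The fused form saves two intermediate buffers and two extra trips through memory, which is the usual motivation for fusing short element-wise chains.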