ik_llama.cpp/ggml
Kawrakow 9c1c74acda Step-3.5-Flash support (#1231)
* WIP

* This works but is slow

* Turn off the up / gate clamps for now

* OK we need the clamping

* Fuse the clamp (CUDA)

* Fuse the clamp (CPU)

* WIP

* Be able to use merged q, k, v

* Be able to use merged up/gate experts

* Fuse the clamp (CUDA mmvq)
2026-02-05 08:13:22 +02:00