ik_llama.cpp

Mirror of https://github.com/ikawrakow/ik_llama.cpp.git (synced 2026-04-28 02:11:50 +00:00)

Files

Kawrakow 9c1c74acda Step-3.5-Flash support (#1231)

* WIP

* This works but is slow

* Turn off the up / gate clamps for now

* OK we need the clamping

* Fuse the clamp (CUDA)

* Fuse the clamp (CPU)

* WIP

* Be able to use merged q, k, v

* Be able to use merged up/gate experts

* Fuse the clamp (CUDA mmvq)

2026-02-05 08:13:22 +02:00

cmake

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

include

Remove llamafile remnants (#1179)

2026-01-22 13:20:23 +02:00

src

Step-3.5-Flash support (#1231)

2026-02-05 08:13:22 +02:00

.gitignore

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

CMakeLists.txt

Remove llamafile remnants (#1179)

2026-01-22 13:20:23 +02:00