ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-20 22:49:31 +00:00

Files

Iwan Kawrakow aae817e50b DeepSeek TG optimizations for TG

* Fuse concat and copy into K cache
* Avoid ggml_cont() when n_token = 1

Combined effect: about +2% in TG performance with full GPU offload

2025-11-09 07:54:05 +02:00

.gitignore  2024-07-27 07:55:01 +02:00

CMakeLists.txt  2025-11-05 10:58:12 +02:00