Files
Iwan Kawrakow aae817e50b DeepSeek TG optimizations for TG
* Fuse concat and copy into K cache
* Avoid ggml_cont() when n_token = 1

Combined effect: about +2% in TG performance with full GPU offload
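
A rough illustration of the first bullet, not the actual change: the sketch below assumes each K cache row is built from two parts and compares the unfused path (concatenate into a temporary, then copy into the cache) with a fused path that writes both parts directly to their final positions. All names (`concat_then_copy`, `fused_concat_copy`, `part_a`, `part_b`) are hypothetical, and plain Python lists stand in for tensors.

```python
# Hypothetical sketch: none of these names come from the commit itself.

def concat_then_copy(cache, row, part_a, part_b):
    """Unfused: build a temporary concatenated row, then copy it into the cache."""
    tmp = part_a + part_b                  # temporary buffer (extra allocation and write)
    start = row * len(tmp)
    cache[start:start + len(tmp)] = tmp    # second pass: copy the row into the cache


def fused_concat_copy(cache, row, part_a, part_b):
    """Fused: write both parts straight into their final cache positions."""
    width = len(part_a) + len(part_b)
    start = row * width
    cache[start:start + len(part_a)] = part_a
    cache[start + len(part_a):start + width] = part_b


row_width, n_rows = 6, 3
cache_ref = [0.0] * (row_width * n_rows)
cache_fused = [0.0] * (row_width * n_rows)
a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0]

concat_then_copy(cache_ref, 1, a, b)
fused_concat_copy(cache_fused, 1, a, b)
```

Both paths leave the cache in the same state; the fused one skips the temporary buffer and the second pass over the data, which is the kind of saving a fused concat-and-copy is meant to provide.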
2025-11-09 07:54:05 +02:00