Mirror of https://github.com/ikawrakow/ik_llama.cpp.git (synced 2026-03-08 04:50:13 +00:00)
* Graph parallel for Qwen3.5-MoE
* Add --max-gpu to llama-bench
* Fix graph reuse when not all GPUs participate in self-attention