Mirror of https://github.com/ikawrakow/ik_llama.cpp.git, synced 2026-03-10 14:00:08 +00:00
* Graph parallel for Qwen3.5-MoE
* Add --max-gpu to llama-bench
* Fix graph reuse when not all GPUs participate in self-attention
80 KiB