Update README with model compatibility warnings

Add warnings about incompatible models with merged ffn_up_exps and ffn_gate_exps tensors.
Author: Kawrakow
Date: 2026-03-11 12:06:45 +01:00
Committed by: GitHub
Parent: bb45cc3c74
Commit: fd4638f0e8


@@ -13,7 +13,10 @@ This repository is a fork of [llama.cpp](https://github.com/ggerganov/llama.cpp)
>[!IMPORTANT]
>Do not use quantized models from Unsloth that have `_XL` in their name. These are likely to not work with `ik_llama.cpp`.
>
>The above has caused some stir, so to clarify: the Unsloth `_XL` models that are likely to not work are those that contain `f16` tensors (which is never a good idea in the first place). All others are fine.

>[!IMPORTANT]
>Do not download models where the `ffn_up_exps` and `ffn_gate_exps` tensors have been merged into `ffn_gate_up_exps`. These models will not work with `ik_llama.cpp`. The merge can be done on the fly when loading the model; see [PR #1137](https://github.com/ikawrakow/ik_llama.cpp/pull/1137). Hence, there is absolutely zero reason to tolerate incompatibilities added by the `llama.cpp` maintainers. If that does not work for you, just use `llama.cpp`.
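
The merge in question is conceptually just a concatenation of the two expert tensors so that a single matmul produces both the gate and up halves of the FFN activation. The sketch below illustrates the idea with numpy; the shapes and the concatenation axis are illustrative assumptions, not the actual GGUF tensor layout.

```python
import numpy as np

# Hypothetical shapes for illustration only (not taken from a real GGUF file):
# n_expert experts, each with an n_ff x n_embd projection.
n_expert, n_ff, n_embd = 4, 8, 6

rng = np.random.default_rng(0)
ffn_up_exps = rng.standard_normal((n_expert, n_ff, n_embd)).astype(np.float32)
ffn_gate_exps = rng.standard_normal((n_expert, n_ff, n_embd)).astype(np.float32)

# Merged layout: gate and up stacked along the output (row) dimension,
# so one matmul per expert yields both halves of the gated activation.
ffn_gate_up_exps = np.concatenate([ffn_gate_exps, ffn_up_exps], axis=1)

print(ffn_gate_up_exps.shape)  # (4, 16, 6)

# The original tensors are trivially recoverable by splitting, which is
# why an on-the-fly merge (or split) at load time is cheap.
gate_back, up_back = np.split(ffn_gate_up_exps, 2, axis=1)
```

Because the transformation is lossless in either direction, a loader can merge (or un-merge) at model-load time rather than requiring a separately converted file.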
## Quickstart