From cbf7fc7e2f7de4400dd848ff2c221a6c8ea0384f Mon Sep 17 00:00:00 2001
From: Kawrakow
Date: Sun, 22 Feb 2026 07:42:17 +0100
Subject: [PATCH] Update README with warning about '_XL' models from Unsloth

Added important note regarding quantized models from Unsloth.
---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index c76cc39c..ea98f291 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,9 @@
 
 This repository is a fork of [llama.cpp](https://github.com/ggerganov/llama.cpp) with better CPU and hybrid GPU/CPU performance, new SOTA quantization types, first-class Bitnet support, better DeepSeek performance via MLA, FlashMLA, fused MoE operations and tensor overrides for hybrid GPU/CPU inference, row-interleaved quant packing, etc.
 
+>[!IMPORTANT]
+>Do not use quantized models from Unsloth that have `_XL` in their name. These are likely to not work with `ik_llama.cpp`.
+
 ## Quickstart
 
 ### Prerequisites