From f80505911db320f9d55fbe5426b8aabb1c6138a0 Mon Sep 17 00:00:00 2001
From: mcm007
Date: Sat, 14 Feb 2026 10:01:52 +0200
Subject: [PATCH] Improve README.md (#1260)

---
 docker/README.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/docker/README.md b/docker/README.md
index 6589d128..7de20f9b 100644
--- a/docker/README.md
+++ b/docker/README.md
@@ -123,7 +123,7 @@ docker run -it --name ik_llama_full --rm -v /my_local_files/gguf:/models:ro --r
 - If you build the image on the same machine where will be used, change `-DGGML_NATIVE=OFF` to `-DGGML_NATIVE=ON` in the `.Containerfile`.
 - For a smaller CUDA build, identify your GPU [CUDA GPU Compute Capability](https://developer.nvidia.com/cuda/gpus) (e.g. `8.6` for RTX30*0) then change `CUDA_DOCKER_ARCH` in `ik_llama-cuda.Containerfile` from `default` to your GPU architecture (e.g. `CUDA_DOCKER_ARCH=86`).
 - If you build only for your GPU architecture and want to make use of more KV quantization types, build with `-DGGML_IQK_FA_ALL_QUANTS=ON`.
-- Get the best (measures kindly provided on each model card) quants from [ubergarm](https://huggingface.co/ubergarm/models) if available.
+- Look to [ubergarm](https://huggingface.co/ubergarm/models) for premade quants (and imatrix files) designed around ik_llama.cpp; they work well on most standard systems, and the model cards include helpful metrics.
 - Usefull graphs and numbers on @magikRUKKOLA [Perplexity vs Size Graphs for the recent quants (GLM-4.7, Kimi-K2-Thinking, Deepseek-V3.1-Terminus, Deepseek-R1, Qwen3-Coder, Kimi-K2, Chimera etc.)](https://github.com/ikawrakow/ik_llama.cpp/discussions/715) topic.
 - Build custom quants with [Thireus](https://github.com/Thireus/GGUF-Tool-Suite)'s tools.
 - Download from [ik_llama.cpp's Thireus fork with release builds for macOS/Windows/Ubuntu CPU and Windows CUDA](https://github.com/Thireus/ik_llama.cpp) if you cannot build.
@@ -133,6 +133,4 @@ docker run -it --name ik_llama_full --rm -v /my_local_files/gguf:/models:ro --r
 
 All credits to the awesome community:
 
-[ikawrakow](https://github.com/ikawrakow/ik_llama.cpp)
-
 [llama-swap](https://github.com/mostlygeek/llama-swap)
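
The `CUDA_DOCKER_ARCH` tip retained in the first hunk above can be sketched as a small shell snippet. This is only an illustration: the exact `ARG CUDA_DOCKER_ARCH=default` line format is an assumption, and a stand-in `demo.Containerfile` is used instead of the real `ik_llama-cuda.Containerfile`.

```shell
# Sketch: pin the CUDA build to compute capability 8.6 (e.g. RTX 3090)
# by rewriting the arch default in the Containerfile before building.
# The "ARG CUDA_DOCKER_ARCH=default" line format is an assumption.
printf 'ARG CUDA_DOCKER_ARCH=default\n' > demo.Containerfile   # stand-in file
sed -i 's/^ARG CUDA_DOCKER_ARCH=default$/ARG CUDA_DOCKER_ARCH=86/' demo.Containerfile
grep CUDA_DOCKER_ARCH demo.Containerfile   # → ARG CUDA_DOCKER_ARCH=86
```

Applied to the real `ik_llama-cuda.Containerfile`, the same `sed` edit before `docker build` restricts the build to one GPU architecture, which shrinks the image and build time.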