Mirror of https://github.com/ikawrakow/ik_llama.cpp.git, synced 2026-02-26 08:04:09 +00:00
* Partial Requant feature for llama-quantize
  - Inspired by the recently ported --dry-run feature.
  - Allows partially requantizing a split quantized .gguf by requantizing only the splits that are missing from the destination directory.
  - Works for GGUF files split tensor by tensor or by groups of several tensors (the latter is not well tested beyond 2 tensors per split).
  - Vibe coded.
* Create output directory if it doesn't exist in llama-quantize
* Create output directory if it doesn't exist in gguf-split
* Add exit when directory fails to be created on Windows
* Use std::filesystem
* cleanup
GGUF split Example
CLI to split / merge GGUF files.
Command line options:
- `--split`: split a GGUF into multiple GGUF files (default operation).
- `--split-max-size`: maximum size per split in `M` or `G`, e.g. `500M` or `2G`.
- `--split-max-tensors`: maximum number of tensors in each split (default: 128).
- `--merge`: merge multiple GGUF files into a single GGUF.
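As a rough illustration of how these options combine, the sketch below builds one split command and one merge command and prints them for review instead of running them. The binary name `llama-gguf-split`, the positional `<input> <output>` argument order, and all file names (including the shard name and its count of 3) are assumptions for illustration; check the tool's `--help` output for the exact usage in your build.

```shell
#!/bin/sh
# Assumption: the tool is installed as "llama-gguf-split" and takes
# "<input> <output>" after the options. Adjust to match your build.
GGUF_SPLIT="llama-gguf-split"

# Split: cap each output file at 2G. "model.gguf" and the "out/model"
# prefix are hypothetical names.
split_cmd="$GGUF_SPLIT --split --split-max-size 2G model.gguf out/model"

# Merge: point at the first shard and name the merged output.
# The shard file name below is a hypothetical example.
merge_cmd="$GGUF_SPLIT --merge out/model-00001-of-00003.gguf merged.gguf"

# Print the commands so they can be checked before being run by hand.
echo "$split_cmd"
echo "$merge_cmd"
```

To cap splits by tensor count instead of size, `--split-max-tensors` with a number would take the place of `--split-max-size 2G` in the first command.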