mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-02-28 17:14:17 +00:00
Llama-quantize: Partial requant feature (#1313)
* Partial Requant feature for llama-quantize
  - Inspired by the recently ported --dry-run feature.
  - Allows partially requantizing a split quantized .gguf by requantizing only the splits that are missing from the destination directory.
  - Works both for GGUFs that are split tensor by tensor and for those split by groups of several tensors (though the latter is not well tested beyond 2 tensors per split).
  - Vibe coded.
* Create output directory if it doesn't exist in llama-quantize
* Create output directory if it doesn't exist in gguf-split
* Add exit when directory fails to be created on Windows
* Use std::filesystem
* cleanup
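The commit message describes two behaviours: creating the output directory when it does not exist, and requantizing only the splits missing from it. The following is a minimal sketch of that idea using std::filesystem, not the code from this commit. The helpers `split_path` and `missing_splits` are hypothetical names, and the `<prefix>-00001-of-00005.gguf` naming pattern is an assumption about how the splits are named.

```cpp
#include <cassert>
#include <cstdio>
#include <filesystem>
#include <stdexcept>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: path of split `idx` (1-based) out of `n_split`.
// Assumes the "<prefix>-00001-of-00005.gguf" naming pattern.
static std::string split_path(const fs::path & dir, const std::string & prefix, int idx, int n_split) {
    char name[512];
    std::snprintf(name, sizeof(name), "%s-%05d-of-%05d.gguf", prefix.c_str(), idx, n_split);
    return (dir / name).string();
}

// Hypothetical helper: 1-based indices of the splits that are not yet
// present in `out_dir`. Creates `out_dir` first if it does not exist.
static std::vector<int> missing_splits(const fs::path & out_dir, const std::string & prefix, int n_split) {
    assert(n_split > 0);

    std::error_code ec;
    fs::create_directories(out_dir, ec);  // no error if the directory already exists
    if (ec) {
        throw std::runtime_error("failed to create output directory: " + ec.message());
    }

    std::vector<int> missing;
    for (int idx = 1; idx <= n_split; ++idx) {
        if (!fs::exists(split_path(out_dir, prefix, idx, n_split))) {
            missing.push_back(idx);  // this split still needs to be quantized
        }
    }
    return missing;
}
```

A caller would then quantize only the tensors belonging to the returned split indices and leave the existing files untouched. This sketch checks only for file existence. Whether the actual feature also validates the contents of existing splits is not stated in the commit message.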
@@ -4414,6 +4414,7 @@ struct llama_model_quantize_params llama_model_quantize_default_params() {
         /*.ignore_imatrix_rules        =*/ false,
         /*.only_repack                 =*/ false,
         /*.dry_run                     =*/ false,
+        /*.partial_requant             =*/ false,
         /*.imatrix                     =*/ nullptr,
         /*.kv_overrides                =*/ nullptr,
         /*.custom_quants               =*/ nullptr,