Check for NaNs while loading the model. (#727)

* Check for NaNs while loading the model.
* Also tell which experts have NaNs.
* Add command line option to validate quants
* Add checks for more quantization types
* Add checks for more quantization types

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2026-03-13 15:30:03 +00:00 · 2025-08-27 19:00:17 +03:00
parent ca5b6ab9b1
commit e760b4dc41
6 changed files with 199 additions and 2 deletions
--- a/include/llama.h
+++ b/include/llama.h
@@ -377,7 +377,8 @@ extern "C" {
        bool use_mlock;     // force system to keep model in RAM
        bool check_tensors; // validate model tensor data
        bool repack_tensors;// repack if available
-        bool use_thp;       // uase transparent huge pages (linux only)
+        bool use_thp;       // use transparent huge pages (linux only)
+        bool validate_quants; // if true, check for NaNs while loading the model
    };

    // NOTE: changing the default values of parameters marked as [EXPERIMENTAL] may cause crashes or incorrect results in certain configurations
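
Note on the hunk above: the new `validate_quants` field is an opt-in switch for checking for NaNs while loading the model. As a rough illustration of what such a check might look like, here is a minimal sketch. It is not the code from this commit: `example_block`, `example_load_params`, `find_first_nan_block` and `load_ok` are hypothetical names, and the block layout (one `float` scale plus 32 `int8_t` quants) is an assumption made only for the example. The idea is that integer quants cannot hold a NaN, so only the floating-point scale of each block needs checking.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical block layout, for illustration only; not a real ggml block type.
struct example_block {
    float  d;       // per-block scale (the only field that can hold a NaN)
    int8_t qs[32];  // quantized values (integers, cannot be NaN)
};

// Hypothetical stand-in for the loader parameters; mirrors only the new flag.
struct example_load_params {
    bool validate_quants = false; // if true, check for NaNs while loading
};

// Returns the index of the first block whose scale is NaN, or -1 if there is none.
inline long find_first_nan_block(const std::vector<example_block> & blocks) {
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        if (std::isnan(blocks[i].d)) {
            return static_cast<long>(i);
        }
    }
    return -1;
}

// Validation is opt-in: with the flag off, the data is accepted unchecked.
inline bool load_ok(const example_load_params & params,
                    const std::vector<example_block> & blocks) {
    if (!params.validate_quants) {
        return true;
    }
    return find_first_nan_block(blocks) < 0;
}
```

Returning the index of the offending block, rather than a plain yes/no, is one way a loader could report where the bad data sits, in the spirit of the "tell which experts have NaNs" item in the commit message.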