Check for NaNs while loading the model. (#727)

* Check for NaNs while loading the model.

* Also tell which experts have NaNs.

* Add command line option to validate quants

* Add checks for more quantization types

* Add checks for more quantization types

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Kawrakow
2025-08-27 19:00:17 +03:00
committed by GitHub
parent ca5b6ab9b1
commit e760b4dc41
6 changed files with 199 additions and 2 deletions


@@ -209,6 +209,7 @@ struct gpt_params {
bool check_tensors = false; // validate tensor data
bool repack_tensors = false; // repack tensors if interleaved variant is available
bool use_thp = false; // use transparent huge pages (linux only)
bool validate_quants = false; // if true, check for NaNs while loading the model
std::string cache_type_k = "f16"; // KV cache data type for the K
std::string cache_type_v = "f16"; // KV cache data type for the V
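
The hunk above only shows the new `validate_quants` flag; the checking code itself lives in the other changed files and is not visible here. As a rough illustration of what "check for NaNs and tell which experts have them" could look like, here is a minimal sketch. It assumes the data has already been converted to plain `float` and is laid out as `n_expert` contiguous slices; the helper name `experts_with_nans` and its parameters are hypothetical and are not taken from the commit.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the commit's implementation.
// Assumes `data` holds n_expert contiguous slices of n_per_expert floats.
// Returns the indices of the slices that contain at least one NaN.
static std::vector<int> experts_with_nans(const float * data, size_t n_per_expert, int n_expert) {
    std::vector<int> bad;
    for (int e = 0; e < n_expert; ++e) {
        const float * slice = data + (size_t)e * n_per_expert;
        for (size_t i = 0; i < n_per_expert; ++i) {
            if (std::isnan(slice[i])) {
                bad.push_back(e);   // record the expert once
                break;              // no need to scan the rest of this slice
            }
        }
    }
    return bad;
}
```

A loader could run a check of this kind only when `validate_quants` is set, so the extra pass over the data is opt-in. For real quantized types the check would presumably look at the block scales rather than raw floats, which this sketch does not attempt.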