Files
ik_llama.cpp/github-data/pull_requests/272 - Convert models to row-interleaved quants using the quantize tool.md
2025-07-23 13:31:53 +02:00

140 KiB

🔀 #272 - Convert models to row-interleaved quants using the quantize tool

Author ikawrakow
State Closed
Created 2025-03-20
Updated 2025-03-21

Description

The main purpose of this PR is to remove the need for run-time-repacking (command line argument -rtr) by having a tool to convert models to row-interleaved quantization types. The main motivation for providing this tool is to allow using mmap when loading a model and still having row-interleaved quants, so that one can combine the claimed performance gains from using 1 GiB huge pages (see #267) with the performance gains due to row-interleaved quants.

Note: this is only useful for CPU-only inference. The converted (repacked) model will not work on a GPU (or rather it will work but will be slow as all matrix multiplications with the repacked tensors will be done on the CPU).

To use it, simply

./bin/llama-quantize --repack some_model repacked_model some_quant

The some_quant argument is not actually used, but I didn't want to make modifications to the llama-quantize command line argument parsing, so the argument must be provided, but it is ignored.

Oh, bf16 and f16 models can be repacked too, one gets a GGML_TYPE_BF16_R16 model as a result. On CPU's with native bf16 support, GGML_TYPE_BF16_R16 is about 15% faster than GGML_TYPE_BF16, and nearly 2X faster than GGML_TYPE_F16 (for prompt processing, TG is memory bound, so not much difference there).

Caveat: Some of the quantization types had a relatively minor, platform-specific, optimization applied when run-time-repacking. But as there is no way to tell if the repacking was done online, or if we are dealing with an offline-repacked model, I had to remove this optimization. This affects Q8_0_R8, Q8_K_R8, Q8_KV_R8 on Zen4 (127 was added to these quants during run-time-repacking to avoid doing this during inference), and Q4_0_R8 on ARM (a mask of 0x88 was applied to the packed bits, which converts the otherwise unsigned Q4_0 values to signed values multiplied with 16).

Closes #228


💬 Conversation

👤 ikawrakow commented the 2025-03-20 at 14:53:05:

Does the last commit fix it? Strange that we can no longer compare std::string to a C-string, and a reference to std::string is no longer automatically instantiated from a C-string. Seriously? This will brake billions of LoC of C++.


👤 ubergarm commented the 2025-03-20 at 14:55:53:

Seems to be compiling now on d27b7226. I'll go back and check if simply adding #include string to ./ggml/src/iqk/iqk_quantize.cpp would also fix it to confirm.


👤 ubergarm commented the 2025-03-20 at 14:58:43:

Yeah, just needs the include e.g.

$ git rev-parse --short HEAD
9fbe5bee
$ git diff
diff --git a/ggml/src/iqk/iqk_quantize.cpp b/ggml/src/iqk/iqk_quantize.cpp
index bc6f34eb..0375b878 100644
--- a/ggml/src/iqk/iqk_quantize.cpp
+++ b/ggml/src/iqk/iqk_quantize.cpp
@@ -21,6 +21,7 @@
 #include <array>
 #include <algorithm>
 #include <cstring>
+#include <string>
 #include <mutex>
 #include <thread>
 #include <atomic>

## builds good

👤 ikawrakow commented the 2025-03-20 at 15:36:25:

I think we can leave the two unnecessary changes. If we remove the explicit string construction, the compiler does it for us anyway.


👤 ubergarm commented the 2025-03-20 at 15:38:00:

Okay, repacking seems to be working. I'll try out the freshly generated repacked weights next.

Detailed Command Output Logs
$ git rev-parse --short HEAD
9fe6fc37

$ ./build/bin/llama-quantize \
    --repack /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00001-of-00009.gguf \
    /mnt/ai/models/unsloth/repack/DeepSeek-R1-Q4_K_R4.gguf \
    Q4_K_R4 # <--- *NOTE*: this is unused, but must be any valid option

main: invalid ftype '/mnt/ai/models/unsloth/repack/DeepSeek-R1-Q4_K_R4.gguf'
main: build = 3604 (9fe6fc37)
main: built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
main: quantizing '/mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00001-of-00009.gguf' to '/mnt/ai/models/unsloth/repack/DeepSeek-R1-Q4_K_R4.gguf' as Q4_K_R4
llama_model_loader: additional 8 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 48 key-value pairs and 1025 tensors from /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00001-of-00009.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = deepseek2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = DeepSeek R1 BF16
llama_model_loader: - kv   3:                       general.quantized_by str              = Unsloth
llama_model_loader: - kv   4:                         general.size_label str              = 256x20B
llama_model_loader: - kv   5:                           general.repo_url str              = https://huggingface.co/unsloth
llama_model_loader: - kv   6:                      deepseek2.block_count u32              = 61
llama_model_loader: - kv   7:                   deepseek2.context_length u32              = 163840
llama_model_loader: - kv   8:                 deepseek2.embedding_length u32              = 7168
llama_model_loader: - kv   9:              deepseek2.feed_forward_length u32              = 18432
llama_model_loader: - kv  10:             deepseek2.attention.head_count u32              = 128
llama_model_loader: - kv  11:          deepseek2.attention.head_count_kv u32              = 128
llama_model_loader: - kv  12:                   deepseek2.rope.freq_base f32              = 10000.000000
llama_model_loader: - kv  13: deepseek2.attention.layer_norm_rms_epsilon f32              = 0.000001
llama_model_loader: - kv  14:                deepseek2.expert_used_count u32              = 8
llama_model_loader: - kv  15:        deepseek2.leading_dense_block_count u32              = 3
llama_model_loader: - kv  16:                       deepseek2.vocab_size u32              = 129280
llama_model_loader: - kv  17:            deepseek2.attention.q_lora_rank u32              = 1536
llama_model_loader: - kv  18:           deepseek2.attention.kv_lora_rank u32              = 512
llama_model_loader: - kv  19:             deepseek2.attention.key_length u32              = 192
llama_model_loader: - kv  20:           deepseek2.attention.value_length u32              = 128
llama_model_loader: - kv  21:       deepseek2.expert_feed_forward_length u32              = 2048
llama_model_loader: - kv  22:                     deepseek2.expert_count u32              = 256
llama_model_loader: - kv  23:              deepseek2.expert_shared_count u32              = 1
llama_model_loader: - kv  24:             deepseek2.expert_weights_scale f32              = 2.500000
llama_model_loader: - kv  25:              deepseek2.expert_weights_norm bool             = true
llama_model_loader: - kv  26:               deepseek2.expert_gating_func u32              = 2
llama_model_loader: - kv  27:             deepseek2.rope.dimension_count u32              = 64
llama_model_loader: - kv  28:                deepseek2.rope.scaling.type str              = yarn
llama_model_loader: - kv  29:              deepseek2.rope.scaling.factor f32              = 40.000000
llama_model_loader: - kv  30: deepseek2.rope.scaling.original_context_length u32              = 4096
llama_model_loader: - kv  31: deepseek2.rope.scaling.yarn_log_multiplier f32              = 0.100000
llama_model_loader: - kv  32:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  33:                         tokenizer.ggml.pre str              = deepseek-v3
.
.
.
[   1/1025]                        output.weight - [ 7168, 129280,     1,     1], type =   q6_K, size =  724.951 MB, type = q6_k_r4
[   2/1025]                   output_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[   3/1025]                    token_embd.weight - [ 7168, 129280,     1,     1], type =   q4_K, size =  497.109 MB, type = q4_K
[   4/1025]           blk.0.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[   5/1025]          blk.0.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[   6/1025]               blk.0.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[   7/1025]               blk.0.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[   8/1025]             blk.0.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[   9/1025]                blk.0.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[  10/1025]           blk.0.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[  11/1025]                blk.0.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[  12/1025]                blk.0.ffn_down.weight - [18432,  7168,     1,     1], type =   q6_K, size =  103.359 MB, type = q6_k_r4
[  13/1025]                blk.0.ffn_gate.weight - [ 7168, 18432,     1,     1], type =   q4_K, size =   70.875 MB, type = q4_k_r4
[  14/1025]                blk.0.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  15/1025]                  blk.0.ffn_up.weight - [ 7168, 18432,     1,     1], type =   q4_K, size =   70.875 MB, type = q4_k_r4
[  16/1025]           blk.1.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[  17/1025]          blk.1.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[  18/1025]               blk.1.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[  19/1025]               blk.1.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  20/1025]             blk.1.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[  21/1025]                blk.1.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[  22/1025]           blk.1.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[  23/1025]                blk.1.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[  24/1025]                blk.1.ffn_down.weight - [18432,  7168,     1,     1], type =   q6_K, size =  103.359 MB, type = q6_k_r4
[  25/1025]                blk.1.ffn_gate.weight - [ 7168, 18432,     1,     1], type =   q4_K, size =   70.875 MB, type = q4_k_r4
[  26/1025]                blk.1.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  27/1025]                  blk.1.ffn_up.weight - [ 7168, 18432,     1,     1], type =   q4_K, size =   70.875 MB, type = q4_k_r4
[  28/1025]           blk.2.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[  29/1025]          blk.2.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[  30/1025]               blk.2.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[  31/1025]               blk.2.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  32/1025]             blk.2.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[  33/1025]                blk.2.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[  34/1025]           blk.2.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[  35/1025]                blk.2.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[  36/1025]                blk.2.ffn_down.weight - [18432,  7168,     1,     1], type =   q6_K, size =  103.359 MB, type = q6_k_r4
[  37/1025]                blk.2.ffn_gate.weight - [ 7168, 18432,     1,     1], type =   q4_K, size =   70.875 MB, type = q4_k_r4
[  38/1025]                blk.2.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  39/1025]                  blk.2.ffn_up.weight - [ 7168, 18432,     1,     1], type =   q4_K, size =   70.875 MB, type = q4_k_r4
[  40/1025]           blk.3.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[  41/1025]          blk.3.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[  42/1025]               blk.3.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[  43/1025]               blk.3.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  44/1025]             blk.3.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[  45/1025]                blk.3.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[  46/1025]           blk.3.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[  47/1025]                blk.3.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[  48/1025]               blk.3.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[  49/1025]           blk.3.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[  50/1025]          blk.3.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[  51/1025]           blk.3.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[  52/1025]            blk.3.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[  53/1025]          blk.3.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[  54/1025]                blk.3.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  55/1025]             blk.3.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[  56/1025]            blk.3.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[  57/1025]           blk.4.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[  58/1025]          blk.4.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[  59/1025]               blk.4.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[  60/1025]               blk.4.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  61/1025]             blk.4.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[  62/1025]                blk.4.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[  63/1025]           blk.4.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[  64/1025]                blk.4.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[  65/1025]               blk.4.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[  66/1025]           blk.4.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[  67/1025]          blk.4.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[  68/1025]           blk.4.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[  69/1025]            blk.4.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[  70/1025]          blk.4.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[  71/1025]                blk.4.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  72/1025]             blk.4.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[  73/1025]            blk.4.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[  74/1025]           blk.5.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[  75/1025]          blk.5.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[  76/1025]               blk.5.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[  77/1025]               blk.5.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  78/1025]             blk.5.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[  79/1025]                blk.5.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[  80/1025]           blk.5.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[  81/1025]                blk.5.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[  82/1025]               blk.5.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[  83/1025]           blk.5.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[  84/1025]          blk.5.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[  85/1025]           blk.5.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[  86/1025]            blk.5.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[  87/1025]          blk.5.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[  88/1025]                blk.5.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  89/1025]             blk.5.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[  90/1025]            blk.5.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[  91/1025]           blk.6.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[  92/1025]          blk.6.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[  93/1025]               blk.6.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[  94/1025]               blk.6.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[  95/1025]             blk.6.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[  96/1025]                blk.6.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[  97/1025]           blk.6.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[  98/1025]                blk.6.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[  99/1025]               blk.6.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 100/1025]           blk.6.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 101/1025]          blk.6.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 102/1025]           blk.6.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 103/1025]            blk.6.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 104/1025]          blk.6.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 105/1025]                blk.6.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 106/1025]             blk.6.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 107/1025]            blk.6.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 108/1025]           blk.7.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 109/1025]          blk.7.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 110/1025]               blk.7.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 111/1025]               blk.7.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 112/1025]             blk.7.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 113/1025]                blk.7.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 114/1025]           blk.7.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 115/1025]                blk.7.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 116/1025]               blk.7.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 117/1025]           blk.7.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 118/1025]          blk.7.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 119/1025]           blk.7.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 120/1025]            blk.7.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 121/1025]          blk.7.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 122/1025]                blk.7.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 123/1025]             blk.7.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 124/1025]            blk.7.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 125/1025]           blk.8.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 126/1025]          blk.8.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 127/1025]               blk.8.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 128/1025]               blk.8.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 129/1025]             blk.8.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 130/1025]                blk.8.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 131/1025]           blk.8.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 132/1025]                blk.8.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 133/1025]               blk.8.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 134/1025]           blk.8.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 135/1025]          blk.8.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 136/1025]           blk.8.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 137/1025]            blk.8.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 138/1025]          blk.8.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 139/1025]                blk.8.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 140/1025]             blk.8.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 141/1025]            blk.8.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 142/1025]           blk.9.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 143/1025]          blk.9.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 144/1025]               blk.9.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 145/1025]               blk.9.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 146/1025]             blk.9.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 147/1025]                blk.9.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 148/1025]           blk.9.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 149/1025]                blk.9.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 150/1025]               blk.9.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 151/1025]           blk.9.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 152/1025]          blk.9.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 153/1025]           blk.9.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 154/1025]            blk.9.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 155/1025]          blk.9.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 156/1025]                blk.9.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 157/1025]             blk.9.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 158/1025]            blk.9.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 159/1025]          blk.10.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 160/1025]         blk.10.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 161/1025]              blk.10.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 162/1025]              blk.10.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 163/1025]            blk.10.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 164/1025]               blk.10.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 165/1025]          blk.10.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 166/1025]               blk.10.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 167/1025]              blk.10.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 168/1025]          blk.10.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 169/1025]         blk.10.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 170/1025]          blk.10.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 171/1025]           blk.10.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 172/1025]         blk.10.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 173/1025]               blk.10.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 174/1025]            blk.10.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 175/1025]           blk.10.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 176/1025]          blk.11.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 177/1025]         blk.11.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 178/1025]              blk.11.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 179/1025]              blk.11.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 180/1025]            blk.11.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 181/1025]               blk.11.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 182/1025]          blk.11.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 183/1025]               blk.11.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 184/1025]              blk.11.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 185/1025]          blk.11.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 186/1025]         blk.11.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 187/1025]          blk.11.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 188/1025]           blk.11.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 189/1025]         blk.11.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 190/1025]               blk.11.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 191/1025]            blk.11.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 192/1025]           blk.11.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 193/1025]          blk.12.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 194/1025]         blk.12.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 195/1025]              blk.12.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 196/1025]              blk.12.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 197/1025]            blk.12.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 198/1025]               blk.12.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 199/1025]          blk.12.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 200/1025]               blk.12.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 201/1025]              blk.12.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 202/1025]          blk.12.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 203/1025]         blk.12.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 204/1025]          blk.12.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 205/1025]           blk.12.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 206/1025]         blk.12.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 207/1025]               blk.12.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 208/1025]            blk.12.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 209/1025]           blk.12.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 210/1025]          blk.13.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 211/1025]         blk.13.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 212/1025]              blk.13.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 213/1025]              blk.13.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 214/1025]            blk.13.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 215/1025]               blk.13.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 216/1025]          blk.13.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 217/1025]               blk.13.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 218/1025]              blk.13.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 219/1025]          blk.13.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 220/1025]         blk.13.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 221/1025]          blk.13.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 222/1025]           blk.13.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 223/1025]         blk.13.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 224/1025]               blk.13.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 225/1025]            blk.13.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 226/1025]           blk.13.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 227/1025]          blk.14.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 228/1025]         blk.14.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 229/1025]              blk.14.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 230/1025]              blk.14.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 231/1025]            blk.14.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 232/1025]               blk.14.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 233/1025]          blk.14.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 234/1025]               blk.14.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 235/1025]              blk.14.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 236/1025]          blk.14.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 237/1025]         blk.14.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 238/1025]          blk.14.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 239/1025]           blk.14.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 240/1025]         blk.14.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 241/1025]               blk.14.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 242/1025]            blk.14.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 243/1025]           blk.14.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 244/1025]          blk.15.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 245/1025]         blk.15.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 246/1025]              blk.15.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 247/1025]              blk.15.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 248/1025]            blk.15.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 249/1025]               blk.15.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 250/1025]          blk.15.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 251/1025]               blk.15.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 252/1025]              blk.15.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 253/1025]          blk.15.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 254/1025]         blk.15.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 255/1025]          blk.15.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 256/1025]           blk.15.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 257/1025]         blk.15.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 258/1025]               blk.15.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 259/1025]            blk.15.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 260/1025]           blk.15.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 261/1025]          blk.16.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 262/1025]         blk.16.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 263/1025]              blk.16.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 264/1025]              blk.16.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 265/1025]            blk.16.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 266/1025]               blk.16.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 267/1025]          blk.16.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 268/1025]               blk.16.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 269/1025]              blk.16.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 270/1025]          blk.16.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 271/1025]         blk.16.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 272/1025]          blk.16.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 273/1025]           blk.16.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 274/1025]         blk.16.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 275/1025]               blk.16.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 276/1025]            blk.16.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 277/1025]           blk.16.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 278/1025]          blk.17.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 279/1025]         blk.17.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 280/1025]              blk.17.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 281/1025]              blk.17.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 282/1025]            blk.17.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 283/1025]               blk.17.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 284/1025]          blk.17.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 285/1025]               blk.17.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 286/1025]              blk.17.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 287/1025]          blk.17.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 288/1025]         blk.17.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 289/1025]          blk.17.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 290/1025]           blk.17.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 291/1025]         blk.17.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 292/1025]               blk.17.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 293/1025]            blk.17.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 294/1025]           blk.17.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 295/1025]          blk.18.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 296/1025]         blk.18.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 297/1025]              blk.18.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 298/1025]              blk.18.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 299/1025]            blk.18.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 300/1025]               blk.18.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 301/1025]          blk.18.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 302/1025]               blk.18.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 303/1025]              blk.18.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 304/1025]          blk.18.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 305/1025]         blk.18.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 306/1025]          blk.18.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 307/1025]           blk.18.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 308/1025]         blk.18.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 309/1025]               blk.18.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 310/1025]            blk.18.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 311/1025]           blk.18.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 312/1025]          blk.19.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 313/1025]         blk.19.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 314/1025]              blk.19.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 315/1025]              blk.19.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 316/1025]            blk.19.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 317/1025]               blk.19.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 318/1025]          blk.19.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 319/1025]               blk.19.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 320/1025]              blk.19.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 321/1025]          blk.19.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 322/1025]         blk.19.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 323/1025]          blk.19.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 324/1025]           blk.19.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 325/1025]         blk.19.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 326/1025]               blk.19.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 327/1025]            blk.19.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 328/1025]           blk.19.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 329/1025]          blk.20.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 330/1025]         blk.20.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 331/1025]              blk.20.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 332/1025]              blk.20.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 333/1025]            blk.20.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 334/1025]               blk.20.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 335/1025]          blk.20.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 336/1025]               blk.20.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 337/1025]              blk.20.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 338/1025]          blk.20.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 339/1025]         blk.20.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 340/1025]          blk.20.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 341/1025]           blk.20.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 342/1025]         blk.20.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 343/1025]               blk.20.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 344/1025]            blk.20.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 345/1025]           blk.20.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 346/1025]          blk.21.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 347/1025]         blk.21.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 348/1025]              blk.21.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 349/1025]              blk.21.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 350/1025]            blk.21.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 351/1025]               blk.21.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 352/1025]          blk.21.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 353/1025]               blk.21.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 354/1025]              blk.21.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 355/1025]          blk.21.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 356/1025]         blk.21.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 357/1025]          blk.21.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 358/1025]           blk.21.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 359/1025]         blk.21.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 360/1025]               blk.21.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 361/1025]            blk.21.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 362/1025]           blk.21.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 363/1025]          blk.22.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 364/1025]         blk.22.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 365/1025]              blk.22.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 366/1025]              blk.22.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 367/1025]            blk.22.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 368/1025]               blk.22.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 369/1025]          blk.22.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 370/1025]               blk.22.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 371/1025]              blk.22.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 372/1025]          blk.22.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 373/1025]         blk.22.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 374/1025]          blk.22.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 375/1025]           blk.22.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 376/1025]         blk.22.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 377/1025]               blk.22.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 378/1025]            blk.22.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 379/1025]           blk.22.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 380/1025]          blk.23.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 381/1025]         blk.23.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 382/1025]              blk.23.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 383/1025]              blk.23.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 384/1025]            blk.23.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 385/1025]               blk.23.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 386/1025]          blk.23.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 387/1025]               blk.23.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 388/1025]              blk.23.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 389/1025]          blk.23.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 390/1025]         blk.23.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 391/1025]          blk.23.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 392/1025]           blk.23.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 393/1025]         blk.23.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 394/1025]               blk.23.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 395/1025]            blk.23.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 396/1025]           blk.23.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 397/1025]          blk.24.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 398/1025]         blk.24.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 399/1025]              blk.24.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 400/1025]              blk.24.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 401/1025]            blk.24.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 402/1025]               blk.24.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 403/1025]          blk.24.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 404/1025]               blk.24.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 405/1025]              blk.24.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 406/1025]          blk.24.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 407/1025]         blk.24.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 408/1025]          blk.24.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 409/1025]           blk.24.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 410/1025]         blk.24.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 411/1025]               blk.24.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 412/1025]            blk.24.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 413/1025]           blk.24.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 414/1025]          blk.25.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 415/1025]         blk.25.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 416/1025]              blk.25.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 417/1025]              blk.25.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 418/1025]            blk.25.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 419/1025]               blk.25.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 420/1025]          blk.25.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 421/1025]               blk.25.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 422/1025]              blk.25.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 423/1025]          blk.25.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 424/1025]         blk.25.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 425/1025]          blk.25.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 426/1025]           blk.25.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 427/1025]         blk.25.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 428/1025]               blk.25.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 429/1025]            blk.25.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 430/1025]           blk.25.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 431/1025]          blk.26.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 432/1025]         blk.26.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 433/1025]              blk.26.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 434/1025]              blk.26.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 435/1025]            blk.26.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 436/1025]               blk.26.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 437/1025]          blk.26.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 438/1025]               blk.26.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 439/1025]              blk.26.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 440/1025]          blk.26.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 441/1025]         blk.26.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 442/1025]          blk.26.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 443/1025]           blk.26.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 444/1025]         blk.26.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 445/1025]               blk.26.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 446/1025]            blk.26.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 447/1025]           blk.26.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 448/1025]          blk.27.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 449/1025]         blk.27.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 450/1025]              blk.27.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 451/1025]              blk.27.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 452/1025]            blk.27.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 453/1025]               blk.27.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 454/1025]          blk.27.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 455/1025]               blk.27.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 456/1025]              blk.27.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 457/1025]          blk.27.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 458/1025]         blk.27.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 459/1025]          blk.27.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 460/1025]           blk.27.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 461/1025]         blk.27.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 462/1025]               blk.27.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 463/1025]            blk.27.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 464/1025]           blk.27.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 465/1025]          blk.28.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 466/1025]         blk.28.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 467/1025]              blk.28.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 468/1025]              blk.28.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 469/1025]            blk.28.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 470/1025]               blk.28.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 471/1025]          blk.28.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 472/1025]               blk.28.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 473/1025]              blk.28.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 474/1025]          blk.28.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 475/1025]         blk.28.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 476/1025]          blk.28.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 477/1025]           blk.28.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 478/1025]         blk.28.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 479/1025]               blk.28.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 480/1025]            blk.28.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 481/1025]           blk.28.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 482/1025]          blk.29.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 483/1025]         blk.29.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 484/1025]              blk.29.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 485/1025]              blk.29.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 486/1025]            blk.29.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 487/1025]               blk.29.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 488/1025]          blk.29.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 489/1025]               blk.29.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 490/1025]              blk.29.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 491/1025]          blk.29.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 492/1025]         blk.29.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 493/1025]          blk.29.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 494/1025]           blk.29.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 495/1025]         blk.29.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 496/1025]               blk.29.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 497/1025]            blk.29.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 498/1025]           blk.29.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 499/1025]          blk.30.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 500/1025]         blk.30.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 501/1025]              blk.30.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 502/1025]              blk.30.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 503/1025]            blk.30.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 504/1025]               blk.30.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 505/1025]          blk.30.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 506/1025]               blk.30.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 507/1025]              blk.30.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 508/1025]          blk.30.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 509/1025]         blk.30.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 510/1025]          blk.30.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 511/1025]           blk.30.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 512/1025]         blk.30.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 513/1025]               blk.30.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 514/1025]            blk.30.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 515/1025]           blk.30.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 516/1025]          blk.31.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 517/1025]         blk.31.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 518/1025]              blk.31.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 519/1025]              blk.31.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 520/1025]            blk.31.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 521/1025]               blk.31.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 522/1025]          blk.31.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 523/1025]               blk.31.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 524/1025]              blk.31.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 525/1025]          blk.31.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 526/1025]         blk.31.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 527/1025]          blk.31.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 528/1025]           blk.31.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 529/1025]         blk.31.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 530/1025]               blk.31.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 531/1025]            blk.31.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 532/1025]           blk.31.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 533/1025]          blk.32.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 534/1025]         blk.32.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 535/1025]              blk.32.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 536/1025]              blk.32.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 537/1025]            blk.32.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 538/1025]               blk.32.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 539/1025]          blk.32.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 540/1025]               blk.32.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 541/1025]              blk.32.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 542/1025]          blk.32.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 543/1025]         blk.32.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 544/1025]          blk.32.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 545/1025]           blk.32.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 546/1025]         blk.32.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 547/1025]               blk.32.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 548/1025]            blk.32.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 549/1025]           blk.32.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 550/1025]          blk.33.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 551/1025]         blk.33.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 552/1025]              blk.33.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 553/1025]              blk.33.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 554/1025]            blk.33.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 555/1025]               blk.33.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 556/1025]          blk.33.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 557/1025]               blk.33.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 558/1025]              blk.33.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 559/1025]          blk.33.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 560/1025]         blk.33.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 561/1025]          blk.33.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 562/1025]           blk.33.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 563/1025]         blk.33.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 564/1025]               blk.33.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 565/1025]            blk.33.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 566/1025]           blk.33.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 567/1025]          blk.34.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 568/1025]         blk.34.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 569/1025]              blk.34.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 570/1025]              blk.34.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 571/1025]            blk.34.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 572/1025]               blk.34.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 573/1025]          blk.34.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 574/1025]               blk.34.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 575/1025]              blk.34.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 576/1025]          blk.34.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 577/1025]         blk.34.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 578/1025]          blk.34.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 579/1025]           blk.34.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 580/1025]         blk.34.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 581/1025]               blk.34.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 582/1025]            blk.34.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 583/1025]           blk.34.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 584/1025]          blk.35.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 585/1025]         blk.35.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 586/1025]              blk.35.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 587/1025]              blk.35.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 588/1025]            blk.35.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 589/1025]               blk.35.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 590/1025]          blk.35.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 591/1025]               blk.35.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 592/1025]              blk.35.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 593/1025]          blk.35.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 594/1025]         blk.35.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 595/1025]          blk.35.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 596/1025]           blk.35.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 597/1025]         blk.35.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 598/1025]               blk.35.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 599/1025]            blk.35.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 600/1025]           blk.35.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 601/1025]          blk.36.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 602/1025]         blk.36.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 603/1025]              blk.36.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 604/1025]              blk.36.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 605/1025]            blk.36.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 606/1025]               blk.36.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 607/1025]          blk.36.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 608/1025]               blk.36.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 609/1025]              blk.36.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 610/1025]          blk.36.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 611/1025]         blk.36.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 612/1025]          blk.36.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 613/1025]           blk.36.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 614/1025]         blk.36.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 615/1025]               blk.36.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 616/1025]            blk.36.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 617/1025]           blk.36.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 618/1025]          blk.37.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 619/1025]         blk.37.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 620/1025]              blk.37.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 621/1025]              blk.37.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 622/1025]            blk.37.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 623/1025]               blk.37.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 624/1025]          blk.37.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 625/1025]               blk.37.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 626/1025]              blk.37.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 627/1025]          blk.37.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 628/1025]         blk.37.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 629/1025]          blk.37.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 630/1025]           blk.37.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 631/1025]         blk.37.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 632/1025]               blk.37.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 633/1025]            blk.37.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 634/1025]           blk.37.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 635/1025]          blk.38.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 636/1025]         blk.38.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 637/1025]              blk.38.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 638/1025]              blk.38.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 639/1025]            blk.38.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 640/1025]               blk.38.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 641/1025]          blk.38.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 642/1025]               blk.38.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 643/1025]              blk.38.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 644/1025]          blk.38.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 645/1025]         blk.38.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 646/1025]          blk.38.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 647/1025]           blk.38.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 648/1025]         blk.38.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 649/1025]               blk.38.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 650/1025]            blk.38.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 651/1025]           blk.38.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 652/1025]          blk.39.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 653/1025]         blk.39.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 654/1025]              blk.39.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 655/1025]              blk.39.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 656/1025]            blk.39.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 657/1025]               blk.39.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 658/1025]          blk.39.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 659/1025]               blk.39.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 660/1025]              blk.39.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 661/1025]          blk.39.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 662/1025]         blk.39.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 663/1025]          blk.39.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 664/1025]           blk.39.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 665/1025]         blk.39.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 666/1025]               blk.39.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 667/1025]            blk.39.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 668/1025]           blk.39.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 669/1025]          blk.40.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 670/1025]         blk.40.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 671/1025]              blk.40.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 672/1025]              blk.40.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 673/1025]            blk.40.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 674/1025]               blk.40.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 675/1025]          blk.40.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 676/1025]               blk.40.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 677/1025]              blk.40.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 678/1025]          blk.40.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 679/1025]         blk.40.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 680/1025]          blk.40.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 681/1025]           blk.40.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 682/1025]         blk.40.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 683/1025]               blk.40.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 684/1025]            blk.40.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 685/1025]           blk.40.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 686/1025]          blk.41.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 687/1025]         blk.41.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 688/1025]              blk.41.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 689/1025]              blk.41.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 690/1025]            blk.41.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 691/1025]               blk.41.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 692/1025]          blk.41.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 693/1025]               blk.41.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 694/1025]              blk.41.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 695/1025]          blk.41.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 696/1025]         blk.41.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 697/1025]          blk.41.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 698/1025]           blk.41.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 699/1025]         blk.41.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 700/1025]               blk.41.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 701/1025]            blk.41.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 702/1025]           blk.41.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 703/1025]          blk.42.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 704/1025]         blk.42.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 705/1025]              blk.42.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 706/1025]              blk.42.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 707/1025]            blk.42.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 708/1025]               blk.42.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 709/1025]          blk.42.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 710/1025]               blk.42.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 711/1025]              blk.42.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 712/1025]          blk.42.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 713/1025]         blk.42.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 714/1025]          blk.42.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 715/1025]           blk.42.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 716/1025]         blk.42.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 717/1025]               blk.42.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 718/1025]            blk.42.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 719/1025]           blk.42.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 720/1025]          blk.43.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 721/1025]         blk.43.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 722/1025]              blk.43.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 723/1025]              blk.43.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 724/1025]            blk.43.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 725/1025]               blk.43.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 726/1025]          blk.43.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 727/1025]               blk.43.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 728/1025]              blk.43.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 729/1025]          blk.43.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 730/1025]         blk.43.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 731/1025]          blk.43.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 732/1025]           blk.43.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 733/1025]         blk.43.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 734/1025]               blk.43.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 735/1025]            blk.43.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 736/1025]           blk.43.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 737/1025]          blk.44.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 738/1025]         blk.44.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 739/1025]              blk.44.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 740/1025]              blk.44.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 741/1025]            blk.44.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 742/1025]               blk.44.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 743/1025]          blk.44.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 744/1025]               blk.44.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 745/1025]              blk.44.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 746/1025]          blk.44.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 747/1025]         blk.44.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 748/1025]          blk.44.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 749/1025]           blk.44.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 750/1025]         blk.44.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 751/1025]               blk.44.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 752/1025]            blk.44.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 753/1025]           blk.44.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 754/1025]          blk.45.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 755/1025]         blk.45.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 756/1025]              blk.45.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 757/1025]              blk.45.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 758/1025]            blk.45.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 759/1025]               blk.45.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 760/1025]          blk.45.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 761/1025]               blk.45.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 762/1025]              blk.45.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 763/1025]          blk.45.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 764/1025]         blk.45.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 765/1025]          blk.45.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 766/1025]           blk.45.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 767/1025]         blk.45.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 768/1025]               blk.45.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 769/1025]            blk.45.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 770/1025]           blk.45.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 771/1025]          blk.46.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 772/1025]         blk.46.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 773/1025]              blk.46.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 774/1025]              blk.46.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 775/1025]            blk.46.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 776/1025]               blk.46.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 777/1025]          blk.46.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 778/1025]               blk.46.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 779/1025]              blk.46.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 780/1025]          blk.46.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 781/1025]         blk.46.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 782/1025]          blk.46.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 783/1025]           blk.46.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 784/1025]         blk.46.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 785/1025]               blk.46.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 786/1025]            blk.46.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 787/1025]           blk.46.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 788/1025]          blk.47.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 789/1025]         blk.47.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 790/1025]              blk.47.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 791/1025]              blk.47.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 792/1025]            blk.47.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 793/1025]               blk.47.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 794/1025]          blk.47.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 795/1025]               blk.47.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 796/1025]              blk.47.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 797/1025]          blk.47.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 798/1025]         blk.47.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 799/1025]          blk.47.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 800/1025]           blk.47.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 801/1025]         blk.47.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 802/1025]               blk.47.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 803/1025]            blk.47.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 804/1025]           blk.47.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 805/1025]          blk.48.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 806/1025]         blk.48.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 807/1025]              blk.48.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 808/1025]              blk.48.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 809/1025]            blk.48.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 810/1025]               blk.48.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 811/1025]          blk.48.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 812/1025]               blk.48.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 813/1025]              blk.48.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 814/1025]          blk.48.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 815/1025]         blk.48.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 816/1025]          blk.48.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 817/1025]           blk.48.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 818/1025]         blk.48.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 819/1025]               blk.48.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 820/1025]            blk.48.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 821/1025]           blk.48.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 822/1025]          blk.49.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 823/1025]         blk.49.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 824/1025]              blk.49.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 825/1025]              blk.49.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 826/1025]            blk.49.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 827/1025]               blk.49.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 828/1025]          blk.49.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 829/1025]               blk.49.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 830/1025]              blk.49.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 831/1025]          blk.49.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 832/1025]         blk.49.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 833/1025]          blk.49.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 834/1025]           blk.49.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 835/1025]         blk.49.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 836/1025]               blk.49.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 837/1025]            blk.49.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 838/1025]           blk.49.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 839/1025]          blk.50.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 840/1025]         blk.50.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 841/1025]              blk.50.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 842/1025]              blk.50.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 843/1025]            blk.50.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 844/1025]               blk.50.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 845/1025]          blk.50.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 846/1025]               blk.50.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 847/1025]              blk.50.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 848/1025]          blk.50.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 849/1025]         blk.50.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 850/1025]          blk.50.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 851/1025]           blk.50.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 852/1025]         blk.50.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 853/1025]               blk.50.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 854/1025]            blk.50.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 855/1025]           blk.50.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 856/1025]          blk.51.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 857/1025]         blk.51.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 858/1025]              blk.51.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 859/1025]              blk.51.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 860/1025]            blk.51.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 861/1025]               blk.51.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 862/1025]          blk.51.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 863/1025]               blk.51.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 864/1025]              blk.51.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 865/1025]          blk.51.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 866/1025]         blk.51.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 867/1025]          blk.51.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 868/1025]           blk.51.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 869/1025]         blk.51.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 870/1025]               blk.51.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 871/1025]            blk.51.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 872/1025]           blk.51.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 873/1025]          blk.52.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 874/1025]         blk.52.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 875/1025]              blk.52.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 876/1025]              blk.52.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 877/1025]            blk.52.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 878/1025]               blk.52.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 879/1025]          blk.52.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 880/1025]               blk.52.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 881/1025]              blk.52.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 882/1025]          blk.52.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 883/1025]         blk.52.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 884/1025]          blk.52.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 885/1025]           blk.52.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 886/1025]         blk.52.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 887/1025]               blk.52.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 888/1025]            blk.52.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 889/1025]           blk.52.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 890/1025]          blk.53.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 891/1025]         blk.53.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 892/1025]              blk.53.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 893/1025]              blk.53.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 894/1025]            blk.53.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 895/1025]               blk.53.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 896/1025]          blk.53.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 897/1025]               blk.53.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 898/1025]              blk.53.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 899/1025]          blk.53.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 900/1025]         blk.53.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 901/1025]          blk.53.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 902/1025]           blk.53.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 903/1025]         blk.53.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 904/1025]               blk.53.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 905/1025]            blk.53.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 906/1025]           blk.53.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 907/1025]          blk.54.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 908/1025]         blk.54.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 909/1025]              blk.54.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 910/1025]              blk.54.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 911/1025]            blk.54.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 912/1025]               blk.54.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 913/1025]          blk.54.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 914/1025]               blk.54.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 915/1025]              blk.54.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 916/1025]          blk.54.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 917/1025]         blk.54.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 918/1025]          blk.54.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 919/1025]           blk.54.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 920/1025]         blk.54.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 921/1025]               blk.54.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 922/1025]            blk.54.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 923/1025]           blk.54.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 924/1025]          blk.55.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 925/1025]         blk.55.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 926/1025]              blk.55.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 927/1025]              blk.55.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 928/1025]            blk.55.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 929/1025]               blk.55.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 930/1025]          blk.55.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 931/1025]               blk.55.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 932/1025]              blk.55.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 933/1025]          blk.55.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 934/1025]         blk.55.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 935/1025]          blk.55.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 936/1025]           blk.55.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 937/1025]         blk.55.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 938/1025]               blk.55.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 939/1025]            blk.55.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 940/1025]           blk.55.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 941/1025]          blk.56.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 942/1025]         blk.56.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 943/1025]              blk.56.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 944/1025]              blk.56.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 945/1025]            blk.56.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 946/1025]               blk.56.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 947/1025]          blk.56.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 948/1025]               blk.56.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 949/1025]              blk.56.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 950/1025]          blk.56.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 951/1025]         blk.56.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 952/1025]          blk.56.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 953/1025]           blk.56.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 954/1025]         blk.56.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 955/1025]               blk.56.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 956/1025]            blk.56.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 957/1025]           blk.56.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 958/1025]          blk.57.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 959/1025]         blk.57.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 960/1025]              blk.57.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 961/1025]              blk.57.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 962/1025]            blk.57.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 963/1025]               blk.57.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 964/1025]          blk.57.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 965/1025]               blk.57.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 966/1025]              blk.57.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 967/1025]          blk.57.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 968/1025]         blk.57.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 969/1025]          blk.57.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 970/1025]           blk.57.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 971/1025]         blk.57.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 972/1025]               blk.57.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 973/1025]            blk.57.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 974/1025]           blk.57.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 975/1025]          blk.58.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 976/1025]         blk.58.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 977/1025]              blk.58.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 978/1025]              blk.58.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 979/1025]            blk.58.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 980/1025]               blk.58.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 981/1025]          blk.58.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 982/1025]               blk.58.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[ 983/1025]              blk.58.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[ 984/1025]          blk.58.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[ 985/1025]         blk.58.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[ 986/1025]          blk.58.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 987/1025]           blk.58.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[ 988/1025]         blk.58.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 989/1025]               blk.58.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 990/1025]            blk.58.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[ 991/1025]           blk.58.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[ 992/1025]          blk.59.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[ 993/1025]         blk.59.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[ 994/1025]              blk.59.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[ 995/1025]              blk.59.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[ 996/1025]            blk.59.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[ 997/1025]               blk.59.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[ 998/1025]          blk.59.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[ 999/1025]               blk.59.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[1000/1025]              blk.59.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[1001/1025]          blk.59.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[1002/1025]         blk.59.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[1003/1025]          blk.59.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[1004/1025]           blk.59.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[1005/1025]         blk.59.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[1006/1025]               blk.59.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[1007/1025]            blk.59.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[1008/1025]           blk.59.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[1009/1025]          blk.60.attn_kv_a_mqa.weight - [ 7168,   576,     1,     1], type =   q4_K, size =    2.215 MB, type = q4_k_r4
[1010/1025]         blk.60.attn_kv_a_norm.weight - [  512,     1,     1,     1], type =    f32, size =    0.002 MB, type = f32
[1011/1025]              blk.60.attn_kv_b.weight - [  512, 32768,     1,     1], type =   q4_K, size =    9.000 MB, type = q4_k_r4
[1012/1025]              blk.60.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[1013/1025]            blk.60.attn_output.weight - [16384,  7168,     1,     1], type =   q4_K, size =   63.000 MB, type = q4_k_r4
[1014/1025]               blk.60.attn_q_a.weight - [ 7168,  1536,     1,     1], type =   q4_K, size =    5.906 MB, type = q4_k_r4
[1015/1025]          blk.60.attn_q_a_norm.weight - [ 1536,     1,     1,     1], type =    f32, size =    0.006 MB, type = f32
[1016/1025]               blk.60.attn_q_b.weight - [ 1536, 24576,     1,     1], type =   q4_K, size =   20.250 MB, type = q4_k_r4
[1017/1025]              blk.60.exp_probs_b.bias - [  256,     1,     1,     1], type =    f32, size =    0.001 MB, type = f32
[1018/1025]          blk.60.ffn_down_exps.weight - [ 2048,  7168,   256,     1], type =   q6_K, size = 2940.000 MB, type = q6_k_r4
[1019/1025]         blk.60.ffn_down_shexp.weight - [ 2048,  7168,     1,     1], type =   q6_K, size =   11.484 MB, type = q6_k_r4
[1020/1025]          blk.60.ffn_gate_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[1021/1025]           blk.60.ffn_gate_inp.weight - [ 7168,   256,     1,     1], type =    f32, size =    7.000 MB, type = f32
[1022/1025]         blk.60.ffn_gate_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
[1023/1025]               blk.60.ffn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB, type = f32
[1024/1025]            blk.60.ffn_up_exps.weight - [ 7168,  2048,   256,     1], type =   q4_K, size = 2016.000 MB, type = q4_k_r4
[1025/1025]           blk.60.ffn_up_shexp.weight - [ 7168,  2048,     1,     1], type =   q4_K, size =    7.875 MB, type = q4_k_r4
llama_model_quantize_internal: model size  = 385689.62 MB
llama_model_quantize_internal: quant size  = 385689.62 MB
===================== Model ftype: Q4_K - Medium: Repacked ftype: Q4_K_R4

main: quantize time = 724052.06 ms
main:    total time = 724052.06 ms

$ du -c /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/*.gguf
47206828        /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00001-of-00009.gguf
48270904        /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00002-of-00009.gguf
48366528        /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00003-of-00009.gguf
47141132        /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00004-of-00009.gguf
48263708        /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00005-of-00009.gguf
47141132        /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00006-of-00009.gguf
48270904        /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00007-of-00009.gguf
45838656        /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00008-of-00009.gguf
14451648        /mnt/ai/models/unsloth/DeepSeek-R1-GGUF/DeepSeek-R1-Q4_K_M/DeepSeek-R1-Q4_K_M-00009-of-00009.gguf
394951440       total

$ ls -la /mnt/ai/models/unsloth/repack/DeepSeek-R1-Q4_K_R4.gguf
-rw-rw-r-- 1 j j 404430186592 Mar 20 15:32 /mnt/ai/models/unsloth/repack/DeepSeek-R1-Q4_K_R4.gguf