### 📝 [#498](https://github.com/ikawrakow/ik_llama.cpp/issues/498) - question: about quantize method

| **Author** | `nigelzzz` |
| :--- | :--- |
| **State** | ❌ **Closed** |
| **Created** | 2025-06-06 |
| **Updated** | 2025-06-14 |

---

#### Description

Hi, this project is amazing and interesting; it looks like it is better than the original llama.cpp. I would like to study the repo, but since there are a lot of quantization methods shared with the original llama.cpp, how should I choose which quantization method to study? My environment is a Raspberry Pi 5, and I often test BitNet and Llama 3.2 1B or 3B. Thanks!

---

#### 💬 Conversation

👤 **ikawrakow** commented the **2025-06-06** at **16:39:00**:
For BitNet take a look at `IQ1_BN` and `IQ2_BN`. The packing in `IQ2_BN` is simpler and easier to understand, but uses 2 bits per weight. `IQ1_BN` uses 1.625 bits per weight, which is very close to the theoretical log2(3) ≈ 1.58 bits for a ternary data type.

Otherwise I'm not sure what to recommend. Any of the quantization types should be OK for Llama 3.2 1B/3B on an RPi 5. If you are new to the subject, it might be better to look into the simpler quantization types (e.g., the `QX_K` family) first.
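To make those bit counts concrete, here is a minimal C++ sketch of two ways to pack ternary weights {-1, 0, +1}. This is an illustration only, not the actual `IQ1_BN`/`IQ2_BN` bit layout used in the repo: fixed 2-bit packing stores 4 weights per byte (2.0 bpw, the `IQ2_BN` idea), while base-3 packing stores 5 weights per byte because 3^5 = 243 ≤ 256 (1.6 bpw, close to the log2(3) ≈ 1.585-bit entropy that `IQ1_BN` approaches at 1.625 bpw).

```cpp
#include <cstdint>
#include <cstdio>

// Pack 4 ternary weights {-1, 0, +1} into one byte at 2 bits each
// (2.0 bits per weight -- the simple fixed-width approach).
static uint8_t pack_2bit(const int8_t w[4]) {
    uint8_t b = 0;
    for (int i = 0; i < 4; ++i)
        b |= uint8_t(w[i] + 1) << (2 * i);   // map {-1,0,1} -> {0,1,2}
    return b;
}

// Pack 5 ternary weights into one byte as a base-3 integer:
// 3^5 = 243 <= 256, i.e. 1.6 bits per weight.
static uint8_t pack_base3(const int8_t w[5]) {
    uint8_t b = 0;
    for (int i = 4; i >= 0; --i)
        b = uint8_t(3 * b + (w[i] + 1));
    return b;
}

static void unpack_base3(uint8_t b, int8_t w[5]) {
    for (int i = 0; i < 5; ++i) {
        w[i] = int8_t(b % 3) - 1;            // recover {-1,0,1}
        b /= 3;
    }
}

int main() {
    const int8_t w4[4] = {-1, 0, 1, 1};
    printf("2-bit packed: 0x%02X\n", pack_2bit(w4));

    const int8_t w5[5] = {-1, 0, 1, 1, -1};
    int8_t out[5];
    unpack_base3(pack_base3(w5), out);
    printf("base-3 round trip:");
    for (int i = 0; i < 5; ++i) printf(" %d", out[i]);  // -1 0 1 1 -1
    printf("\n");
    return 0;
}
```

The trade-off mirrors the comment above: fixed-width 2-bit packing decodes with cheap shifts and masks, while base-3 packing is denser but needs divisions (or a lookup table) to decode, which is part of why the 2-bpw variant is simpler to follow.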
---

👤 **aezendc** commented the **2025-06-09** at **10:48:49**:

> For BitNet take a look at `IQ1_BN` and `IQ2_BN`. The packing in `IQ2_BN` is simpler and easier to understand, but uses 2 bits per weight. `IQ1_BN` uses 1.625 bits per weight, which is very close to the theoretical log2(3) ≈ 1.58 bits for a ternary data type.
>
> Otherwise I'm not sure what to recommend. Any of the quantization types should be OK for Llama 3.2 1B/3B on an RPi 5. If you are new to the subject, it might be better to look into the simpler quantization types (e.g., the `QX_K` family) first.

I like the `IQ1_BN` quantization. It's good and I am using it. Is there a way we can make this support function calling?

---

👤 **ikawrakow** commented the **2025-06-09** at **11:01:33**:
See #407.

---

👤 **ikawrakow** commented the **2025-06-14** at **12:01:58**:
I think we can close it.