mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-03-06 03:50:08 +00:00
13 lines
736 B
Markdown
13 lines
736 B
Markdown
### 🔀 [#81](https://github.com/ikawrakow/ik_llama.cpp/pull/81) - Cleanup scale fudge factors
|
|
|
|
| **Author** | `ikawrakow` |
|
|
| :--- | :--- |
|
|
| **State** | ❌ **Closed** |
|
|
| **Created** | 2024-10-04 |
|
|
| **Updated** | 2024-10-04 |
|
|
|
|
---
|
|
|
|
#### Description
|
|
|
|
Low-bit quants often benefit from a fudge factor applied to the (super-)block scale. When I was developing `IQ2_K` and `IQ3_K` it was faster to change the fudge factor in `ggml-cuda/convert.cu` and recompile than to change it in the quantization function and re-quantize. But when I was ready, I forgot to move the `IQ2_K` and `IQ3_K` fudge factors to quantization, so they remained in the CUDA dequantization function (and hence weren't applied anywhere else). This PR fixes this. |