mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-01-26 17:20:01 +00:00
🔀 #64 - Better sub-3-bit quantization mixes with a qkv tensor
| Author | ikawrakow |
|---|---|
| State | ❌ Closed |
| Created | 2024-09-28 |
| Updated | 2024-09-28 |
Description
Phi3.5-mini uses a combined QKV tensor. As a result, the quantization mix strategies used for sub-3-bit quants fail. This PR fixes that. Here is the resulting quantization error, measured via wikitext perplexity: