🔀 #64 - Better sub-3-bit quantization mixes with a qkv tensor

Author ikawrakow
State Closed
Created 2024-09-28
Updated 2024-09-28

Description

Phi3.5-mini uses a combined QKV tensor. As a result, the quantization mix strategies used for sub-3-bit quants fail. This PR fixes that. The resulting quantization error, measured as Wikitext perplexity, is shown below:

[Figure: iphi3_5_ppl (image not preserved)]
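
To picture the failure mode described above, here is a minimal sketch, not the code from this PR. It assumes that a quantization mix is chosen per tensor by matching on the tensor name, and that a model with a combined tensor exposes `attn_qkv.weight` in place of separate `attn_q`, `attn_k` and `attn_v` tensors. The function names, the `IQ2_XS` base type and the `Q4_K` promotion are illustrative assumptions, not the choices made in the PR.

```python
# Hypothetical illustration: the rules and type choices below are assumptions,
# not the actual logic of ik_llama.cpp or of this PR.

def pick_type_naive(tensor_name: str, base_type: str = "IQ2_XS") -> str:
    """Name-based mix that only knows about a separate V tensor."""
    if tensor_name.endswith("attn_v.weight"):
        return "Q4_K"  # spend more bits on the V projection
    return base_type   # everything else stays at the low-bit base type


def pick_type_qkv_aware(tensor_name: str, base_type: str = "IQ2_XS") -> str:
    """Same mix, but the combined QKV tensor is promoted as well."""
    if tensor_name.endswith(("attn_v.weight", "attn_qkv.weight")):
        return "Q4_K"
    return base_type


if __name__ == "__main__":
    for name in ("blk.0.attn_v.weight", "blk.0.attn_qkv.weight"):
        print(name, pick_type_naive(name), pick_type_qkv_aware(name))
```

With the naive rule, the combined tensor never matches the `attn_v` pattern, so it silently falls back to the base type and the V weights inside it lose the extra bits the mix intended for them.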