ik_llama.cpp/github-data/pull_requests/391 - Fix DeepSeek q8_0 cache.md
2025-07-23 13:31:53 +02:00

### 🐛 [#391](https://github.com/ikawrakow/ik_llama.cpp/pull/391) - Fix DeepSeek q8_0 cache
| **Author** | `ikawrakow` |
| :--- | :--- |
| **State** | ❌ **Closed** |
| **Created** | 2025-05-07 |
| **Updated** | 2025-05-07 |
---
#### Description
Nobody has used `ik_llama.cpp` with a DeepSeek model and `Q8_0` KV cache since PR #351?
This PR fixes the assert one gets when trying to use a DeepSeek model on the CPU with a `Q8_0` KV cache.
Also, the optimization I added in #351 to repack the `K` cache to `Q8_0_R8` appears to lower TG performance for DeepSeek models, so this PR disables it for them.