🐛 #217 - Bug: CPU FA with fp16 K-cache is broken

Author ikawrakow
State Closed
Created 2025-02-21
Updated 2025-02-22

Description

What happened?

Running HellaSwag with flash attention (FA) enabled and an fp16 K-cache produces much lower scores than running without FA, or running with FA and a Q8_0 or bf16 K-cache.

Name and Version

Latest

What operating system are you seeing the problem on?

No response

Relevant log output