mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-01-26 17:20:01 +00:00
544 B
544 B
🔀 #69 - Allow bf16 kv-cache
| Author | ikawrakow |
|---|---|
| State | ❌ Closed |
| Created | 2024-09-29 |
| Updated | 2024-09-29 |
Description
On the CPU I get the exact same PPL with and without FA using bf16 for kv-cache. But on CUDA the bf16 kv-cache result is about the same as the fp16 kv-cache CPU result, so I'm missing some conversion somewhere. Either way, we can now run on all platforms supported here with bf16 kv-cache.