ik_llama.cpp/ggml
Iwan Kawrakow 20d50172d0 Much better FA TG with q8_0 KV cache
Just repack it even for TG. But do the repacking for k_step rows,
not the whole K tensor.
2025-04-28 11:26:28 +03:00
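
The idea in the commit message, repacking k_step rows at a time instead of the whole K tensor, can be sketched roughly as follows. This is only an illustration under stated assumptions and is not the code from this commit: the names repack_rows and scores_chunked are hypothetical, plain floats stand in for q8_0 blocks, and the interleaved layout is one possible choice of repacked format. What it shows is that the scratch buffer only ever holds one chunk of k_step rows.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: none of these names come from ggml.
// Repack `rows` rows of K (row-major, `dim` floats each) into an interleaved
// layout in which the values of one column across all rows are contiguous.
static void repack_rows(const float* k, std::size_t rows, std::size_t dim, float* out) {
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t j = 0; j < dim; ++j) {
            out[j * rows + r] = k[r * dim + j];
        }
    }
}

// Compute the dot product q . K[i] for every row i of K, repacking only
// k_step rows at a time. Assumes k_step > 0 and k.size() == n_rows * dim.
std::vector<float> scores_chunked(const std::vector<float>& k,
                                  const std::vector<float>& q,
                                  std::size_t n_rows, std::size_t dim,
                                  std::size_t k_step) {
    std::vector<float> scores(n_rows, 0.0f);
    // Scratch space for a single repacked chunk, not for the whole K tensor.
    std::vector<float> scratch(k_step * dim);
    for (std::size_t first = 0; first < n_rows; first += k_step) {
        // The last chunk may contain fewer than k_step rows.
        const std::size_t rows = std::min(k_step, n_rows - first);
        repack_rows(k.data() + first * dim, rows, dim, scratch.data());
        // Accumulate one column at a time over the interleaved chunk.
        for (std::size_t j = 0; j < dim; ++j) {
            for (std::size_t r = 0; r < rows; ++r) {
                scores[first + r] += q[j] * scratch[j * rows + r];
            }
        }
    }
    return scores;
}
```

In this sketch the scratch buffer scales with k_step rather than with the number of cached rows, which is the point of chunking: the repacking cost is paid per chunk and the temporary storage stays small even when K is long.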