ik_llama.cpp/llama.cpp at 748ed279106b9b6e5c5bf4b0cf6d23a4a6cfa323

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-21 15:09:40 +00:00


Georgi Gerganov 110487aa7b llama : pad KV cache size (#4280)

* llama : pad KV cache size to 32

* metal : try to improve batched decoding

2023-12-03 10:58:16 +02:00
