Files
ik_llama.cpp/llama.cpp
fairydreaming 27b040691c llama : use n_embd_head_v when reshaping kqv (#7327)
* llama : use n_embd_head_v instead of n_embd_head_k when reshaping kqv

* llama : use n_embd_v_gqa and n_embd_head_v instead of n_embd_k_gqa and n_embd_head_k when making a view of cached value vectors.

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
2024-05-17 14:24:38 +03:00

728 KiB