ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-03-13 07:20:15 +00:00


Kawrakow 2e585d4508 Enable faster prompt processing with mainline llama.cpp GGUFs (#409)

* Enable MLA-3 in crippled GGUFs: WIP

* Enable MLA-3 in crippled GGUFs: seems to work

* Add newly created tensors to model.tensors_by_name

Else they don't get run-time repacked.
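
The note above says that tensors created after loading must be added to `model.tensors_by_name`, or the run-time repacking step will not see them. A minimal sketch of that idea, assuming the repack pass simply walks the registry: `ToyTensor`, `ToyModel`, `repack_registered`, and `register_tensor` are hypothetical stand-ins for illustration, not the project's actual types or functions. Only the name `tensors_by_name` comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a tensor; NOT the real ggml_tensor.
struct ToyTensor {
    std::string name;
    bool repacked = false;
};

// Toy stand-in for the model. Only the registry name `tensors_by_name`
// is taken from the commit message; the rest is hypothetical.
struct ToyModel {
    std::vector<std::pair<std::string, ToyTensor *>> tensors_by_name;
};

// Hypothetical repack pass: it only visits tensors listed in the registry,
// so an unregistered tensor is silently skipped.
// Returns the number of tensors repacked by this call.
std::size_t repack_registered(ToyModel & model) {
    std::size_t count = 0;
    for (auto & entry : model.tensors_by_name) {
        assert(entry.second != nullptr);
        if (!entry.second->repacked) {
            entry.second->repacked = true;
            ++count;
        }
    }
    return count;
}

// Register a tensor created after loading so the pass above can find it.
void register_tensor(ToyModel & model, ToyTensor & t) {
    model.tensors_by_name.emplace_back(t.name, &t);
}
```

In this sketch, a tensor that exists but was never passed to `register_tensor` keeps `repacked == false` no matter how often the pass runs, which mirrors the failure mode the commit message describes.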

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2025-05-12 07:49:51 +03:00

llama.h
