ik_llama.cpp/common
Kawrakow 2e585d4508 Enable faster prompt processing with mainline llama.cpp GGUFs (#409)
* Enable MLA-3 in crippled GGUFs: WIP

* Enable MLA-3 in crippled GGUFs: seems to work

* Add newly created tensors to model.tensors_by_name

Otherwise they don't get run-time repacked.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-05-12 07:49:51 +03:00