ik_llama.cpp/src
Kawrakow 4daff01b39 Refactor file llama.cpp (#823)
* llama_model and llama_hparams

* llama_build_context

Surprisingly small reduction in llama.cpp compile time given
the reduction in LOCs (22k -> 14k)

* LLM_TN

llama.cpp compilation: 50 s -> 33 s

* llama_quantize

* arch names

* All graph building is now in llm-build-context.cpp

* hparams loading

llama.cpp is now just 9300 LOC, but still takes 32 seconds to compile.

* We are now at 6 seconds to build the src folder

* load -> create

We are not actually loading the tensors, but just creating them.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-10-11 11:35:20 +03:00