Mirror of https://github.com/ikawrakow/ik_llama.cpp.git, synced 2026-04-30 19:31:48 +00:00
Refactor file llama.cpp (#823)
* llama_model and llama_hparams

* llama_build_context

  Surprisingly small reduction in llama.cpp compile time given the reduction in LOCs (22k -> 14k)

* LLM_TN

  llama.cpp compilation: 50 s -> 33 s

* llama_quantize

* arch names

* All graph building is now in llm-build-context.cpp

* hparams loading

  llama.cpp is now just 9300 LOC, but still takes 32 seconds to compile.

* We are now at 6 seconds to build the src folder

* load -> create

  We are not actually loading the tensors, but just creating them.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
@@ -298,3 +298,5 @@ enum llm_tensor {
 };
 
+llm_arch llm_arch_from_string(const std::string & name);
 
+const char * llama_model_arch_name(llm_arch arch);