Add support for GLM-4.5 models (#668)

* GLM-4.5

* GLM-4.5

* GLM-4.5

* convert_hf_to_gguf.py compatibility bugfix with GLM-4.5

From @ubergarm - https://github.com/ikawrakow/ik_llama.cpp/pull/668#issuecomment-3145913701

* Add ubergarm comments + my own

* Revert to llama.cpp script version that produced good BF16

See: https://github.com/ikawrakow/ik_llama.cpp/pull/668#issuecomment-3147374559

* Support for jinja chat templates

See https://github.com/ikawrakow/ik_llama.cpp/pull/668#issuecomment-3148109962

* GLM-4.5 llama.cpp final port

* Handle TENSOR_SKIP

Ported the hanges from:

f129567dc0
dcbbd2cb05

Except op info since ik_llama.cpp doesn't support this operation.

* Bugfix for TENSOR_SKIP

skip loading if a tensor has the TENSOR_SKIP flag - @ubergarm via https://github.com/ikawrakow/ik_llama.cpp/pull/668#issuecomment-3155297198

* Update llama.cpp

Restore original GGLM_ASSERT

* Fix chat template detection

Changes suggested by @ubergarm - https://github.com/ikawrakow/ik_llama.cpp/pull/668#issuecomment-3155927840

* Revert to original GGML_ASSERT
This commit is contained in:
Thireus ☠
2025-08-07 05:55:00 +01:00
committed by GitHub
parent ddceb0a55d
commit d65d5fe29e
9 changed files with 1288 additions and 26 deletions

View File

@@ -1546,6 +1546,30 @@ llama_token llama_token_suffix_impl(const struct llama_vocab & vocab) {
return vocab.special_suffix_id;
}
llama_token llama_token_fim_pre_impl(const struct llama_vocab & vocab) {
return vocab.special_fim_pre_id;
}
llama_token llama_token_fim_suf_impl(const struct llama_vocab & vocab) {
return vocab.special_fim_suf_id;
}
llama_token llama_token_fim_mid_impl(const struct llama_vocab & vocab) {
return vocab.special_fim_mid_id;
}
llama_token llama_token_fim_pad_impl(const struct llama_vocab & vocab) {
return vocab.special_fim_pad_id;
}
llama_token llama_token_fim_rep_impl(const struct llama_vocab & vocab) {
return vocab.special_fim_rep_id;
}
llama_token llama_token_fim_sep_impl(const struct llama_vocab & vocab) {
return vocab.special_fim_sep_id;
}
llama_token llama_token_eot_impl(const struct llama_vocab & vocab) {
return vocab.special_eot_id;
}