* convert_hf_to_gguf for Kimi-K2-Instruct: adapt mainline `PR14653` for the tokenizer while keeping proper MLA tensors. Tested with the following workflow: use DeepSeek's fp8_cast_bf16.py (with triton-cpu) to upcast the fp8 safetensors to bf16 safetensors, then run this convert_hf_to_gguf (see the sketch after this list).
* Add the Kimi-K2 chat template for moonshotai/Kimi-K2-Instruct (https://github.com/ikawrakow/ik_llama.cpp/pull/609#issuecomment-3071259454); a template sketch follows below.
* kimi-k2: add `ass` (the trailing assistant prefix) to the template so the model actually starts a response.
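
A minimal sketch of the upcast-then-convert workflow described in the first bullet, assuming fp8_cast_bf16.py from the DeepSeek-V3 repo (with its `--input-fp8-hf-path`/`--output-bf16-hf-path` flags) and a standard convert_hf_to_gguf.py invocation; the local directory and output names here are hypothetical:

```python
# Hypothetical driver for the fp8 -> bf16 -> GGUF workflow.
# Assumes fp8_cast_bf16.py (DeepSeek-V3 repo) is on hand and that
# triton-cpu is installed so the upcast can run without a GPU.
import subprocess

FP8_DIR = "Kimi-K2-Instruct"        # original fp8 safetensors (assumed path)
BF16_DIR = "Kimi-K2-Instruct-bf16"  # upcast output (assumed path)

# Step 1: upcast the fp8 safetensors to bf16 safetensors.
subprocess.run(
    ["python", "fp8_cast_bf16.py",
     "--input-fp8-hf-path", FP8_DIR,
     "--output-bf16-hf-path", BF16_DIR],
    check=True,
)

# Step 2: convert the bf16 safetensors to GGUF with the patched script,
# which now writes the proper MLA tensors for Kimi-K2.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", BF16_DIR,
     "--outfile", "Kimi-K2-Instruct-bf16.gguf",
     "--outtype", "bf16"],
    check=True,
)
```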
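
And a hedged sketch of the chat template itself: the `<|im_system|>`/`<|im_user|>`/`<|im_assistant|>` role tokens with `<|im_middle|>` and `<|im_end|>` delimiters follow the format discussed in the linked PR comment, but the exact token strings should be verified against the moonshotai/Kimi-K2-Instruct tokenizer config. The `add_ass` flag illustrates the last bullet: without the trailing assistant prefix there is no open assistant turn for the model to complete.

```python
# Sketch of Kimi-K2 prompt rendering; token strings are assumptions
# taken from the linked discussion, not a verified implementation.
def kimi_k2_prompt(messages, add_ass=True):
    """Render chat messages into the Kimi-K2 prompt format."""
    role_tok = {
        "system": "<|im_system|>system",
        "user": "<|im_user|>user",
        "assistant": "<|im_assistant|>assistant",
    }
    out = ""
    for m in messages:
        out += f"{role_tok[m['role']]}<|im_middle|>{m['content']}<|im_end|>"
    # "add ass": append the assistant prefix so generation starts a reply.
    if add_ass:
        out += "<|im_assistant|>assistant<|im_middle|>"
    return out

print(kimi_k2_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
]))
```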