* convert_hf_to_gguf for Kimi-K2-Instruct: adapt mainline `PR14653` for the tokenizer while keeping proper MLA tensors. Tested with the following workflow: use DeepSeek's fp8_cast_bf16.py (with triton-cpu) to upcast the fp8 safetensors to bf16 safetensors, then run this convert_hf_to_gguf (see the sketch after this list).
* Add the Kimi-K2 chat template for moonshotai/Kimi-K2-Instruct (https://github.com/ikawrakow/ik_llama.cpp/pull/609#issuecomment-3071259454); a template sketch follows below.
* kimi-k2: add `ass` (the trailing assistant prefix) to the template so the model actually starts a response.
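
A minimal sketch of the upcast-then-convert workflow described in the first bullet, assuming fp8_cast_bf16.py from the DeepSeek-V3 repo (with its `--input-fp8-hf-path`/`--output-bf16-hf-path` flags) and a standard convert_hf_to_gguf.py invocation; the local directory and output names here are hypothetical:

```python
# Hypothetical driver for the fp8 -> bf16 -> GGUF workflow.
# Assumes fp8_cast_bf16.py (DeepSeek-V3 repo) is on hand and that
# triton-cpu is installed so the upcast can run without a GPU.
import subprocess

FP8_DIR = "Kimi-K2-Instruct"        # original fp8 safetensors (assumed path)
BF16_DIR = "Kimi-K2-Instruct-bf16"  # upcast output (assumed path)

# Step 1: upcast the fp8 safetensors to bf16 safetensors.
subprocess.run(
    ["python", "fp8_cast_bf16.py",
     "--input-fp8-hf-path", FP8_DIR,
     "--output-bf16-hf-path", BF16_DIR],
    check=True,
)

# Step 2: convert the bf16 safetensors to GGUF with the patched script,
# which now writes the proper MLA tensors for Kimi-K2.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", BF16_DIR,
     "--outfile", "Kimi-K2-Instruct-bf16.gguf",
     "--outtype", "bf16"],
    check=True,
)
```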
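
And a hedged sketch of the chat template itself: the `<|im_system|>`/`<|im_user|>`/`<|im_assistant|>` role tokens with `<|im_middle|>` and `<|im_end|>` delimiters follow the format discussed in the linked PR comment, but the exact token strings should be verified against the moonshotai/Kimi-K2-Instruct tokenizer config. The `add_ass` flag illustrates the last bullet: without the trailing assistant prefix there is no open assistant turn for the model to complete.

```python
# Sketch of Kimi-K2 prompt rendering; token strings are assumptions
# taken from the linked discussion, not a verified implementation.
def kimi_k2_prompt(messages, add_ass=True):
    """Render chat messages into the Kimi-K2 prompt format."""
    role_tok = {
        "system": "<|im_system|>system",
        "user": "<|im_user|>user",
        "assistant": "<|im_assistant|>assistant",
    }
    out = ""
    for m in messages:
        out += f"{role_tok[m['role']]}<|im_middle|>{m['content']}<|im_end|>"
    # "add ass": append the assistant prefix so generation starts a reply.
    if add_ass:
        out += "<|im_assistant|>assistant<|im_middle|>"
    return out

print(kimi_k2_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
]))
```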