ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-24 00:19:19 +00:00

Author	SHA1	Message	Date
Iwan Kawrakow	6b6d25bfbf	llama: factor out model loader	2025-08-13 12:12:46 +03:00
firecoperana	3f111ad7bb	add dry sampler (#513) * add dry sampler * use vocab instead of model in dry_init function * fix compile error for build test --------- Co-authored-by: firecoperana <firecoperana>	2025-06-19 10:24:53 +03:00
Kawrakow	a051f08b8f	Add copyright notices (#317) Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>	2025-04-07 10:43:26 +02:00
Kawrakow	1b789c983a	Time to fix replace_all (#68) Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>	2024-09-28 17:59:47 +03:00
Kawrakow	8f43e55103	Merge mainline - Aug 12 2024 (#17) * Merge mainline * Fix after merge * Remove CI check --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>	2024-08-12 15:14:32 +02:00
Kawrakow	154e0d75fc	Merge mainline llama.cpp (#3) * Merging mainline - WIP * Merging mainline - WIP AVX2 and CUDA appear to work. CUDA performance seems slightly (~1-2%) lower as it is so often the case with llama.cpp/ggml after some "improvements" have been made. * Merging mainline - fix Metal * Remove check --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>	2024-07-27 07:55:01 +02:00