ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-01-26 17:20:01 +00:00

Author	SHA1	Message	Date
firecoperana	a750d3aa03	Fix log issue for llama-cli (#1071) Co-authored-by: firecoperana <firecoperana>	2025-12-16 18:12:16 +01:00
firecoperana	0e91b89cd3	Refactor chat and server file (#1062) * Add alternative log functions * chat: fix int overflow, prevent size calculation in float/double (#17357) * chat: fix int overflow, prevent size calculation in float/double * Update common/chat.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * common : move all common_chat_parse_* to chat-parser.cpp. (#17481) # Conflicts: # common/chat.cpp * server: split server.cpp code into server/common/task/queue/context * Fix compiler warning * Clean up code * common: use native MultiByteToWideChar * move server prompt to server task * Clean code * delete utils.hpp --------- Co-authored-by: firecoperana <firecoperana> Co-authored-by: Xuan-Son Nguyen <son@huggingface.co> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: DAN™ <dranger003@gmail.com>	2025-12-15 08:27:20 +01:00
firecoperana	d7882c3cf8	Tool calls support from mainline (#723) * Tool calls support from mainline * update cmake * revert api for /completions * Fix broken thinking process for gpt-oss * add missing args and fix webui bugs * add missing args and fix webui bugs2 * Fix reasoning format error * add usage * change default post_sampling_probs to true * add back generated_text * Remove server endpoints tests * add log * Chat fixes * Remove logs * webui: revert extra handling of thinking process --------- Co-authored-by: firecoperana <firecoperana> Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>	2025-09-01 08:38:49 +03:00
Kawrakow	0ceeb11721	Merge mainline llama.cpp (#3) * Merging mainline - WIP * Merging mainline - WIP AVX2 and CUDA appear to work. CUDA performance seems slightly (~1-2%) lower as it is so often the case with llama.cpp/ggml after some "improvements" have been made. * Merging mainline - fix Metal * Remove check --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>	2024-07-27 07:55:01 +02:00
Max Krasnyansky	5cc8a89c08	Add support for properly optimized Windows ARM64 builds with LLVM and MSVC (#7191) * logging: add proper checks for clang to avoid errors and warnings with VA_ARGS * build: add CMake Presets and toolchian files for Windows ARM64 * matmul-int8: enable matmul-int8 with MSVC and fix Clang warnings * ci: add support for optimized Windows ARM64 builds with MSVC and LLVM * matmul-int8: fixed typos in q8_0_q8_0 matmuls Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * matmul-int8: remove unnecessary casts in q8_0_q8_0 --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-05-16 12:47:36 +10:00
Andrew Downing	5bc5f1f361	Update LOG_IMPL and LOG_TEE_IMPL (#7029) ROCm clang defines _MSC_VER which results in the wrong implementation of LOG_IMPL and LOG_TEE_IMPL being compiled. This fixes https://github.com/ggerganov/llama.cpp/issues/6972	2024-05-01 23:31:30 +02:00
mgroeber9110	cb34841313	Replace "alternative" boolean operator in conditional compilation directive (#6949)	2024-04-27 21:02:06 +02:00
Neo Zhang Jianyu	41c513730a	[SYCL] fix SYCL backend build on windows is break by LOG() error (#6290) * fix LOG() error for SYCL, enhance erro check by CI * rollback to bash * add newline at end of file	2024-03-25 15:52:41 +08:00
Minsoo Cheong	60dfbb2b55	examples : add "retrieval" (#6193) * add `retrieval` example * add README * minor fixes * cast filepos on print * remove use of variable sized array * store similarities in separate vector * print error on insufficient batch size * fix error message printing * assign n_batch value to n_ubatch * fix param definitions * define retrieval-only parameters in retrieval.cpp * fix `--context-file` option to be provided multiple times for multiple files * use vector for `query_emb` * add usage description in README * fix merge conflict * fix usage printing * remove seed setting * fix lint * increase file read buffer size * retrieval : minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-03-25 09:38:22 +02:00
UEXTM.com	4a69b527db	log : fix MSVC compile errors (#5643) MSVC gives the following error with the existing macros: `Error C2059 : syntax error: ','` This patch adds `##` as a prefix to `__VA_ARGS__` to address this error.	2024-03-08 11:35:04 +02:00
Richard Kiss	1249c107d3	english : use `typos` to fix comments and logs (#4354)	2023-12-12 11:53:36 +02:00
staviq	d722ed0a43	log : make generating separate log files optional (#3787) * impl --log-new, --log-append * Update common/log.h Co-authored-by: cebtenzzre <cebtenzzre@gmail.com> * Update common/log.h Co-authored-by: cebtenzzre <cebtenzzre@gmail.com> * Apply suggestions from code review Co-authored-by: cebtenzzre <cebtenzzre@gmail.com> --------- Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>	2023-11-01 16:18:27 +02:00
Georgi Gerganov	8a2ce10daa	log : disable pid in log filenames	2023-10-25 10:09:16 +03:00
Georgi Gerganov	57dbdbdc54	speculative : add tree-based sampling example (#3624) * sampling : one sequence per sampling context ggml-ci * speculative : add tree-based sampling support ggml-ci * speculative : reuse the n_parallel CLI param * speculative : refactor sampling * examples : fix build after sampling refactoring ggml-ci * batched : fix n_seq_id * sampling : fix malloc ggml-ci * swift : fix build ggml-ci * swift : try to fix build ggml-ci * prompts : add assistant.txt * common : add llama_batch_add() and llama_batch_clear() helpers * speculative : minor refactor ggml-ci * minor : comments + rename ggml-ci * speculative : fix off-by-one for n_drafted * speculative : fix the n_drafted fix + p constants	2023-10-18 16:21:57 +03:00
Cebtenzzre	4cc4f84aea	build : enable more non-default compiler warnings (#3200)	2023-09-28 17:41:44 -04:00
Cebtenzzre	ed5a405c22	examples : replace fprintf to stdout with printf (#3017)	2023-09-05 15:10:27 -04:00
Kerfuffle	9f664f66a4	logging: Fix creating empty file even when disabled (#2966) * logging: Fix creating empty file even when disabled * Minor formatting fix Co-authored-by: staviq <staviq@gmail.com> --------- Co-authored-by: staviq <staviq@gmail.com>	2023-09-02 11:53:55 -06:00
staviq	2f55c84496	logs : fix mingw-like builds (fixes #2898) (#2911) * fix mingw-like builds * formatting * make LOG_COMPAT easier to override and extend * simplify win detection * fix for #2940	2023-09-01 12:07:06 +03:00
staviq	a2588b53e1	main : log file (#2748) * initial, base LOG macro * add .log to .gitignore added basic log file handler * reverted log auto endline to better mimic printf * remove atomics and add dynamic log target * log_enable/disable, LOG_TEE, basic usage doc * update .gitignore * mv include to common, params, help msg * log tostring helpers, token vectors pretty prints * main: replaced fprintf/LOG_TEE, some trace logging * LOG_DISABLE_LOGS compile flag, wrapped f in macros * fix LOG_TEELN and configchecker * stub LOG_DUMP_CMDLINE for WIN32 for now * fix msvc * cleanup main.cpp:273 * fix stray whitespace after master sync * log : fix compile warnings - do not use C++20 stuff - use PRIu64 to print uint64_t - avoid string copies by using const ref - fix ", ##__VA_ARGS__" warnings - compare strings with == and != * log : do not append to existing log + disable file line func by default * log : try to fix Windows build * main : wip logs * main : add trace log * review: macro f lowercase, str append to sstream * review: simplify ifs and str comparisons * fix MSVC, formatting, FMT/VAL placeholders * review: if/else cleanup * review: if/else cleanup (2) * replace _ prefix with _impl suffix --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-08-30 09:29:32 +03:00

19 Commits