ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-02-22 14:14:32 +00:00

Author	SHA1	Message	Date
Georgi Gerganov	e27ab8cb60	Update hot topics - RMSnorm	2023-03-16 07:12:12 +02:00
Nebula	1b96142bae	Fix RMS norm in GGML (#191)	2023-03-15 19:29:25 -04:00
hoangmit	12b9bd9b13	Add RMS norm and use it (#187) * add ggml_rms_norm * update op num	2023-03-16 00:41:38 +02:00
moritzbrantner	3ffbb46e32	fixed typo (#178)	2023-03-15 22:35:25 +02:00
Rickey Bowers Jr	f88e2693cc	add SIGINT support for _WIN32 environments (#120) * add SIGINT support for _WIN32 environments * perhaps more consistent	2023-03-15 21:56:24 +02:00
Justin Suess	a4d17b7096	added ctx_size parameter (#148) * added ctx_size parameter * added it in more places * Apply suggestions from code review --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-15 21:42:40 +02:00
Justin Suess	3d4b93a8d4	fixed color reset on exit (#149) * fixed color reset on exit * added sigint handler for ansi_color_reset * Update main.cpp --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-15 21:39:38 +02:00
Musab Gultekin	3a59f2ef9b	Fix potential licensing issue (#126) * Update README.md * Update README.md remove facebook	2023-03-15 21:39:06 +02:00
Ronsor	dba21f1c6f	Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142) There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.	2023-03-15 21:37:50 +02:00
hoangmit	735b1a2aaa	inline -> static inline for "bytesFromNibbles" (#161) Without "static" prefix, it fails to compile in clang	2023-03-15 21:05:14 +02:00
Ronsor	55f8043b2f	Don't use vdotq_s32 if it's not available (#139) * Don't use vdotq_s32 if it's not available `dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available. Reintroduces the code removed in `84d9015` if `__ARM_FEATURE_DOTPROD` isn't defined. * Update ggml.c --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-14 21:34:37 +02:00
Radoslav Gerganov	1db7851d94	Add section to README on how to run the project on Android (#130)	2023-03-14 15:30:08 +02:00
Georgi Gerganov	96d900134f	Add Misc section + update hot topics + minor fixes	2023-03-14 09:43:52 +02:00
Sebastián A	7925ae2017	Add windows to the CI (#98)	2023-03-13 22:29:10 +02:00
Georgi Gerganov	3e39a552f6	CMake build in Release by default (#75)	2023-03-13 21:22:15 +02:00
Georgi Gerganov	3cd7c8e227	Update contribution section, hot topics, limitations, etc.	2023-03-13 19:21:51 +02:00
Georgi Gerganov	222ee5f918	Print system information	2023-03-13 19:15:08 +02:00
Sebastián A	4acda08f42	Initial support for CMake (#75)	2023-03-13 19:12:33 +02:00
Thomas Klausner	d3ed019b74	Add NetBSD support. (#90)	2023-03-13 18:40:54 +02:00
Pavol Rusnak	e429f5b9e0	Use fprintf for diagnostic output (#48) keep printf only for printing model output one can now use ./main ... 2>/dev/null to suppress any diagnostic output	2023-03-13 18:39:56 +02:00
Georgi Gerganov	c1eebc2a25	Use vdotq_s32 to improve performance (#67) * 10% performance boost on ARM * Back to original change	2023-03-13 18:36:44 +02:00
uint256_t	a81c113197	Reduce model loading time (#43) * Use buffering * Use vector * Minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-13 18:33:43 +02:00
Val Kharitonov	d35d36dff8	Fix UTF-8 handling (including colors) (#79)	2023-03-13 18:24:18 +02:00
Pavol Rusnak	b84a31d659	Add quantize script for batch quantization (#92) * Add quantize script for batch quantization * Indentation * README for new quantize.sh * Fix script name * Fix file list on Mac OS --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-13 18:15:20 +02:00
Georgi Gerganov	67d50a97b4	Add initial contribution guidelines	2023-03-13 09:42:26 +02:00
Matvey Soloviev	00be0e42e4	Gate signal support on being on a unixoid system. (#74)	2023-03-13 04:08:01 +01:00
Matvey Soloviev	a30749e299	Fix token count accounting	2023-03-13 01:04:41 +01:00
Georgi Gerganov	49a8c7675b	Revert "10% performance boost on ARM" This reverts commit `113a9e83eb`. There are some reports for illegal instruction. Moved this stuff to vdotq_s32 branch until resolve	2023-03-13 01:28:08 +02:00
Georgi Gerganov	c47fa0ea5e	Check for vdotq_s32 availability	2023-03-13 01:21:03 +02:00
Georgi Gerganov	c00675331e	Amend to previous commit - forgot to update non-QRDMX branch	2023-03-13 01:05:24 +02:00
Georgi Gerganov	f48b7628ea	10% performance boost on ARM	2023-03-13 00:56:10 +02:00
Matvey Soloviev	fedc405b41	Fix color getting reset before prompt output done (#65) (cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)	2023-03-13 00:07:34 +02:00
Georgi Gerganov	c240cd1e05	Update README.md	2023-03-12 23:39:01 +02:00
Matvey Soloviev	d35528087e	Add interactive mode (#61) * Initial work on interactive mode. * Improve interactive mode. Make rev. prompt optional. * Update README to explain interactive mode. * Fix OS X build	2023-03-12 23:13:28 +02:00
Marc Köhlbrugge	8de246c2d8	Fix typo in README (#45)	2023-03-12 22:30:08 +02:00
Ben Garney	7a708ee9b0	Allow using prompt files (#59)	2023-03-12 22:28:36 +02:00
beiller	c763dc1bc2	Add back top_k (#56) * Add back top_k * Update utils.cpp * Update utils.h --------- Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-12 22:23:15 +02:00
Sebastián A	fde84afbed	Windows fixes (#31) * Apply fixes suggested to build on windows Issue: https://github.com/ggerganov/llama.cpp/issues/22 * Remove unsupported VLAs * MSVC: Remove features that are only available on MSVC C++20. * Fix zero initialization of the other fields. * Change the use of vector for stack allocations.	2023-03-12 22:15:00 +02:00
Georgi Gerganov	f6f3f1c7c1	Update README.md	2023-03-12 22:09:26 +02:00
Georgi Gerganov	1f0283048a	Add CI (#60)	2023-03-12 22:08:24 +02:00
Georgi Gerganov	85c71945cf	Revert "weights_only" arg - this causing more trouble than help	2023-03-12 20:59:01 +02:00
Oleksandr Nikitin	a7cf72d75e	python/pytorch compat notes (#44)	2023-03-12 14:16:33 +02:00
beiller	a63a748bba	Add repetition penalty (#20) * Adding repeat penalization * Update utils.h * Update utils.cpp * Numeric fix Should probably still scale by temp even if penalized * Update comments, more proper application I see that numbers can go negative so a fix from a referenced commit * Minor formatting --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-12 11:27:42 +02:00
Georgi Gerganov	dc91ec5d67	Clarify meaning of hacking	2023-03-12 09:03:25 +02:00
Georgi Gerganov	95fb97b137	README: add "Supported platforms" + update hot topics	2023-03-12 08:41:54 +02:00
deepdiffuser	a9b53a036b	use weights_only in conversion script (#32) this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries	2023-03-12 08:36:35 +02:00
Pavol Rusnak	865eff3820	Add LICENSE (#21)	2023-03-12 08:36:03 +02:00
Georgi Gerganov	e34e3e21c4	Update README.md	2023-03-12 01:26:32 +02:00
Juraj Bednar	4cdcd39348	Fix a typo in model name (#16)	2023-03-11 19:32:20 +02:00
Georgi Gerganov	284d9be2de	Update README.md	2023-03-11 18:10:18 +02:00

73 Commits