ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-02-08 15:30:15 +00:00

Author	SHA1	Message	Date
Georgi Gerganov	e7a75316dc	Minor style changes	2023-03-21 18:10:32 +02:00
Georgi Gerganov	f57b30a8e1	Add chat.sh script	2023-03-21 18:09:46 +02:00
Georgi Gerganov	614b1afa1c	Fix convert script, warnings alpaca instructions, default params	2023-03-21 17:59:16 +02:00
Kevin Kwok	9c26af616b	Update IPFS links to quantized alpaca with new tokenizer format (#352)	2023-03-21 17:34:49 +02:00
Mack Straight	60d93896be	sentencepiece bpe compatible tokenizer (#252) * potential out of bounds read * fix quantize * style * Update convert-pth-to-ggml.py * mild cleanup * don't need the space-prefixing here rn since main.cpp already does it * new file magic + version header field * readme notice * missing newlines Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>	2023-03-20 03:17:23 -07:00
Suaj Carrot	3eaf221dee	Improved quantize script (#222) * Improved quantize script I improved the quantize script by adding error handling and allowing to select many models for quantization at once in the command line. I also converted it to Python for generalization as well as extensibility. * Fixes and improvements based on Matt's observations Fixed and improved many things in the script based on the reviews made by @mattsta. The parallelization suggestion is still to be revised, but code for it was still added (commented). * Small fixes to the previous commit * Corrected to use the original glob pattern The original Bash script uses a glob pattern to match files that have endings such as ...bin.0, ...bin.1, etc. That has been translated correctly to Python now. * Added support for Windows and updated README to use this script New code to set the name of the quantize script binary depending on the platform has been added (quantize.exe if working on Windows) and the README.md file has been updated to use this script instead of the Bash one. * Fixed a typo and removed shell=True in the subprocess.run call Fixed a typo regarding the new filenames of the quantized models and removed the shell=True parameter in the subprocess.run call as it was conflicting with the list of parameters. * Corrected previous commit * Small tweak: changed the name of the program in argparse This was making the automatic help message to be suggesting the program's usage as being literally "$ Quantization Script [arguments]". It should now be something like "$ python3 quantize.py [arguments]".	2023-03-19 20:38:44 +02:00
Georgi Gerganov	eed44b2875	Update hot topics to mention Alpaca support	2023-03-19 19:51:55 +02:00
Georgi Gerganov	b3bd91ce9d	Add instruction for using Alpaca (#240)	2023-03-19 18:49:50 +02:00
Pavol Rusnak	f9cb6f8979	Fix typo in readme	2023-03-18 23:18:04 +01:00
Pavol Rusnak	cc4ace10bf	Add note about Python 3.11 to readme	2023-03-18 22:25:35 +01:00
Pavol Rusnak	0bfb4f160f	Add memory/disk requirements to readme	2023-03-18 22:25:35 +01:00
Georgi Gerganov	b57c1e4295	Update Contributing section	2023-03-17 20:30:04 +02:00
Stephan Walter	45113b2f42	Don't tell users to use a bad number of threads (#243) The readme tells people to use the command line option "-t 8", causing 8 threads to be started. On systems with fewer than 8 cores, this causes a significant slowdown. Remove the option from the example command lines and use /proc/cpuinfo on Linux to determine a sensible default.	2023-03-17 19:47:35 +02:00
Bernat Vadell	afcd16588e	🚀 Dockerize llamacpp (#132) * feat: dockerize llamacpp * feat: split build & runtime stages * split dockerfile into main & tools * add quantize into tool docker image * Update .devops/tools.sh Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * add docker action pipeline * change CI to publish at github docker registry * fix name runs-on macOS-latest is macos-latest (lowercase) * include docker versioned images * fix github action docker * fix docker.yml * feat: include all-in-one command tool & update readme.md --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-17 10:47:06 +01:00
Georgi Gerganov	b73c11ea8c	Update README.md	2023-03-16 15:00:09 +02:00
Georgi Gerganov	be7fc04a86	Expand "Contributing" section	2023-03-16 08:55:13 +02:00
Georgi Gerganov	e27ab8cb60	Update hot topics - RMSnorm	2023-03-16 07:12:12 +02:00
moritzbrantner	3ffbb46e32	fixed typo (#178)	2023-03-15 22:35:25 +02:00
Musab Gultekin	3a59f2ef9b	Fix potential licensing issue (#126) * Update README.md * Update README.md remove facebook	2023-03-15 21:39:06 +02:00
Radoslav Gerganov	1db7851d94	Add section to README on how to run the project on Android (#130)	2023-03-14 15:30:08 +02:00
Georgi Gerganov	96d900134f	Add Misc section + update hot topics + minor fixes	2023-03-14 09:43:52 +02:00
Georgi Gerganov	3cd7c8e227	Update contribution section, hot topics, limitations, etc.	2023-03-13 19:21:51 +02:00
Pavol Rusnak	b84a31d659	Add quantize script for batch quantization (#92) * Add quantize script for batch quantization * Indentation * README for new quantize.sh * Fix script name * Fix file list on Mac OS --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-13 18:15:20 +02:00
Georgi Gerganov	67d50a97b4	Add initial contribution guidelines	2023-03-13 09:42:26 +02:00
Georgi Gerganov	c240cd1e05	Update README.md	2023-03-12 23:39:01 +02:00
Matvey Soloviev	d35528087e	Add interactive mode (#61) * Initial work on interactive mode. * Improve interactive mode. Make rev. prompt optional. * Update README to explain interactive mode. * Fix OS X build	2023-03-12 23:13:28 +02:00
Marc Köhlbrugge	8de246c2d8	Fix typo in README (#45)	2023-03-12 22:30:08 +02:00
Georgi Gerganov	f6f3f1c7c1	Update README.md	2023-03-12 22:09:26 +02:00
Georgi Gerganov	85c71945cf	Revert "weights_only" arg - this causing more trouble than help	2023-03-12 20:59:01 +02:00
Oleksandr Nikitin	a7cf72d75e	python/pytorch compat notes (#44)	2023-03-12 14:16:33 +02:00
Georgi Gerganov	dc91ec5d67	Clarify meaning of hacking	2023-03-12 09:03:25 +02:00
Georgi Gerganov	95fb97b137	README: add "Supported platforms" + update hot topics	2023-03-12 08:41:54 +02:00
Georgi Gerganov	e34e3e21c4	Update README.md	2023-03-12 01:26:32 +02:00
Juraj Bednar	4cdcd39348	Fix a typo in model name (#16)	2023-03-11 19:32:20 +02:00
Georgi Gerganov	284d9be2de	Update README.md	2023-03-11 18:10:18 +02:00
Georgi Gerganov	cc0f26bef3	Add AVX2 support for x86 architectures thanks to @Const-me !	2023-03-11 18:04:25 +02:00
Georgi Gerganov	35cb0d2a39	Update README.md	2023-03-11 12:31:21 +02:00
Georgi Gerganov	2d2cadab68	Update Makefile var + add comment	2023-03-11 12:27:02 +02:00
Georgi Gerganov	657074b014	Update README.md	2023-03-11 11:34:25 +02:00
Georgi Gerganov	b53c6356f3	Update README.md	2023-03-11 11:34:11 +02:00
Georgi Gerganov	a2799521b9	Support all LLaMA models + change Q4_0 quantization storage	2023-03-11 11:28:30 +02:00
Simon Willison	d4919344b1	Include Python dependencies in README (#6)	2023-03-11 07:47:26 +02:00
Georgi Gerganov	11dae511e3	Update README.md	2023-03-11 01:30:47 +02:00
Georgi Gerganov	240b0bf6ea	Update README.md	2023-03-11 01:22:58 +02:00
Georgi Gerganov	87da10c739	Update README.md	2023-03-11 01:18:10 +02:00
Georgi Gerganov	01e3d38e1c	Update README.md	2023-03-11 00:55:22 +02:00
Georgi Gerganov	8d38e7e279	Update README.md	2023-03-11 00:51:46 +02:00
Georgi Gerganov	586e0f1f3d	Update README.md	2023-03-11 00:09:19 +02:00
Georgi Gerganov	4c7f13c170	Update README.md	2023-03-10 23:53:11 +02:00
Georgi Gerganov	44f3a5b932	Update README.md	2023-03-10 21:52:27 +02:00

52 Commits