Georgi Gerganov
e27ab8cb60
Update hot topics - RMSnorm
2023-03-16 07:12:12 +02:00
Nebula
1b96142bae
Fix RMS norm in GGML ( #191 )
2023-03-15 19:29:25 -04:00
hoangmit
12b9bd9b13
Add RMS norm and use it ( #187 )
...
* add ggml_rms_norm
* update op num
2023-03-16 00:41:38 +02:00
moritzbrantner
3ffbb46e32
fixed typo ( #178 )
2023-03-15 22:35:25 +02:00
Rickey Bowers Jr
f88e2693cc
add SIGINT support for _WIN32 environments ( #120 )
...
* add SIGINT support for _WIN32 environments
* perhaps more consistent
2023-03-15 21:56:24 +02:00
Justin Suess
a4d17b7096
added ctx_size parameter ( #148 )
...
* added ctx_size parameter
* added it in more places
* Apply suggestions from code review
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15 21:42:40 +02:00
Justin Suess
3d4b93a8d4
fixed color reset on exit ( #149 )
...
* fixed color reset on exit
* added sigint handler for ansi_color_reset
* Update main.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15 21:39:38 +02:00
Musab Gultekin
3a59f2ef9b
Fix potential licensing issue ( #126 )
...
* Update README.md
* Update README.md
remove facebook
2023-03-15 21:39:06 +02:00
Ronsor
dba21f1c6f
Use tokenizer.vocab_size() instead of hardcoding 32000 in convert-pth-to-ggml.py ( #142 )
...
There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.
2023-03-15 21:37:50 +02:00
hoangmit
735b1a2aaa
inline -> static inline for "bytesFromNibbles" ( #161 )
...
Without "static" prefix, it fails to compile in clang
2023-03-15 21:05:14 +02:00
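A likely reason for the clang failure described in the commit above: under C99 rules, a function defined with plain `inline` does not by itself provide an external definition, so when the compiler chooses not to inline a call (as clang often does without optimization), the link step cannot find the symbol. Adding `static` gives the function internal linkage and avoids the problem. The sketch below illustrates the pattern with a hypothetical helper, `unpack_nibbles`; it is not the actual `bytesFromNibbles` code from the repository.

```c
#include <stdint.h>

/* Hypothetical helper (not the real bytesFromNibbles): splits each packed
 * byte into its low and high 4-bit halves.
 *
 * `static inline` gives the function internal linkage, so every translation
 * unit that includes this definition has its own copy available even when
 * the compiler decides not to inline the call. */
static inline void unpack_nibbles(const uint8_t *packed, uint8_t *out, int n_bytes) {
    for (int i = 0; i < n_bytes; i++) {
        out[2 * i]     = packed[i] & 0x0F; /* low nibble  */
        out[2 * i + 1] = packed[i] >> 4;   /* high nibble */
    }
}
```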
Ronsor
55f8043b2f
Don't use vdotq_s32 if it's not available ( #139 )
...
* Don't use vdotq_s32 if it's not available
`dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available.
Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined.
* Update ggml.c
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-14 21:34:37 +02:00
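The feature check described in the commit above can be sketched as follows. This is a minimal illustration, not the code from `ggml.c`: the function name `dot_i8x16` is hypothetical, the extra `__aarch64__` guard is an assumption made here so that `vaddvq_s32` is available, and the fallback is a plain scalar loop rather than the NEON code the commit reintroduces.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dot product of two 16-element int8 vectors.
 * The dotprod path is compiled only when the compiler reports the
 * extension through __ARM_FEATURE_DOTPROD. */
static int32_t dot_i8x16(const int8_t *a, const int8_t *b) {
#if defined(__ARM_FEATURE_DOTPROD) && defined(__aarch64__)
    /* vdotq_s32 accumulates four 4-way int8 dot products into int32 lanes. */
    int32x4_t acc = vdotq_s32(vdupq_n_s32(0), vld1q_s8(a), vld1q_s8(b));
    return vaddvq_s32(acc); /* horizontal sum of the four lanes */
#else
    /* Portable fallback for CPUs without the dotprod extension. */
    int32_t sum = 0;
    for (int i = 0; i < 16; i++) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
#endif
}
```

Because the check happens at compile time, a binary built with the dotprod path enabled will still fail on a CPU that lacks the extension; the guard only protects builds that target such CPUs.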
Radoslav Gerganov
1db7851d94
Add section to README on how to run the project on Android ( #130 )
2023-03-14 15:30:08 +02:00
Georgi Gerganov
96d900134f
Add Misc section + update hot topics + minor fixes
2023-03-14 09:43:52 +02:00
Sebastián A
7925ae2017
Add windows to the CI ( #98 )
2023-03-13 22:29:10 +02:00
Georgi Gerganov
3e39a552f6
CMake build in Release by default ( #75 )
2023-03-13 21:22:15 +02:00
Georgi Gerganov
3cd7c8e227
Update contribution section, hot topics, limitations, etc.
2023-03-13 19:21:51 +02:00
Georgi Gerganov
222ee5f918
Print system information
2023-03-13 19:15:08 +02:00
Sebastián A
4acda08f42
Initial support for CMake ( #75 )
2023-03-13 19:12:33 +02:00
Thomas Klausner
d3ed019b74
Add NetBSD support. ( #90 )
2023-03-13 18:40:54 +02:00
Pavol Rusnak
e429f5b9e0
Use fprintf for diagnostic output ( #48 )
...
keep printf only for printing model output
one can now use ./main ... 2>/dev/null to suppress any diagnostic output
2023-03-13 18:39:56 +02:00
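The split described in the commit above, diagnostics on stderr and model output on stdout, can be sketched as below. The function names `log_diag` and `emit_text` are hypothetical and chosen only for illustration; they do not appear in the repository.

```c
#include <stdio.h>

/* Hypothetical helper: diagnostics go to stderr so they can be redirected
 * or discarded without touching the generated text.
 * Returns the number of characters written. */
static int log_diag(int n_tokens) {
    return fprintf(stderr, "processed %d tokens\n", n_tokens);
}

/* Hypothetical helper: model output goes to stdout and is flushed so it
 * appears immediately. Returns the number of characters written. */
static int emit_text(const char *text) {
    int n = printf("%s", text);
    fflush(stdout);
    return n;
}
```

With this separation, redirecting stderr (for example `2>/dev/null` in a POSIX shell) leaves only the generated text on the terminal or in a pipe.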
Georgi Gerganov
c1eebc2a25
Use vdotq_s32 to improve performance ( #67 )
...
* 10% performance boost on ARM
* Back to original change
2023-03-13 18:36:44 +02:00
uint256_t
a81c113197
Reduce model loading time ( #43 )
...
* Use buffering
* Use vector
* Minor
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13 18:33:43 +02:00
Val Kharitonov
d35d36dff8
Fix UTF-8 handling (including colors) ( #79 )
2023-03-13 18:24:18 +02:00
Pavol Rusnak
b84a31d659
Add quantize script for batch quantization ( #92 )
...
* Add quantize script for batch quantization
* Indentation
* README for new quantize.sh
* Fix script name
* Fix file list on Mac OS
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13 18:15:20 +02:00
Georgi Gerganov
67d50a97b4
Add initial contribution guidelines
2023-03-13 09:42:26 +02:00
Matvey Soloviev
00be0e42e4
Gate signal support on being on a unixoid system. ( #74 )
2023-03-13 04:08:01 +01:00
Matvey Soloviev
a30749e299
Fix token count accounting
2023-03-13 01:04:41 +01:00
Georgi Gerganov
49a8c7675b
Revert "10% performance boost on ARM"
...
This reverts commit 113a9e83eb.
There are some reports of illegal instruction errors.
Moved this stuff to the vdotq_s32 branch until resolved
2023-03-13 01:28:08 +02:00
Georgi Gerganov
c47fa0ea5e
Check for vdotq_s32 availability
2023-03-13 01:21:03 +02:00
Georgi Gerganov
c00675331e
Amend to previous commit - forgot to update non-QRDMX branch
2023-03-13 01:05:24 +02:00
Georgi Gerganov
f48b7628ea
10% performance boost on ARM
2023-03-13 00:56:10 +02:00
Matvey Soloviev
fedc405b41
Fix color getting reset before prompt output done ( #65 )
...
(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)
2023-03-13 00:07:34 +02:00
Georgi Gerganov
c240cd1e05
Update README.md
2023-03-12 23:39:01 +02:00
Matvey Soloviev
d35528087e
Add interactive mode ( #61 )
...
* Initial work on interactive mode.
* Improve interactive mode. Make rev. prompt optional.
* Update README to explain interactive mode.
* Fix OS X build
2023-03-12 23:13:28 +02:00
Marc Köhlbrugge
8de246c2d8
Fix typo in README ( #45 )
2023-03-12 22:30:08 +02:00
Ben Garney
7a708ee9b0
Allow using prompt files ( #59 )
2023-03-12 22:28:36 +02:00
beiller
c763dc1bc2
Add back top_k ( #56 )
...
* Add back top_k
* Update utils.cpp
* Update utils.h
---------
Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-12 22:23:15 +02:00
Sebastián A
fde84afbed
Windows fixes ( #31 )
...
* Apply fixes suggested to build on windows
Issue: https://github.com/ggerganov/llama.cpp/issues/22
* Remove unsupported VLAs
* MSVC: Remove features that are only available on MSVC C++20.
* Fix zero initialization of the other fields.
* Change the use of vector for stack allocations.
2023-03-12 22:15:00 +02:00
Georgi Gerganov
f6f3f1c7c1
Update README.md
2023-03-12 22:09:26 +02:00
Georgi Gerganov
1f0283048a
Add CI ( #60 )
2023-03-12 22:08:24 +02:00
Georgi Gerganov
85c71945cf
Revert "weights_only" arg - this is causing more trouble than help
2023-03-12 20:59:01 +02:00
Oleksandr Nikitin
a7cf72d75e
python/pytorch compat notes ( #44 )
2023-03-12 14:16:33 +02:00
beiller
a63a748bba
Add repetition penalty ( #20 )
...
* Adding repeat penalization
* Update utils.h
* Update utils.cpp
* Numeric fix
Should probably still scale by temp even if penalized
* Update comments, more proper application
I see that numbers can go negative so a fix from a referenced commit
* Minor formatting
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-12 11:27:42 +02:00
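The two notes in the commit above (still scale by temperature when penalized, and handle negative values) suggest logic along the following lines. This is a sketch under assumptions, not the code from `utils.cpp`: the function name and parameters are hypothetical, and the idea is that a recently seen token's logit is divided by the penalty when positive and multiplied when negative, so that in both cases the token becomes less likely.

```c
#include <stddef.h>

/* Hypothetical sketch of a repetition penalty applied in place to logits.
 * Every logit is scaled by 1/temp; tokens present in last_tokens are
 * additionally penalized. */
static void apply_repeat_penalty(float *logits, size_t n_vocab,
                                 const int *last_tokens, size_t n_last,
                                 float repeat_penalty, float temp) {
    const float scale = 1.0f / temp;
    for (size_t i = 0; i < n_vocab; i++) {
        int seen = 0;
        for (size_t j = 0; j < n_last; j++) {
            if (last_tokens[j] == (int)i) { seen = 1; break; }
        }
        float v = logits[i] * scale; /* temperature applies to every token */
        if (seen) {
            /* Dividing a negative logit would raise it toward zero and make
             * the token more likely, so multiply in that case instead. */
            v = (v < 0.0f) ? v * repeat_penalty : v / repeat_penalty;
        }
        logits[i] = v;
    }
}
```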
Georgi Gerganov
dc91ec5d67
Clarify meaning of hacking
2023-03-12 09:03:25 +02:00
Georgi Gerganov
95fb97b137
README: add "Supported platforms" + update hot topics
2023-03-12 08:41:54 +02:00
deepdiffuser
a9b53a036b
use weights_only in conversion script ( #32 )
...
this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries
2023-03-12 08:36:35 +02:00
Pavol Rusnak
865eff3820
Add LICENSE ( #21 )
2023-03-12 08:36:03 +02:00
Georgi Gerganov
e34e3e21c4
Update README.md
2023-03-12 01:26:32 +02:00
Juraj Bednar
4cdcd39348
Fix a typo in model name ( #16 )
2023-03-11 19:32:20 +02:00
Georgi Gerganov
284d9be2de
Update README.md
2023-03-11 18:10:18 +02:00