Commit Graph

260 Commits

Author SHA1 Message Date
Slaren
4ccd1fa7b4 Make mmap_file static 2023-03-30 12:28:25 -07:00
Slaren
ee1eb8aab0 Fix ggml_init_params in quantize 2023-03-30 12:28:25 -07:00
Slaren
4608b1ee54 Add mmap support for model files 2023-03-30 12:28:25 -07:00
Stephan Walter
f27e27c590 cmake : properly invoke CTest (#629) 2023-03-30 20:56:59 +03:00
Casey Primozic
3b78ca3c81 Remove unused variable (#607)
* It seems some new warning were added recently that exposed this.  I wrote the code that included this unused variable originally and it is indeed not needed.
2023-03-30 17:53:35 +00:00
david raistrick
dc396d9386 make : fix darwin f16c flags check (#615)
...there was no check.  ported upstream from https://github.com/zanussbaum/gpt4all.cpp/pull/2 (I dont see any clean path for upstream patches)
2023-03-30 20:34:45 +03:00
Georgi Gerganov
46bc56c86e ggml : fix NEON signs (close #620, #622) 2023-03-30 20:27:32 +03:00
slaren
c7a5aebde4 Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) 2023-03-30 11:16:30 +02:00
anzz1
357f21576e ci : re-enable AVX512 testing (Windows-MSVC) (#584)
* CI: Re-enable AVX512 testing (Windows-MSVC)

Now with 100% less base64 encoding

* plain __cpuid is enough here
2023-03-29 23:44:39 +03:00
Georgi Gerganov
7639a7c89c ggml : init time on first ggml_init() call 2023-03-29 22:15:34 +03:00
Georgi Gerganov
ed1554989a llama : fix compile warnings when reading the vocab 2023-03-29 22:13:12 +03:00
Georgi Gerganov
169c724830 ggml : add ARM_NEON dequantize_row_q4_1() 2023-03-29 22:10:01 +03:00
Georgi Gerganov
31887afce7 ggml : add ARM_NEON quantize_row_q4_1() 2023-03-29 22:03:07 +03:00
Georgi Gerganov
fe3f4493ec ggml : add ARM_NEON ggml_vec_dot_q4_1() 2023-03-29 22:03:07 +03:00
Pavol Rusnak
f5b1f5b676 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600)
to match filenames of other converters
2023-03-29 20:09:25 +02:00
Thérence
02ddd7f6d9 Create chat-13B.bat (#592)
* Create chat-13B.bat

Same script than chat-13B.sh, but for windows users.
Tested and working on windows 10/11 v 22H2

* Apply suggestions from code review

---------

Co-authored-by: anzz1 <anzz1@live.com>
2023-03-29 20:21:09 +03:00
Georgi Gerganov
32d84d4876 readme : fix typos 2023-03-29 19:38:31 +03:00
Georgi Gerganov
689ed6a51e readme : add GPT4All instructions (close #588) 2023-03-29 19:37:20 +03:00
Georgi Gerganov
39c1b01a04 py : add GPT4All conversion script
For now: copy-paste
Too much time for me to deduplicate the python code
2023-03-29 19:29:52 +03:00
Maël Kerbiriou
462548c4a1 llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) 2023-03-29 19:10:07 +03:00
Tobias Lütke
0f9f0fdabf add example of re-act pattern (#583)
* add example of re-act pattern

* spelling...

* fixed whitespace in reverse prompt issue
2023-03-29 10:10:24 -05:00
anzz1
22ac42c847 Fix GCC warning about binary literal (#595)
0b10101010 -> 0xAA /* 0b10101010 */
2023-03-29 13:20:07 +00:00
anzz1
2b0da79a3a Fix typo in llama.h (#593) 2023-03-29 13:19:29 +00:00
anzz1
77f02cd5d0 Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)
* Enable Fused-Multiply-Add (FMA) instructions on MSVC

__FMA__ macro does not exist in MSVC

* Enable F16C/CVT16 vector extensions on MSVC

__F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512

* MSVC cvt intrinsics

* Add __SSE3__ macro for MSVC too because why not

even though it's not currently used for anything when AVX is defined
2023-03-28 22:44:29 +03:00
anzz1
056cb367c5 CI: fix subdirectory path globbing (#546)
- Changes in subdirectories will now be detecter properly
- (Windows-MSVC) AVX512 tests temporarily disabled
2023-03-28 22:43:25 +03:00
anzz1
651675b679 llama : fix linkage with mingw (#551)
* Revert 7e53955 (#542)

Still needs to be fixed properly

* Fix linking on mingw32
2023-03-28 21:23:09 +03:00
slaren
2fd21ada5b ggml : add AVX2 implementation of quantize_row_q4_1 (#515)
* Add AVX2 implementation of quantize_row_q4_1

* Actually use AVX2

* Make quantize_row_q4_1 static

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 21:06:03 +03:00
thement
23728d6bd2 py : add temporary script to convert old ggml files to newer version (#539)
Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
2023-03-28 20:55:42 +03:00
Tai Duc Nguyen
73978d1ad2 py : add capabiliy to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) 2023-03-28 20:51:29 +03:00
Stephan Walter
223cad655e ggml : refactor quantized processing functions (#509)
* Refactor quantized processing functions

* ggml : minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 20:13:01 +03:00
DooWoong Lee (David)
412b42ed29 py : removed unused model variable and verified that the code functions correctly with vocab_only setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) 2023-03-28 20:02:34 +03:00
Georgi Gerganov
d4cd9f7004 ci : make ctest verbose, hopefully we see what is wrong with the sanitizer 2023-03-28 20:01:09 +03:00
Georgi Gerganov
c4f628288b tests : free llama context at the end of the test 2023-03-28 19:51:55 +03:00
Stephan Walter
188fb59d88 all : be more strict about converting float to double (#458)
* Be more strict about converting float to double

* Test equivalence of round, SILU implementations

Test module is commented out in CMakeLists.txt because the tests may
take a long time, depending on how much the compiler optimizes.

* Fix softmax in perplexity.cpp

* all : prefer float over double where appropriate

* perplexity : add <cmath>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 19:48:20 +03:00
Jed Fox
a9b8ceaea2 deploy : add a Package.swift for SwiftPM support (#393)
* Add a Package.swift for SwiftPM support

* Swap from exclusions to allowlist
2023-03-28 19:39:01 +03:00
Stephan Walter
884f88402f ggml : introduce structs for the q4 data blocks (#356)
* Introduce structs for the q4 data blocks

* ggml : rename quant struct variables + fix ARM_NEON

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 18:56:03 +03:00
Georgi Gerganov
eba3e4dba3 gitignore : add "embedding" 2023-03-28 18:34:35 +03:00
dotpy314
19bf52b793 Check the existence of f16_model_path_base in quantize.py (#574)
Co-authored-by: Jincheng Miao <jincheng.miao@gmail.com>
2023-03-28 18:06:28 +03:00
slaren
9ed607fdd5 Fix usage of F16C intrinsics in AVX code (#563)
* Fix usage of F16C intrinsics in AVX code when F16C is not defined
2023-03-28 17:26:55 +03:00
anzz1
68f43a13dc main.cpp fixes, refactoring (#571)
- main: entering empty line passes back control without new input in interactive/instruct modes
- instruct mode: keep prompt fix
- instruct mode: duplicate instruct prompt fix
- refactor: move common console code from main->common
2023-03-28 17:09:55 +03:00
RJ Adriaansen
d7f5b1ac65 Add embedding example to Makefile (#540) 2023-03-28 09:11:09 +03:00
Marco Matthies
63d2de599a Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542) 2023-03-27 07:55:26 +03:00
Erik Scholz
2c6eed596e ci: add debug build to sanitizer build matrix (#527) 2023-03-26 15:48:40 +00:00
Stephan Walter
180198d957 Fix undefined variables in debug build, remove unused variables (#531) 2023-03-26 15:34:02 +00:00
Juan Calderon-Perez
47fc0b82b4 Add support for linux/arm64 platform during Docker Builds (#514)
* Add support for linux/arm64 platform

* Add platform to versioned builds
2023-03-26 14:48:42 +00:00
Stephan Walter
3b8b2c584a Update README and comments for standalone perplexity tool (#525) 2023-03-26 16:14:01 +03:00
anzz1
a990294c27 [main] fix infinite generation (-n == -1) (#523) 2023-03-26 16:06:10 +03:00
Georgi Gerganov
3600f1d140 Add logo to README.md 2023-03-26 10:20:49 +03:00
Harald Fernengel
85e558b4ad Exit from interactive mode if input stream is bad (#491)
Allow exiting the interactive prompt also with CTRL-D on Unix and CTRL-Z
on Windows.
2023-03-26 08:25:46 +03:00
anzz1
5c63c02491 CI: Run other sanitizer builds even if one fails (#511)
applies only to sanitizer builds so they wont be cancelled
2023-03-26 00:13:28 +02:00