Commit Graph

  • cea1c85948 ggml : add ARM_NEON dequantize_row_q4_1() Georgi Gerganov 2023-03-29 22:10:01 +03:00
  • 31887afce7 ggml : add ARM_NEON quantize_row_q4_1() Georgi Gerganov 2023-03-29 22:03:02 +03:00
  • f202ada131 ggml : add ARM_NEON quantize_row_q4_1() Georgi Gerganov 2023-03-29 22:03:02 +03:00
  • fe3f4493ec ggml : add ARM_NEON ggml_vec_dot_q4_1() Georgi Gerganov 2023-03-29 21:47:33 +03:00
  • 3b44d30d9b ggml : add ARM_NEON ggml_vec_dot_q4_1() Georgi Gerganov 2023-03-29 21:47:33 +03:00
  • f5b1f5b676 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) Pavol Rusnak 2023-03-29 20:09:25 +02:00
  • 61cbfff5c9 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) Pavol Rusnak 2023-03-29 20:09:25 +02:00
  • 02ddd7f6d9 Create chat-13B.bat (#592) Thérence 2023-03-29 19:21:09 +02:00
  • d9ad104440 Create chat-13B.bat (#592) Thérence 2023-03-29 19:21:09 +02:00
  • 32d84d4876 readme : fix typos Georgi Gerganov 2023-03-29 19:38:31 +03:00
  • b467702b87 readme : fix typos Georgi Gerganov 2023-03-29 19:38:31 +03:00
  • 689ed6a51e readme : add GPT4All instructions (close #588) Georgi Gerganov 2023-03-29 19:37:20 +03:00
  • 516d88e75c readme : add GPT4All instructions (close #588) Georgi Gerganov 2023-03-29 19:37:20 +03:00
  • 39c1b01a04 py : add GPT4All conversion script Georgi Gerganov 2023-03-29 19:29:26 +03:00
  • 53635c081c py : add GPT4All conversion script Georgi Gerganov 2023-03-29 19:29:26 +03:00
  • 462548c4a1 llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) Maël Kerbiriou 2023-03-29 18:10:07 +02:00
  • 41318d708e llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) Maël Kerbiriou 2023-03-29 18:10:07 +02:00
  • 0f9f0fdabf add example of re-act pattern (#583) Tobias Lütke 2023-03-29 17:10:24 +02:00
  • a6956b25a1 add example of re-act pattern (#583) Tobias Lütke 2023-03-29 17:10:24 +02:00
  • 22ac42c847 Fix GCC warning about binary literal (#595) anzz1 2023-03-29 16:20:07 +03:00
  • 83df5639eb Fix GCC warning about binary literal (#595) anzz1 2023-03-29 16:20:07 +03:00
  • 2b0da79a3a Fix typo in llama.h (#593) anzz1 2023-03-29 16:19:29 +03:00
  • a5c42c4b13 Fix typo in llama.h (#593) anzz1 2023-03-29 16:19:29 +03:00
  • 77f02cd5d0 Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) anzz1 2023-03-28 22:44:29 +03:00
  • 5a5f8b1501 Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) anzz1 2023-03-28 22:44:29 +03:00
  • 056cb367c5 CI: fix subdirectory path globbing (#546) anzz1 2023-03-28 22:43:25 +03:00
  • f1217055ea CI: fix subdirectory path globbing (#546) anzz1 2023-03-28 22:43:25 +03:00
  • 651675b679 llama : fix linkage with mingw (#551) anzz1 2023-03-28 21:23:09 +03:00
  • 7f4c5c6651 llama : fix linkage with mingw (#551) anzz1 2023-03-28 21:23:09 +03:00
  • 2fd21ada5b ggml : add AVX2 implementation of quantize_row_q4_1 (#515) slaren 2023-03-28 20:06:03 +02:00
  • 2a98bc18ea ggml : add AVX2 implementation of quantize_row_q4_1 (#515) slaren 2023-03-28 20:06:03 +02:00
  • 23728d6bd2 py : add temporary script to convert old ggml files to newer version (#539) thement 2023-03-28 19:55:42 +02:00
  • d0aaff571c py : add temporary script to convert old ggml files to newer version (#539) thement 2023-03-28 19:55:42 +02:00
  • 73978d1ad2 py : add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) Tai Duc Nguyen 2023-03-28 13:51:29 -04:00
  • d0330fd783 py : add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) Tai Duc Nguyen 2023-03-28 13:51:29 -04:00
  • 223cad655e ggml : refactor quantized processing functions (#509) Stephan Walter 2023-03-28 17:13:01 +00:00
  • 99c5b27654 ggml : refactor quantized processing functions (#509) Stephan Walter 2023-03-28 17:13:01 +00:00
  • 412b42ed29 py : removed unused model variable and verified that the code functions correctly with vocab_only setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) DooWoong Lee (David) 2023-03-29 02:02:34 +09:00
  • 692ce3164e py : removed unused model variable and verified that the code functions correctly with vocab_only setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) DooWoong Lee (David) 2023-03-29 02:02:34 +09:00
  • d4cd9f7004 ci : make ctest verbose, hopefully we see what is wrong with the sanitizer Georgi Gerganov 2023-03-28 20:01:09 +03:00
  • 96f9c0506f ci : make ctest verbose, hopefully we see what is wrong with the sanitizer Georgi Gerganov 2023-03-28 20:01:09 +03:00
  • c4f628288b tests : free llama context at the end of the test Georgi Gerganov 2023-03-28 19:51:55 +03:00
  • d502bc7c9d tests : free llama context at the end of the test Georgi Gerganov 2023-03-28 19:51:55 +03:00
  • 188fb59d88 all : be more strict about converting float to double (#458) Stephan Walter 2023-03-28 16:48:20 +00:00
  • 436e561931 all : be more strict about converting float to double (#458) Stephan Walter 2023-03-28 16:48:20 +00:00
  • a9b8ceaea2 deploy : add a Package.swift for SwiftPM support (#393) Jed Fox 2023-03-28 11:39:01 -05:00
  • 20e1e84884 deploy : add a Package.swift for SwiftPM support (#393) Jed Fox 2023-03-28 11:39:01 -05:00
  • 884f88402f ggml : introduce structs for the q4 data blocks (#356) Stephan Walter 2023-03-28 15:56:03 +00:00
  • c1f885067c ggml : introduce structs for the q4 data blocks (#356) Stephan Walter 2023-03-28 15:56:03 +00:00
  • eba3e4dba3 gitignore : add "embedding" Georgi Gerganov 2023-03-28 18:34:35 +03:00
  • e0670260fb gitignore : add "embedding" Georgi Gerganov 2023-03-28 18:34:35 +03:00
  • 19bf52b793 Check the existence of f16_model_path_base in quantize.py (#574) dotpy314 2023-03-28 23:06:28 +08:00
  • 28ba975aea Check the existence of f16_model_path_base in quantize.py (#574) dotpy314 2023-03-28 23:06:28 +08:00
  • 9ed607fdd5 Fix usage of F16C intrinsics in AVX code (#563) slaren 2023-03-28 16:26:55 +02:00
  • a6bdc47cba Fix usage of F16C intrinsics in AVX code (#563) slaren 2023-03-28 16:26:55 +02:00
  • 68f43a13dc main.cpp fixes, refactoring (#571) anzz1 2023-03-28 17:09:55 +03:00
  • 7b8dbcb78b main.cpp fixes, refactoring (#571) anzz1 2023-03-28 17:09:55 +03:00
  • d7f5b1ac65 Add embedding example to Makefile (#540) RJ Adriaansen 2023-03-28 08:11:09 +02:00
  • 4b8efff0e3 Add embedding example to Makefile (#540) RJ Adriaansen 2023-03-28 08:11:09 +02:00
  • 63d2de599a Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542) Marco Matthies 2023-03-27 06:55:26 +02:00
  • 7e5395575a Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542) Marco Matthies 2023-03-27 06:55:26 +02:00
  • 2c6eed596e ci: add debug build to sanitizer build matrix (#527) Erik Scholz 2023-03-26 17:48:40 +02:00
  • 34c1072e49 ci: add debug build to sanitizer build matrix (#527) Erik Scholz 2023-03-26 17:48:40 +02:00
  • 180198d957 Fix undefined variables in debug build, remove unused variables (#531) Stephan Walter 2023-03-26 15:34:02 +00:00
  • 939ad2d3a5 Fix undefined variables in debug build, remove unused variables (#531) Stephan Walter 2023-03-26 15:34:02 +00:00
  • 47fc0b82b4 Add support for linux/arm64 platform during Docker Builds (#514) Juan Calderon-Perez 2023-03-26 10:48:42 -04:00
  • 8c2ec5e21d Add support for linux/arm64 platform during Docker Builds (#514) Juan Calderon-Perez 2023-03-26 10:48:42 -04:00
  • 3b8b2c584a Update README and comments for standalone perplexity tool (#525) Stephan Walter 2023-03-26 13:14:01 +00:00
  • b391579db9 Update README and comments for standalone perplexity tool (#525) Stephan Walter 2023-03-26 13:14:01 +00:00
  • a990294c27 [main] fix infinite generation (-n == -1) (#523) anzz1 2023-03-26 16:06:10 +03:00
  • 7a87d31f4f [main] fix infinite generation (-n == -1) (#523) anzz1 2023-03-26 16:06:10 +03:00
  • 3600f1d140 Add logo to README.md Georgi Gerganov 2023-03-26 10:20:49 +03:00
  • 348d6926ee Add logo to README.md Georgi Gerganov 2023-03-26 10:20:49 +03:00
  • 85e558b4ad Exit from interactive mode if input stream is bad (#491) Harald Fernengel 2023-03-26 07:25:46 +02:00
  • 33e35b8fe8 Exit from interactive mode if input stream is bad (#491) Harald Fernengel 2023-03-26 07:25:46 +02:00
  • 5c63c02491 CI: Run other sanitizer builds even if one fails (#511) anzz1 2023-03-26 00:13:28 +02:00
  • 19726169b3 CI: Run other sanitizer builds even if one fails (#511) anzz1 2023-03-26 00:13:28 +02:00
  • 9c2b80f69b Clarify console output in convert-pth-to-ggml.py (#512) jp-x-g 2023-03-25 14:53:55 -07:00
  • f732695cd5 Clarify console output in convert-pth-to-ggml.py (#512) jp-x-g 2023-03-25 14:53:55 -07:00
  • 1ea6448129 CMake / CI additions (#497) anzz1 2023-03-25 23:38:11 +02:00
  • 2f7bf7dd7c CMake / CI additions (#497) anzz1 2023-03-25 23:38:11 +02:00
  • f8eb92869e (Windows) Set console to UTF-8 on init (#420) anzz1 2023-03-25 22:29:22 +02:00
  • 34ab526843 (Windows) Set console to UTF-8 on init (#420) anzz1 2023-03-25 22:29:22 +02:00
  • 2e01c018d2 Fix colors enabling on WIN32 Georgi Gerganov 2023-03-25 21:53:39 +02:00
  • c2b25b6912 Fix colors enabling on WIN32 Georgi Gerganov 2023-03-25 21:53:39 +02:00
  • 9fe0e95688 If n_predict == -1, generate forever Georgi Gerganov 2023-03-25 21:51:41 +02:00
  • 79b2b266db If n_predict == -1, generate forever Georgi Gerganov 2023-03-25 21:51:41 +02:00
  • 310d5d09a3 Infinite generation via context swapping (#71) Georgi Gerganov 2023-03-25 21:36:22 +02:00
  • e2d490dafd Infinite generation via context swapping (#71) Georgi Gerganov 2023-03-25 21:36:22 +02:00
  • 3468a153ba Cleanup STL headers + fix embedding examples + minor stuff Georgi Gerganov 2023-03-25 20:51:14 +02:00
  • 03f7e33560 Cleanup STL headers + fix embedding examples + minor stuff Georgi Gerganov 2023-03-25 20:51:14 +02:00
  • 9d678e17dc Move chat scripts into "./examples" Georgi Gerganov 2023-03-25 20:36:52 +02:00
  • 55ad42af84 Move chat scripts into "./examples" Georgi Gerganov 2023-03-25 20:36:52 +02:00
  • 4b720d5b92 Add AVX2 implementation of dequantize_row_q4_1 (#505) slaren 2023-03-25 19:31:48 +01:00
  • 459e93cce0 Add AVX2 implementation of dequantize_row_q4_1 (#505) slaren 2023-03-25 19:31:48 +01:00
  • 84db7c0b8f Overhaul the examples structure Georgi Gerganov 2023-03-25 20:26:40 +02:00
  • a316a425d0 Overhaul the examples structure Georgi Gerganov 2023-03-25 20:26:40 +02:00
  • 56e7297bbd Retire the ggml_mul_mat() branch for transposed src0 (#500) Georgi Gerganov 2023-03-25 19:47:21 +02:00
  • ecbe466a36 Retire the ggml_mul_mat() branch for transposed src0 (#500) Georgi Gerganov 2023-03-25 19:47:21 +02:00
  • d2336726ee Disable prompt verbosity by default and add option to enable (#480) Georgi Gerganov 2023-03-25 17:16:50 +02:00