Commit Graph

  • 502a400192 Disable prompt verbosity by default and add option to enable (#480) Georgi Gerganov 2023-03-25 17:16:50 +02:00
  • 432b98793c Add AVX2 implementation of dequantize_row_q4_0 (#467) slaren 2023-03-25 16:06:49 +01:00
  • 09aecbf628 Add AVX2 implementation of dequantize_row_q4_0 (#467) slaren 2023-03-25 16:06:49 +01:00
  • 9f8548b2d5 Don't interfere with BLAS for large prompts by running only 1 thread Georgi Gerganov 2023-03-25 17:03:10 +02:00
  • 4640eff23d Don't interfere with BLAS for large prompts by running only 1 thread Georgi Gerganov 2023-03-25 17:03:10 +02:00
  • f6a2b1fc20 Add longer DAN prompt for testing big batch numbers Georgi Gerganov 2023-03-25 16:47:59 +02:00
  • ab77d76312 Add longer DAN prompt for testing big batch numbers Georgi Gerganov 2023-03-25 16:47:59 +02:00
  • e66804f2d7 Add timings for the prompt evaluation (#478) slaren 2023-03-25 15:34:23 +01:00
  • 29b7baab67 Add timings for the prompt evaluation (#478) slaren 2023-03-25 15:34:23 +01:00
  • 1c1459f073 Remove obsolete information from README Georgi Gerganov 2023-03-25 16:30:32 +02:00
  • 4a7129acd2 Remove obsolete information from README Georgi Gerganov 2023-03-25 16:30:32 +02:00
  • 39ab880ccd Remove obsolete assert and fix compiler warning Georgi Gerganov 2023-03-25 16:22:05 +02:00
  • 6b6dbc8910 Remove obsolete assert and fix compiler warning Georgi Gerganov 2023-03-25 16:22:05 +02:00
  • 0bbf9a17e7 Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS Georgi Gerganov 2023-03-25 16:09:54 +02:00
  • 2a2e63ce05 Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS Georgi Gerganov 2023-03-25 16:09:54 +02:00
  • f60b207880 bounds checking for input prefix (#492) anzz1 2023-03-25 14:42:09 +02:00
  • e899bf54b2 bounds checking for input prefix (#492) anzz1 2023-03-25 14:42:09 +02:00
  • e0522e5dd3 feat: '--in-prefix STRING' option (#426) anzz1 2023-03-25 14:03:19 +02:00
  • fbd4d38c64 feat: '--in-prefix STRING' option (#426) anzz1 2023-03-25 14:03:19 +02:00
  • 3261abc446 Add support for file load progress reporting callbacks (#434) Jed Fox 2023-03-25 01:26:28 -04:00
  • 58e6c9f36f Add support for file load progress reporting callbacks (#434) Jed Fox 2023-03-25 01:26:28 -04:00
  • 27d29a069f Add missing struct annotation (#483) Doomsdayrs 2023-03-25 01:21:24 -04:00
  • 36d07532ef Add missing struct annotation (#483) Doomsdayrs 2023-03-25 01:21:24 -04:00
  • 9ba873f48c Fix crash for 65B model with pre-allocated memory (#485) Chris Kuehl 2023-03-24 23:38:14 -05:00
  • 6f1ee4b640 Fix crash for 65B model with pre-allocated memory (#485) Chris Kuehl 2023-03-24 23:38:14 -05:00
  • 0965918677 Disable BLAS altogether - the bug is not just for quantized mat mul Georgi Gerganov 2023-03-24 23:47:06 +02:00
  • 8520fc310e Disable BLAS altogether - the bug is not just for quantized mat mul Georgi Gerganov 2023-03-24 23:47:06 +02:00
  • 76e580d933 Disable BLAS branch in mul_mat - seems there is a bug Georgi Gerganov 2023-03-24 23:39:17 +02:00
  • b3f460e941 Disable BLAS branch in mul_mat - seems there is a bug Georgi Gerganov 2023-03-24 23:39:17 +02:00
  • ba186f7f64 Immediately start processing the prompt before user input has been provided (#476) Georgi Gerganov 2023-03-24 23:17:58 +02:00
  • 04c6f5ed6f Immediately start processing the prompt before user input has been provided (#476) Georgi Gerganov 2023-03-24 23:17:58 +02:00
  • 92dc17b275 Reduce memory usage and allocate enough memory for largest context (#473) Georgi Gerganov 2023-03-24 23:17:37 +02:00
  • 7a9b6c3a8b Reduce memory usage and allocate enough memory for largest context (#473) Georgi Gerganov 2023-03-24 23:17:37 +02:00
  • a1a48cfccb Temporary bump the memory buffer size - hopefully fix issues from 483bab2e Georgi Gerganov 2023-03-24 18:23:56 +02:00
  • 31572d9665 Temporary bump the memory buffer size - hopefully fix issues from 483bab2e Georgi Gerganov 2023-03-24 18:23:56 +02:00
  • ccf5a1b08d Update README.md (#444) Gary Mulder 2023-03-24 15:23:09 +00:00
  • f4f5362edb Update README.md (#444) Gary Mulder 2023-03-24 15:23:09 +00:00
  • 7743aa368c fix instruct mode (#445) rabidcopy 2023-03-24 10:22:39 -05:00
  • 863f65e2e3 fix instruct mode (#445) rabidcopy 2023-03-24 10:22:39 -05:00
  • 581994aef0 Properly free llama_context on failure Georgi Gerganov 2023-03-24 17:21:01 +02:00
  • afd220d9c6 Properly free llama_context on failure Georgi Gerganov 2023-03-24 17:21:01 +02:00
  • 5571dc71c4 additional optimizations for POWER9 (#454) Cameron Kaiser 2023-03-24 08:19:26 -07:00
  • 481044d50c additional optimizations for POWER9 (#454) Cameron Kaiser 2023-03-24 08:19:26 -07:00
  • d86b7f08ad Support calling mlock() on loaded model data on Linux and macOS (#453) comex 2023-03-24 08:19:05 -07:00
  • 563cdc391d Support calling mlock() on loaded model data on Linux and macOS (#453) comex 2023-03-24 08:19:05 -07:00
  • 605a3aaef3 Add embedding mode with arg flag. Currently working (#282) Luciano 2023-03-24 08:05:13 -07:00
  • 8d4a855c24 Add embedding mode with arg flag. Currently working (#282) Luciano 2023-03-24 08:05:13 -07:00
  • 1f369c619d Add link to Roadmap discussion Georgi Gerganov 2023-03-24 09:13:35 +02:00
  • b6b268d441 Add link to Roadmap discussion Georgi Gerganov 2023-03-24 09:13:35 +02:00
  • 681cbacbe1 Revert "Fix memory allocation issues and seg faults" Georgi Gerganov 2023-03-24 06:22:28 +02:00
  • 3cd8dde0d1 Revert "Fix memory allocation issues and seg faults" Georgi Gerganov 2023-03-24 06:22:28 +02:00
  • 3d8185edc9 Fix memory allocation issues and seg faults Georgi Gerganov 2023-03-24 00:11:53 +02:00
  • 4870e455b3 Fix memory allocation issues and seg faults Georgi Gerganov 2023-03-24 00:11:53 +02:00
  • 370c9ecb96 Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439) Georgi Gerganov 2023-03-23 23:22:01 +02:00
  • 483bab2e3d Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439) Georgi Gerganov 2023-03-23 23:22:01 +02:00
  • d89d84ac0f Fix quantize script not finding models in parent directory (#428) Jed Fox 2023-03-23 16:42:52 -04:00
  • 404e1da38e Fix quantize script not finding models in parent directory (#428) Jed Fox 2023-03-23 16:42:52 -04:00
  • 20c3c59bd4 Remove obsolete command from Docker script Georgi Gerganov 2023-03-23 22:39:44 +02:00
  • 4cc053b6d5 Remove obsolete command from Docker script Georgi Gerganov 2023-03-23 22:39:44 +02:00
  • 0e661616e2 Obsolete Georgi Gerganov 2023-03-23 22:32:02 +02:00
  • 0ba5a3a9a5 Obsolete Georgi Gerganov 2023-03-23 22:32:02 +02:00
  • 8faa6c7718 Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode (#333) rabidcopy 2023-03-23 15:22:47 -05:00
  • 2e17dfd80a Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode (#333) rabidcopy 2023-03-23 15:22:47 -05:00
  • 9c5d5c52ce Fix GPTQ converter (#423) Timmy Knight 2023-03-23 10:18:13 -10:00
  • 20a1a4e09c Fix GPTQ converter (#423) Timmy Knight 2023-03-23 10:18:13 -10:00
  • fd1312648c Generate library with CMake (#430) nusu-github 2023-03-24 05:16:48 +09:00
  • ad072fc5ad Generate library with CMake (#430) nusu-github 2023-03-24 05:16:48 +09:00
  • 662adbfdb6 Command line args bounds checking (#424) anzz1 2023-03-23 19:54:28 +02:00
  • ea10d3ded2 Command line args bounds checking (#424) anzz1 2023-03-23 19:54:28 +02:00
  • bf0302d463 Fix Nix build Ben Siraphob 2023-03-22 00:37:02 -05:00
  • a18c19259a Fix Nix build Ben Siraphob 2023-03-22 00:37:02 -05:00
  • 3ebb023fb2 Revert "Delete SHA256SUMS for now" (#429) Stephan Walter 2023-03-23 14:15:48 +00:00
  • a50e39c6fe Revert "Delete SHA256SUMS for now" (#429) Stephan Walter 2023-03-23 14:15:48 +00:00
  • 455fffe547 Fix Makefile echo escape codes (by removing them). (#418) Kerfuffle 2023-03-23 05:41:32 -06:00
  • a140219e81 Fix Makefile echo escape codes (by removing them). (#418) Kerfuffle 2023-03-23 05:41:32 -06:00
  • e689dccbad Move model section from issue template to README.md (#421) Gary Mulder 2023-03-23 11:30:40 +00:00
  • 8a3e5ef801 Move model section from issue template to README.md (#421) Gary Mulder 2023-03-23 11:30:40 +00:00
  • 0c2b820e64 Delete SHA256SUMS for now (#416) anzz1 2023-03-23 12:26:19 +02:00
  • 8eea5ae0e5 Delete SHA256SUMS for now (#416) anzz1 2023-03-23 12:26:19 +02:00
  • a1b7fa8c60 Adjust repetition penalty .. Georgi Gerganov 2023-03-23 10:46:58 +02:00
  • 93208cfb92 Adjust repetition penalty .. Georgi Gerganov 2023-03-23 10:46:58 +02:00
  • 1d31d737d8 Add link to recent podcast about whisper.cpp and llama.cpp Georgi Gerganov 2023-03-23 09:48:51 +02:00
  • 03ace14cfd Add link to recent podcast about whisper.cpp and llama.cpp Georgi Gerganov 2023-03-23 09:48:51 +02:00
  • 6eddca75b1 CI: CMake: Separate build and test steps (#376) anzz1 2023-03-23 04:20:34 +02:00
  • e4412b45e3 CI: CMake: Separate build and test steps (#376) anzz1 2023-03-23 04:20:34 +02:00
  • 1b4b61fb60 Fix instruct mode broken by PR #354 (#409) tjohnman 2023-03-23 01:30:23 +01:00
  • f7dc43bc0d Fix instruct mode broken by PR #354 (#409) tjohnman 2023-03-23 01:30:23 +01:00
  • ffdeece7c2 Update issue template so people will use it (#404) Gary Mulder 2023-03-22 19:06:18 +00:00
  • ee8a788786 Update issue template so people will use it (#404) Gary Mulder 2023-03-22 19:06:18 +00:00
  • 43a021a260 Deduplicate q4 quantization functions (#383) Stephan Walter 2023-03-22 17:29:06 +00:00
  • 69c92298a9 Deduplicate q4 quantization functions (#383) Stephan Walter 2023-03-22 17:29:06 +00:00
  • f520f9be86 fix: add POSIX functionality for Linux compilation (#51) Valentyn Bezshapkin 2023-03-22 18:20:25 +01:00
  • 97940520e8 fix: add POSIX functionality for Linux compilation (#51) Valentyn Bezshapkin 2023-03-22 18:20:25 +01:00
  • 815b60c690 Don't force immediate interactive without -i (#354) tjohnman 2023-03-22 18:16:35 +01:00
  • 305ba6f0e6 Don't force immediate interactive without -i (#354) tjohnman 2023-03-22 18:16:35 +01:00
  • bc091a84e5 cmake: make llama an actual library (#392) Erik Scholz 2023-03-22 17:37:10 +01:00
  • 4122dffff9 cmake: make llama an actual library (#392) Erik Scholz 2023-03-22 17:37:10 +01:00
  • 48c8ad5bcf fix perplexity after c-api refactor (#390) Erik Scholz 2023-03-22 17:09:38 +01:00
  • 56e659a0b2 fix perplexity after c-api refactor (#390) Erik Scholz 2023-03-22 17:09:38 +01:00
  • 686427a35f Add details on perplexity to README.md (#395) Gary Linscott 2023-03-22 08:53:54 -07:00