Commit Graph

  • 16bc66d947 llama.cpp : split llama_context_params into model and context params (#3301) slaren 2023-09-28 21:42:38 +02:00
  • 0512d66670 ci : multithreaded builds (#3311) Eve 2023-09-28 19:31:04 +00:00
  • 0e76a8992c train : finetune LORA (#2632) xaedes 2023-09-28 20:40:11 +02:00
  • 2db94d98ed gguf : basic type checking in gguf_get_* (#3346) Cebtenzzre 2023-09-28 14:30:31 -04:00
  • ecf90b1a51 gguf : make token scores and types optional (#3347) Cebtenzzre 2023-09-28 14:30:15 -04:00
  • 2619109ad5 ci : disable freeBSD builds due to lack of VMs (#3381) Georgi Gerganov 2023-09-28 19:36:36 +03:00
  • ec893798b7 llama : custom attention mask + parallel decoding + no context swaps (#3228) Georgi Gerganov 2023-09-28 19:04:36 +03:00
  • 45855b3f1c docs : mark code as Bash (#3375) Kevin Ji 2023-09-28 09:11:32 -04:00
  • 4aea3b846e readme : add Mistral AI release 0.1 (#3362) Pierre Alexandre SCHEMBRI 2023-09-28 14:13:37 +02:00
  • da0400344b ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (#3370) slaren 2023-09-28 12:08:28 +02:00
  • e519621010 convert : remove bug in convert.py permute function (#3364) Zhang Peiyuan 2023-09-28 02:45:20 +08:00
  • ac43576124 make-ggml.py : compatibility with more models and GGUF (#3290) Richard Roberson 2023-09-27 10:25:12 -06:00
  • 20c7e1e804 gguf : fix a few general keys (#3341) Cebtenzzre 2023-09-27 12:18:07 -04:00
  • dc6897404e metal : reusing llama.cpp logging (#3152) Rickard Hallerbäck 2023-09-27 17:48:33 +02:00
  • 527e57cfd8 build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (#3342) Jag Chadha 2023-09-27 11:34:32 -04:00
  • ffe88a36a9 readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (#3340) BarfingLemurs 2023-09-27 11:30:36 -04:00
  • 99115f3fa6 cmake : fix build-info.h on MSVC (#3309) DAN™ 2023-09-25 18:45:33 -04:00
  • 1726f9626f docs: Fix typo CLBlast_DIR var. (#3330) 2f38b454 2023-09-26 02:24:52 +08:00
  • a98b1633d5 nix : add cuda, use a symlinked toolkit for cmake (#3202) Erik Scholz 2023-09-25 13:48:30 +02:00
  • c091cdfb24 llama-bench : add README (#3317) slaren 2023-09-23 21:48:24 +02:00
  • 51a7cf5c6e examples : fix RoPE defaults to match PR #3240 (#3315) Cebtenzzre 2023-09-23 05:28:50 -04:00
  • bedb92b603 scripts : use /usr/bin/env in shebang (#3313) Kevin Ji 2023-09-22 23:52:23 -04:00
  • bc9d3e3971 Update README.md (#3289) Lee Drake 2023-09-21 13:00:24 -06:00
  • 36b904e200 ggml-opencl.cpp: Make private functions static (#3300) shibe2 2023-09-21 22:10:26 +04:00
  • 324f3403d5 zig : fix for updated c lib (#3259) Edward Taylor 2023-09-21 21:08:20 +12:00
  • f56c418ab0 embedding : update README.md (#3224) yuiseki 2023-09-21 17:57:40 +09:00
  • 8185710a80 CUDA: use only 1 thread if fully offloaded (#2915) Johannes Gäßler 2023-09-21 10:43:53 +02:00
  • 7eb41179ed readme : update hot topics Georgi Gerganov 2023-09-20 20:48:22 +03:00
  • a5661d7e71 llama : allow gguf RoPE keys to be overridden with defaults (#3240) Cebtenzzre 2023-09-20 12:12:47 -04:00
  • 65c2c1c5ab benchmark-matmult : do not use integer abs() on a float (#3277) Cebtenzzre 2023-09-20 12:06:08 -04:00
  • 80834daecf flake : Restore default package's buildInputs (#3262) kang 2023-09-20 22:48:22 +09:00
  • a40f2b656f CI: FreeBSD fix (#3258) Alon 2023-09-20 15:06:36 +03:00
  • d119c04c15 examples : fix benchmark-matmult (#1554) Georgi Gerganov 2023-09-20 10:02:39 +03:00
  • 8781013ef6 make : restore build-info.h dependency for several targets (#3205) Cebtenzzre 2023-09-18 10:03:53 -04:00
  • 7ddf185537 ci : switch cudatoolkit install on windows to networked (#3236) Erik Scholz 2023-09-18 02:21:47 +02:00
  • ee66942d7e CUDA: fix peer access logic (#3231) Johannes Gäßler 2023-09-17 23:35:20 +02:00
  • 111163e246 CUDA: enable peer access between devices (#2470) Johannes Gäßler 2023-09-17 16:37:53 +02:00
  • 8b428c9bc8 llama.cpp : show model size and BPW on load (#3223) slaren 2023-09-17 14:33:28 +02:00
  • 578d8c8f5c CUDA: fix scratch malloced on non-main device (#3220) Johannes Gäßler 2023-09-17 14:16:22 +02:00
  • b541b4f0b1 Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215) IsaacDynamo 2023-09-16 19:35:25 +02:00
  • 5dbc2b3213 Enable build with CUDA 11.0 (make) (#3132) Vlad 2023-09-16 17:55:43 +03:00
  • b08e75baea Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 (#3170) goerch 2023-09-16 13:41:33 +02:00
  • e6616cf0db examples : add compiler version and target to build info (#2998) Cebtenzzre 2023-09-15 16:59:49 -04:00
  • 3aefaab9e5 check C++ code with -Wmissing-declarations (#3184) Cebtenzzre 2023-09-15 15:38:27 -04:00
  • 69eb67e282 fix build numbers by setting fetch-depth=0 (#3197) Cebtenzzre 2023-09-15 15:18:15 -04:00
  • 4fe09dfe66 llama : add support for StarCoder model architectures (#3187) Meng Zhang 2023-09-16 03:02:13 +08:00
  • 80291a1d02 common : do not use GNU zero-length __VA_ARGS__ extension (#3195) Cebtenzzre 2023-09-15 14:02:01 -04:00
  • c6f1491da0 metal : fix bug in soft_max kernels (out-of-bounds access) (#3194) Georgi Gerganov 2023-09-15 20:17:24 +03:00
  • e3d87a6c36 convert : make ftype optional in simple scripts (#3185) Cebtenzzre 2023-09-15 12:29:02 -04:00
  • 8c00b7a6ff sync : ggml (Metal F32 support + reduce ggml-alloc size) (#3192) Georgi Gerganov 2023-09-15 19:06:03 +03:00