Commit Graph

  • 193c737cb1 build : do not use _GNU_SOURCE gratuitously (#2035) Przemysław Pawełczyk 2023-09-08 14:09:21 +02:00
  • cb6c44c5e0 build : do not use _GNU_SOURCE gratuitously (#2035) Przemysław Pawełczyk 2023-09-08 14:09:21 +02:00
  • 3921042df8 docker : add git to full-cuda.Dockerfile main-cuda.Dockerfile (#3044) hongbo.mo 2023-09-08 18:57:55 +08:00
  • a21baeb122 docker : add git to full-cuda.Dockerfile main-cuda.Dockerfile (#3044) hongbo.mo 2023-09-08 18:57:55 +08:00
  • b897d9e7a6 Update deprecated GGML TheBloke links to GGUF (#3079) Yui 2023-09-08 12:32:55 +02:00
  • 6ff712a6d1 Update deprecated GGML TheBloke links to GGUF (#3079) Yui 2023-09-08 12:32:55 +02:00
  • 68b992e089 ggml-alloc : correctly check mmap return value for errors (#3075) slaren 2023-09-08 04:04:56 +02:00
  • ebc96086af ggml-alloc : correctly check mmap return value for errors (#3075) slaren 2023-09-08 04:04:56 +02:00
  • d2db7d97c6 enable CPU HBM (#2603) Kunshang Ji 2023-09-08 09:46:56 +08:00
  • 7f412dab9c enable CPU HBM (#2603) Kunshang Ji 2023-09-08 09:46:56 +08:00
  • 923352baf1 convert : fix F32 ftype not being saved (#3048) Cebtenzzre 2023-09-07 14:27:42 -04:00
  • 6336d834ec convert : fix F32 ftype not being saved (#3048) Cebtenzzre 2023-09-07 14:27:42 -04:00
  • bd7504dd6e fix some warnings from gcc and clang-tidy (#3038) Cebtenzzre 2023-09-07 13:22:29 -04:00
  • 00d62adb79 fix some warnings from gcc and clang-tidy (#3038) Cebtenzzre 2023-09-07 13:22:29 -04:00
  • b73063bcda make : improve test target (#3031) Cebtenzzre 2023-09-07 10:15:01 -04:00
  • 4fa2cc1750 make : improve test target (#3031) Cebtenzzre 2023-09-07 10:15:01 -04:00
  • a6bc690a46 make : fix CPPFLAGS (#3035) Cebtenzzre 2023-09-07 10:13:50 -04:00
  • 5ffab089a5 make : fix CPPFLAGS (#3035) Cebtenzzre 2023-09-07 10:13:50 -04:00
  • 22b1b7f7a7 llama-bench : use two tokens in the warmup run for prompt evals (#3059) slaren 2023-09-07 15:52:34 +02:00
  • 15b67a66c2 llama-bench : use two tokens in the warmup run for prompt evals (#3059) slaren 2023-09-07 15:52:34 +02:00
  • a53491815d metal : parallel RoPE on Metal (#3024) Kawrakow 2023-09-07 15:45:01 +02:00
  • be8c9c245b metal : parallel RoPE on Metal (#3024) Kawrakow 2023-09-07 15:45:01 +02:00
  • 557ae758dd metal : correct fix of kernel_norm (#3060) Kawrakow 2023-09-07 15:42:42 +02:00
  • be6beeb8d7 metal : correct fix of kernel_norm (#3060) Kawrakow 2023-09-07 15:42:42 +02:00
  • f8da188258 metal : fix kernel_norm (fixes Falcon on Metal) (#3057) Georgi Gerganov 2023-09-07 15:49:09 +03:00
  • c4f496648c metal : fix kernel_norm (fixes Falcon on Metal) (#3057) Georgi Gerganov 2023-09-07 15:49:09 +03:00
  • ce6bb57378 ggml : posixify madvise and pagesize (#3037) Przemysław Pawełczyk 2023-09-07 10:15:06 +02:00
  • fec2fb19e4 ggml : posixify madvise and pagesize (#3037) Przemysław Pawełczyk 2023-09-07 10:15:06 +02:00
  • bf0b4c808d k-quants : fix zero-weight guard in Q6_K (ref #3040) Georgi Gerganov 2023-09-06 12:40:57 +03:00
  • 178b1850eb k-quants : fix zero-weight guard in Q6_K (ref #3040) Georgi Gerganov 2023-09-06 12:40:57 +03:00
  • be4f496d09 convert-llama-ggml-to-gguf: Try to handle files older than GGJTv3 (#3023) Kerfuffle 2023-09-06 02:49:11 -06:00
  • ea2c85d5d2 convert-llama-ggml-to-gguf: Try to handle files older than GGJTv3 (#3023) Kerfuffle 2023-09-06 02:49:11 -06:00
  • a76a94fe31 build : add LLAMA_METAL_NDEBUG flag (#3033) Cebtenzzre 2023-09-05 18:21:10 -04:00
  • 9912b9efc8 build : add LLAMA_METAL_NDEBUG flag (#3033) Cebtenzzre 2023-09-05 18:21:10 -04:00
  • 2ebfd0aa22 make : use new flag variables for recent changes (#3019) Cebtenzzre 2023-09-05 15:12:00 -04:00
  • 9e2023156e make : use new flag variables for recent changes (#3019) Cebtenzzre 2023-09-05 15:12:00 -04:00
  • ed5a405c22 examples : replace fprintf to stdout with printf (#3017) Cebtenzzre 2023-09-05 15:10:27 -04:00
  • de2fe892af examples : replace fprintf to stdout with printf (#3017) Cebtenzzre 2023-09-05 15:10:27 -04:00
  • c9e735f4bf convert: fix convert.py not working with int filename_stem (#3028) Erik Scholz 2023-09-05 19:41:00 +02:00
  • c9c3220c48 convert: fix convert.py not working with int filename_stem (#3028) Erik Scholz 2023-09-05 19:41:00 +02:00
  • 4f7048458f Guard against all weights in a super-block being zero (#3010) Kawrakow 2023-09-05 09:55:33 +02:00
  • d59bd97065 Guard against all weights in a super-block being zero (#3010) Kawrakow 2023-09-05 09:55:33 +02:00
  • 365578f31e llama : update logic for number of threads when using BLAS Georgi Gerganov 2023-09-05 10:46:39 +03:00
  • 35938ee3b0 llama : update logic for number of threads when using BLAS Georgi Gerganov 2023-09-05 10:46:39 +03:00
  • 9615d0c6b4 speculative : add grammar support (#2991) Georgi Gerganov 2023-09-05 08:46:17 +03:00
  • 921772104b speculative : add grammar support (#2991) Georgi Gerganov 2023-09-05 08:46:17 +03:00
  • 5ce628ba1c py : minor Georgi Gerganov 2023-09-04 22:50:50 +03:00
  • 2ba85c8609 py : minor Georgi Gerganov 2023-09-04 22:50:50 +03:00
  • 8e49675a7b build : on Mac OS enable Metal by default (#2901) Georgi Gerganov 2023-09-04 22:26:24 +03:00
  • e36ecdccc8 build : on Mac OS enable Metal by default (#2901) Georgi Gerganov 2023-09-04 22:26:24 +03:00
  • 8d85c7d12c ggml-opencl : store GPU buffer in ggml_tensor::extra (#2994) slaren 2023-09-04 14:59:52 +02:00
  • bd33e5ab92 ggml-opencl : store GPU buffer in ggml_tensor::extra (#2994) slaren 2023-09-04 14:59:52 +02:00
  • 24d2622ba2 llama-bench : make cpp file non-executable (#2999) Cebtenzzre 2023-09-04 06:40:18 -04:00
  • 3103568144 llama-bench : make cpp file non-executable (#2999) Cebtenzzre 2023-09-04 06:40:18 -04:00
  • 736e898675 make : add speculative example (#3003) Leng Yue 2023-09-04 03:39:57 -07:00
  • 5b8530d88c make : add speculative example (#3003) Leng Yue 2023-09-04 03:39:57 -07:00
  • d48f5c09df server : add a subtle loading animation to the edit box (#2466) Aarni Koskela 2023-09-04 10:28:55 +02:00
  • e4386f417f server : add a subtle loading animation to the edit box (#2466) Aarni Koskela 2023-09-04 10:28:55 +02:00
  • b0cd5d83b3 2x faster (rms) norm cuda kernels (3.7% e2e improvement) (#2985) Jiahao Li 2023-09-04 14:53:30 +08:00
  • 35195689cd 2x faster (rms) norm cuda kernels (3.7% e2e improvement) (#2985) Jiahao Li 2023-09-04 14:53:30 +08:00
  • 822ad0f739 ggml-alloc : use virtual memory for measurement (#2973) slaren 2023-09-03 20:34:09 +02:00
  • cf9b08485c ggml-alloc : use virtual memory for measurement (#2973) slaren 2023-09-03 20:34:09 +02:00
  • a88f9a8ca8 speculative : PoC for speeding-up inference via speculative sampling (#2926) Georgi Gerganov 2023-09-03 15:12:08 +03:00
  • 47068e5170 speculative : PoC for speeding-up inference via speculative sampling (#2926) Georgi Gerganov 2023-09-03 15:12:08 +03:00
  • 9fe76e79e3 perplexity : fix ETA by warming up the model with an empty run Georgi Gerganov 2023-09-03 13:42:56 +03:00
  • 8f429fa511 perplexity : fix ETA by warming up the model with an empty run Georgi Gerganov 2023-09-03 13:42:56 +03:00
  • ae8e8ebe53 gguf(python): Fix special vocab handling when id < 0 (#2984) Kerfuffle 2023-09-03 04:38:43 -06:00
  • 6519e9c99c gguf(python): Fix special vocab handling when id < 0 (#2984) Kerfuffle 2023-09-03 04:38:43 -06:00
  • 811285a543 metal : restore 363f0bf and fix reduce in F16_F32 kernels (#2986) Georgi Gerganov 2023-09-03 13:23:33 +03:00
  • b7f2aa9e51 metal : restore 363f0bf and fix reduce in F16_F32 kernels (#2986) Georgi Gerganov 2023-09-03 13:23:33 +03:00
  • 928ab515d1 cov : disable comment in PRs (#2989) Alon 2023-09-03 13:19:01 +03:00
  • 73a12a6344 cov : disable comment in PRs (#2989) Alon 2023-09-03 13:19:01 +03:00
  • 7fe3095287 llama : fix bpe tokenize from byte (#2889) opparco 2023-09-03 19:18:09 +09:00
  • 3730134776 llama : fix bpe tokenize from byte (#2889) opparco 2023-09-03 19:18:09 +09:00
  • 89349ceb7b metal : revert 6af0bab until we fix it Georgi Gerganov 2023-09-03 12:40:56 +03:00
  • d9151e6f57 metal : revert 6af0bab until we fix it Georgi Gerganov 2023-09-03 12:40:56 +03:00
  • 53afc99c41 cov : add Code Coverage and codecov.io integration (#2928) Alon 2023-09-03 11:48:49 +03:00
  • afc43d5f82 cov : add Code Coverage and codecov.io integration (#2928) Alon 2023-09-03 11:48:49 +03:00
  • b3912e82f1 opencl : fix a bug in ggml_cl_pool_malloc() for ggml_cl_mul_mat_f32() (#2955) Wentai Zhang 2023-09-03 16:46:44 +08:00
  • 6460f758db opencl : fix a bug in ggml_cl_pool_malloc() for ggml_cl_mul_mat_f32() (#2955) Wentai Zhang 2023-09-03 16:46:44 +08:00
  • c937b6d718 metal : more optimizations (#2959) Kawrakow 2023-09-03 11:06:22 +03:00
  • ca82cf7bac metal : more optimizations (#2959) Kawrakow 2023-09-03 11:06:22 +03:00
  • c2873512e6 swift : add support for k-quants (#2983) kchro3 2023-09-02 23:21:05 -07:00
  • 6a31a3bd98 swift : add support for k-quants (#2983) kchro3 2023-09-02 23:21:05 -07:00
  • ba445e659c convert.py : BPE fixes (#2938) Kerfuffle 2023-09-02 23:52:13 -06:00
  • cff7b0bf07 convert.py : BPE fixes (#2938) Kerfuffle 2023-09-02 23:52:13 -06:00
  • a8b85ea614 docs : add catai to README.md (#2967) Ido S 2023-09-03 08:50:51 +03:00
  • 340af42f09 docs : add catai to README.md (#2967) Ido S 2023-09-03 08:50:51 +03:00
  • b41680f397 examples : fix gpt-neox (#2943) momonga 2023-09-03 14:36:28 +09:00
  • c42f0ec6b3 examples : fix gpt-neox (#2943) momonga 2023-09-03 14:36:28 +09:00
  • f96a0722fa swift : add missing c file to Package.swift (#2978) kchro3 2023-09-02 22:27:25 -07:00
  • 2753415afd swift : add missing c file to Package.swift (#2978) kchro3 2023-09-02 22:27:25 -07:00
  • af0127b31a make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS (#2886) Cebtenzzre 2023-09-03 01:26:59 -04:00
  • bc054af97a make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS (#2886) Cebtenzzre 2023-09-03 01:26:59 -04:00
  • 9f664f66a4 logging: Fix creating empty file even when disabled (#2966) Kerfuffle 2023-09-02 11:53:55 -06:00
  • 3358c381f6 logging: Fix creating empty file even when disabled (#2966) Kerfuffle 2023-09-02 11:53:55 -06:00
  • 626da973c4 readme : update clblast instructions (#2903) bandoti 2023-09-02 09:53:18 -03:00
  • 52315a4216 readme : update clblast instructions (#2903) bandoti 2023-09-02 09:53:18 -03:00
  • 837df7e8d2 metal : show all Metal device instances in the system (#2952) Karsten Weiss 2023-09-02 14:29:09 +02:00
  • 8b56b4f2c3 metal : show all Metal device instances in the system (#2952) Karsten Weiss 2023-09-02 14:29:09 +02:00