Commit Graph

  • 64dbee2c5e CUDA: fix scratch malloced on non-main device (#3220) Johannes Gäßler 2023-09-17 14:16:22 +02:00
  • 36be4e8ee8 Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215) IsaacDynamo 2023-09-16 19:35:25 +02:00
  • aa4418d795 Enable build with CUDA 11.0 (make) (#3132) Vlad 2023-09-16 17:55:43 +03:00
  • 39897d794c Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 (#3170) goerch 2023-09-16 13:41:33 +02:00
  • 481ac7803f examples : add compiler version and target to build info (#2998) Cebtenzzre 2023-09-15 16:59:49 -04:00
  • 4e89732b50 check C++ code with -Wmissing-declarations (#3184) Cebtenzzre 2023-09-15 15:38:27 -04:00
  • 217da58978 fix build numbers by setting fetch-depth=0 (#3197) Cebtenzzre 2023-09-15 15:18:15 -04:00
  • 7d434a09c5 llama : add support for StarCoder model architectures (#3187) Meng Zhang 2023-09-16 03:02:13 +08:00
  • cf213d4e5b common : do not use GNU zero-length __VA_ARGS__ extension (#3195) Cebtenzzre 2023-09-15 14:02:01 -04:00
  • 303a4f7baa metal : fix bug in soft_max kernels (out-of-bounds access) (#3194) Georgi Gerganov 2023-09-15 20:17:24 +03:00
  • a78476e865 convert : make ftype optional in simple scripts (#3185) Cebtenzzre 2023-09-15 12:29:02 -04:00
  • a76ba4384c sync : ggml (Metal F32 support + reduce ggml-alloc size) (#3192) Georgi Gerganov 2023-09-15 19:06:03 +03:00
  • d840da1bcd cmake : fix building shared libs for clang (rocm) on windows (#3176) Engininja2 2023-09-15 06:24:30 -06:00
  • 45e8e203c0 flake : use pkg-config instead of pkgconfig (#3188) Evgeny Kurnevsky 2023-09-15 10:10:22 +02:00
  • 40a66e4cbd metal : relax conditions on fast matrix multiplication kernel (#3168) Georgi Gerganov 2023-09-15 11:09:24 +03:00
  • 61681fa802 cmake : fix llama.h location when built outside of root directory (#3179) Andrei 2023-09-15 04:07:40 -04:00
  • d97a2124e4 ci : Cloud-V for RISC-V builds (#3160) Ali Tariq 2023-09-15 13:06:56 +05:00
  • 0d44b199f5 llama : remove mtest (#3177) Roland 2023-09-15 03:28:45 -04:00
  • 2e7c1af3c6 llama : make quantize example up to 2.7x faster (#3115) Cebtenzzre 2023-09-14 21:09:53 -04:00
  • 144cf127a8 flake : allow $out/include to already exist (#3175) jneem 2023-09-14 13:54:47 -05:00
  • 7940ea477b cmake : compile ggml-rocm with -fpic when building shared library (#3158) Andrei 2023-09-14 13:38:16 -04:00
  • 0ca44fb618 flake : include llama.h in nix output (#3159) Asbjørn Olling 2023-09-14 19:25:00 +02:00
  • a52b504b10 make : fix clang++ detection, move some definitions to CPPFLAGS (#3155) Cebtenzzre 2023-09-14 13:22:47 -04:00
  • d2b63333a9 CI: add FreeBSD & simplify CUDA windows (#3053) Alon 2023-09-14 20:21:25 +03:00
  • f84a2cab35 falcon : use stated vocab size (#2914) akawrykow 2023-09-14 10:19:42 -07:00
  • deaffb425b cmake : add relocatable Llama package (#2960) bandoti 2023-09-14 14:04:40 -03:00
  • 61cead9a5b docker : add gpu image CI builds (#3103) dylan 2023-09-14 09:47:00 -07:00
  • e7990974a7 gguf-py : support identity operation in TensorNameMap (#3095) Kerfuffle 2023-09-14 10:32:26 -06:00
  • dc3fa2a06b feature : support Baichuan serial models (#3009) jameswu2014 2023-09-15 00:32:10 +08:00
  • a3482ee1f6 speculative : add heuristic algorithm (#3006) Leng Yue 2023-09-14 09:14:44 -07:00
  • e58d136bfb whisper : tokenizer fix + re-enable tokenizer test for LLaMa (#3096) goerch 2023-09-13 15:19:44 +02:00
  • fefd1c924f cmake : add a compiler flag check for FP16 format (#3086) Tristan Ross 2023-09-13 06:08:52 -07:00
  • a1999004f0 CUDA: mul_mat_q RDNA2 tunings (#2910) Johannes Gäßler 2023-09-13 11:20:24 +02:00
  • c2c04893d8 speculative: add --n-gpu-layers-draft option (#3063) FK 2023-09-13 08:50:46 +02:00
  • c87a266056 arm64 support for windows (#3007) Eric Sommerlade 2023-09-13 02:54:20 +01:00
  • 99793f90fe CUDA: fix LoRAs (#3130) Johannes Gäßler 2023-09-13 00:15:33 +02:00
  • 54ef27dff3 CUDA: fix mul_mat_q not used for output tensor (#3127) Johannes Gäßler 2023-09-11 22:58:41 +02:00
  • f4cfe0248a CUDA: lower GPU latency + fix Windows performance (#3110) Johannes Gäßler 2023-09-11 19:55:51 +02:00
  • e6d674b4cd cmake : support build for iOS/tvOS (#3116) Jhen-Jie Hong 2023-09-11 19:49:06 +08:00
  • 36e69da6ba CUDA: add device number to error messages (#3112) Johannes Gäßler 2023-09-11 13:00:24 +02:00
  • ea9d12efb9 metal : PP speedup (#3084) Kawrakow 2023-09-11 09:30:11 +02:00
  • dc63829b88 convert: remove most of the n_mult usage in convert.py (#3098) Erik Scholz 2023-09-10 17:06:53 +02:00
  • 1b0b14195c metal : support for Swift (#3078) kchro3 2023-09-09 02:12:10 -07:00
  • b49ecb484c metal : support build for iOS/tvOS (#3089) Jhen-Jie Hong 2023-09-09 16:46:04 +08:00
  • 0741785341 flake : add train-text-from-scratch to flake.nix (#3042) takov751 2023-09-08 17:06:26 +01:00
  • 8db00f111b readme : fix typo (#3043) Ikko Eltociear Ashimine 2023-09-09 01:04:32 +09:00
  • d6f89750ed metal : Q3_K speedup (#2995) Kawrakow 2023-09-08 18:01:04 +02:00
  • 1d8cda5518 examples : make n_ctx warning work again (#3066) Cebtenzzre 2023-09-08 11:43:35 -04:00
  • e0997d46fe readme : update hot tpoics Georgi Gerganov 2023-09-08 18:18:04 +03:00
  • ba1b0a362d sync : ggml (CUDA GLM RoPE + POSIX) (#3082) Georgi Gerganov 2023-09-08 17:58:07 +03:00