Commit Graph

  • 356ea17e0f flake.nix: expose checks Someone Serge 2023-12-29 16:21:50 +00:00
  • a5c088d8c6 flake.nix: rocm not yet supported on aarch64, so hide the output Someone Serge 2023-12-26 23:34:40 +00:00
  • 1e3900ebac flake.nix: expose full scope in legacyPackages Someone Serge 2023-12-29 16:15:37 +00:00
  • e39106c055 ggml : add ggml_vdotq_s32 alias (#4715) Georgi Gerganov 2023-12-31 11:43:31 +02:00
  • 9fbda719de clip : refactor + bug fixes (#4696) Georgi Gerganov 2023-12-30 23:24:42 +02:00
  • 39d8bc71ed CUDA: fixed tensor cores not being used on RDNA3 (#4697) Johannes Gäßler 2023-12-30 13:52:01 +01:00
  • 24a447e20a ggml : add ggml_cpu_has_avx_vnni() (#4589) automaticcat 2023-12-30 15:07:48 +07:00
  • a20f3c7465 CUDA: fix tensor core logic for Pascal and HIP (#4682) Johannes Gäßler 2023-12-29 23:12:53 +01:00
  • 0235b9b571 clip : use ggml_backend_buffer_is_host (#4205) Georgi Gerganov 2023-12-29 18:53:34 +02:00
  • ce18d727a4 clip : enable gpu backend (#4205) Steward Garcia 2023-12-29 11:52:15 -05:00
  • 91bb39cec7 cuda: fix vmm oom issue on NVIDIA AGX Orin (#4687) hydai 2023-12-30 00:31:19 +08:00
  • 04ac0607e9 python : add check-requirements.sh and GitHub workflow (#4585) crasm 2023-12-29 09:50:29 -05:00
  • 68eccbdc5b flake.nix : rewrite (#4605) Philip Taron 2023-12-29 06:42:26 -08:00
  • 97bbca6e85 cmake : fix ld warning duplicate libraries libllama.a (#4671) Cuong Trinh Manh 2023-12-29 21:39:15 +07:00
  • 4af4801566 llava-cli : refactor to use sampling library (#4669) Justine Tunney 2023-12-29 06:38:38 -08:00
  • db49ff8ed7 server : replace sleep with condition variables (#4673) Justine Tunney 2023-12-29 06:24:12 -08:00
  • 60f55e888c server : fix OpenAI server sampling w.r.t. penalty. (#4675) SakuraUmi 2023-12-29 22:22:44 +08:00
  • b93edd22f5 server : allow to generate multimodal embeddings (#4681) Karthik Sethuraman 2023-12-29 06:22:10 -08:00
  • 82d6eab224 main-cmake-pkg : fix build issue (#4665) andrijdavid 2023-12-29 15:18:20 +01:00
  • afd997ab60 llama.swiftui : fix infinite loop, ouput timings, buff UI (#4674) Peter Sugihara 2023-12-29 05:58:56 -08:00
  • c8255f8a6b scripts : print list of sync commits Georgi Gerganov 2023-12-29 15:12:35 +02:00
  • 441f51dca0 ci : build with CLBlast + ggml-opencl use GGML_API (whisper/1576) Tamotsu Takahashi 2023-12-29 19:23:27 +09:00
  • 38b3de4658 sync : ggml Georgi Gerganov 2023-12-29 14:56:41 +02:00
  • afc8c19291 ggml : fix some mul mat cases + add tests for src1 F16 (ggml/669) bssrdf 2023-12-29 03:32:31 -05:00
  • ca38b8d334 scripts : do not sync commits from this repo Georgi Gerganov 2023-12-29 14:41:36 +02:00
  • 65e5f6dadb Fix OpenAI server sampling w.r.t. temp and seed (#4668) Justine Tunney 2023-12-28 11:20:00 -08:00
  • ea5497df5d gpt2 : Add gpt2 architecture integration (#4555) manikbhandari 2023-12-28 09:03:57 -05:00
  • f6793491b5 llama : add AWQ for llama, llama2, mpt, and mistral models (#4593) Nam D. Tran 2023-12-27 22:39:45 +07:00
  • 879b690a9e finetune : fix output formatting in print_params (#4653) Daniel Bevenius 2023-12-27 15:16:55 +01:00
  • b47879b0dd scripts : add sync-ggml-am.sh Georgi Gerganov 2023-12-27 11:15:31 +02:00
  • 951010fa53 ggml : fix dot product for ARM (#4630) Georgi Gerganov 2023-12-27 11:02:13 +02:00
  • f56d6077d0 Add byte token type when tokenizer.model is not exists (#4641) wonjun Jang 2023-12-27 17:37:25 +09:00
  • dc68f0054c cuda : fix vmm pool with multi GPU (#4620) slaren 2023-12-26 21:23:59 +01:00
  • de8e496437 Update comment for AdamW implementation reference. (#4604) WillCorticesAI 2023-12-26 05:42:08 -05:00
  • 77465dad48 Fix new CUDA10 compilation errors (#4635) FantasyGmm 2023-12-26 18:38:36 +08:00
  • a206137f92 Adding Emeltal reference to UI list (#4629) Paul Tsochantaris 2023-12-25 16:09:53 +00:00
  • b9f47952ff simplify bug issue template (#4623) slaren 2023-12-24 21:01:12 +01:00
  • 753be377b6 llama : add PLaMo model (#3557) Shintarou Okada 2023-12-24 22:35:49 +09:00
  • 5bf3953d7e cuda : improve cuda pool efficiency using virtual memory (#4606) slaren 2023-12-24 14:34:22 +01:00
  • 708e179e85 fallback to CPU buffer if host buffer alloc fails (#4610) slaren 2023-12-23 16:10:51 +01:00
  • 925e5584a0 ci(docker): fix tags in "Build and push docker image (tagged)" (#4603) Samuel Maynard 2023-12-23 11:35:55 +02:00
  • 6123979952 server : allow to specify custom prompt for penalty calculation (#3727) Alexey Parfenov 2023-12-23 09:31:49 +00:00
  • b9ec82d262 grammar : check the full vocab only if necessary (opt) (#4306) kalomaze 2023-12-23 03:27:07 -06:00
  • e0a4002273 CUDA: fixed row rounding for 0 tensor splits (#4594) Johannes Gäßler 2023-12-23 09:16:33 +01:00
  • 7082d24cec lookup : add prompt lookup decoding example (#4484) LeonEricsson 2023-12-22 17:05:56 +01:00
  • ba66175132 sync : ggml (fix im2col) (#4591) Georgi Gerganov 2023-12-22 17:53:43 +02:00
  • a55876955b cuda : fix jetson compile error (#4560) FantasyGmm 2023-12-22 23:11:12 +08:00
  • 6724ef1657 Fix CudaMemcpy direction (#4599) Henrik Forstén 2023-12-22 15:34:05 +02:00
  • 48b7ff193e llama : fix platforms without mmap (#4578) slaren 2023-12-22 12:12:53 +01:00
  • 48b24b170e ggml : add comment about backward GGML_OP_DIAG_MASK_INF (#4203) Herman Semenov 2023-12-22 09:26:49 +00:00
  • 10c9ac210e make : add LLAMA_HIP_UMA option (#4587) Michael Kesper 2023-12-22 09:03:25 +01:00