Commit Graph

  • 35a2ee9143 Remove unused data and add fixes (#5154) Michael Klimenko 2024-01-27 15:25:55 +01:00
  • 6bceee244b server : add self-extend support (#5104) Maximilian Winter 2024-01-27 14:38:05 +01:00
  • ec903c0341 server : add self-extend support (#5104) Maximilian Winter 2024-01-27 14:38:05 +01:00
  • 98890616e2 Add OpenCL add kernel (#5151) 0cc4m 2024-01-26 23:07:32 +01:00
  • a1d6df129b Add OpenCL add kernel (#5151) 0cc4m 2024-01-26 23:07:32 +01:00
  • 4742bda9a2 cmake : pass CPU architecture flags to nvcc (#5146) Jared Van Bortel 2024-01-26 15:34:06 -05:00
  • bbe7c56c99 cmake : pass CPU architecture flags to nvcc (#5146) Jared Van Bortel 2024-01-26 15:34:06 -05:00
  • 7cef06abe4 cuda : fix tensor size calculation for non-split buffer (#5145) slaren 2024-01-26 18:59:43 +01:00
  • 62fead3ea0 cuda : fix tensor size calculation for non-split buffer (#5145) slaren 2024-01-26 18:59:43 +01:00
  • ef75d9aa87 ggml-alloc : add 10% margin to the buffer sizes (#5149) slaren 2024-01-26 18:18:26 +01:00
  • 15b4538ff2 ggml-alloc : add 10% margin to the buffer sizes (#5149) slaren 2024-01-26 18:18:26 +01:00
  • 1ca08650a3 ggml : update softmax n_task calculation (#5126) snadampal 2024-01-26 11:17:59 -06:00
  • 7032f4f634 ggml : update softmax n_task calculation (#5126) snadampal 2024-01-26 11:17:59 -06:00
  • 8289eb006a scripts : move run-with-preset.py from root to scripts folder Georgi Gerganov 2024-01-26 17:09:44 +02:00
  • 5f1925a8ce scripts : move run-with-preset.py from root to scripts folder Georgi Gerganov 2024-01-26 17:09:44 +02:00
  • 0321caf69f tests : gitignore test-c.o Georgi Gerganov 2024-01-26 14:48:15 +02:00
  • 3b7c914de2 tests : gitignore test-c.o Georgi Gerganov 2024-01-26 14:48:15 +02:00
  • 88bd8be65e server : refactored the task processing logic (#5065) Xuan Son Nguyen 2024-01-26 13:42:20 +01:00
  • 48c857aa10 server : refactored the task processing logic (#5065) Xuan Son Nguyen 2024-01-26 13:42:20 +01:00
  • f4cc7db364 ci : add model tests + script wrapper (#4586) crasm 2024-01-26 07:18:00 -05:00
  • 413e7b0559 ci : add model tests + script wrapper (#4586) crasm 2024-01-26 07:18:00 -05:00
  • 7ace32cd24 metal : remove unused n_buffers and buffers (#5129) Paul Tsochantaris 2024-01-26 12:16:07 +00:00
  • 6dd3c28c9c metal : remove unused n_buffers and buffers (#5129) Paul Tsochantaris 2024-01-26 12:16:07 +00:00
  • 1004b730b1 gguf : fix "general.alignment" type in gguf_reader.py (#5136) Riceball LEE 2024-01-26 17:10:28 +08:00
  • 38b431de23 gguf : fix "general.alignment" type in gguf_reader.py (#5136) Riceball LEE 2024-01-26 17:10:28 +08:00
  • 174ed70c97 readme : update hot topics Georgi Gerganov 2024-01-26 10:52:33 +02:00
  • aad0b01d73 readme : update hot topics Georgi Gerganov 2024-01-26 10:52:33 +02:00
  • 2e0ebe6a22 Another bucket sort (#5109) Kawrakow 2024-01-26 09:14:39 +02:00
  • 1182cf4d4f Another bucket sort (#5109) Kawrakow 2024-01-26 09:14:39 +02:00
  • b8a55f4398 readme : add MobileVLM 1.7B/3B to the supported models list (#5107) XiaotaoChen 2024-01-26 04:14:32 +08:00
  • fe54033b69 readme : add MobileVLM 1.7B/3B to the supported models list (#5107) XiaotaoChen 2024-01-26 04:14:32 +08:00
  • c6e551b2a3 llama : dynamic temperature sampling (#4972) l3utterfly 2024-01-26 05:06:22 +09:00
  • 5eaf9964fc llama : dynamic temperature sampling (#4972) l3utterfly 2024-01-26 05:06:22 +09:00
  • c30495f453 examples : make pydantic scripts pass mypy and support py3.8 (#5099) Jared Van Bortel 2024-01-25 14:51:24 -05:00
  • d292f4f204 examples : make pydantic scripts pass mypy and support py3.8 (#5099) Jared Van Bortel 2024-01-25 14:51:24 -05:00
  • f3e045ffad android : use release cmake build type by default (#5123) Valentin Konovalov 2024-01-25 12:05:51 -05:00
  • 256d1bb0dd android : use release cmake build type by default (#5123) Valentin Konovalov 2024-01-25 12:05:51 -05:00
  • 2da9f1c37a Fix Q3_K_XS for MoE models (#5113) Kawrakow 2024-01-25 17:58:53 +02:00
  • faa3526a1e Fix Q3_K_XS for MoE models (#5113) Kawrakow 2024-01-25 17:58:53 +02:00
  • d42d77976d metal : show compile log messages Georgi Gerganov 2024-01-25 11:26:17 +02:00
  • ddc5a5033f metal : show compile log messages Georgi Gerganov 2024-01-25 11:26:17 +02:00
  • f569578ccc cuda : fix 2-bit quants on amd hip (#5105) Engininja2 2024-01-24 16:18:15 -06:00
  • cd4fddb29f cuda : fix 2-bit quants on amd hip (#5105) Engininja2 2024-01-24 16:18:15 -06:00
  • 14b72fa90a nix-shell: use addToSearchPath Michael Hueschen 2024-01-22 16:44:10 -07:00
  • c9b316c78f nix-shell: use addToSearchPath Michael Hueschen 2024-01-22 16:44:10 -07:00
  • e12a06272d nix: add cc to devShell LD_LIBRARY_PATH Michael Hueschen 2024-01-22 03:17:05 -07:00
  • bf63d695b8 nix: add cc to devShell LD_LIBRARY_PATH Michael Hueschen 2024-01-22 03:17:05 -07:00
  • ab0c5dbd6d llama : pre-allocate input tensors in a separate buffer (#5100) slaren 2024-01-24 12:48:14 +01:00
  • 1387ea2117 llama : pre-allocate input tensors in a separate buffer (#5100) slaren 2024-01-24 12:48:14 +01:00
  • a4ce5bf351 metal : disable support for MUL_MAT F32 x F16 Georgi Gerganov 2024-01-23 15:50:56 +02:00
  • 26d607608d metal : disable support for MUL_MAT F32 x F16 Georgi Gerganov 2024-01-23 15:50:56 +02:00
  • 07be9cef49 Additional KL-divergence statistics (#5081) Kawrakow 2024-01-23 15:17:20 +02:00
  • 44879ee885 Additional KL-divergence statistics (#5081) Kawrakow 2024-01-23 15:17:20 +02:00
  • fa690025e6 CUDA: more info when no device code (#5088) Johannes Gäßler 2024-01-23 13:31:56 +01:00
  • 9ecdd12e95 CUDA: more info when no device code (#5088) Johannes Gäßler 2024-01-23 13:31:56 +01:00
  • 0beb2d8bf4 minor : clean-up some warnings and style (#5094) Georgi Gerganov 2024-01-23 14:12:57 +02:00
  • 89758723c7 minor : clean-up some warnings and style (#5094) Georgi Gerganov 2024-01-23 14:12:57 +02:00
  • 8bb43a2380 devops : add intel oneapi dockerfile (#5068) Xuan Son Nguyen 2024-01-23 08:11:39 +01:00
  • 2bed4aa3f3 devops : add intel oneapi dockerfile (#5068) Xuan Son Nguyen 2024-01-23 08:11:39 +01:00
  • 05e68851a2 llama.vim : added api key support (#5090) Michael Coppola 2024-01-23 01:51:27 -05:00
  • 125d03a503 llama.vim : added api key support (#5090) Michael Coppola 2024-01-23 01:51:27 -05:00
  • 85013d185e llama : fix not enough space in buffer with Qwen (#5086) slaren 2024-01-22 23:42:41 +01:00
  • 011e8ec577 llama : fix not enough space in buffer with Qwen (#5086) slaren 2024-01-22 23:42:41 +01:00
  • 21124f8250 KL-divergence (#5076) Kawrakow 2024-01-22 16:10:14 +02:00
  • 6f9939d119 KL-divergence (#5076) Kawrakow 2024-01-22 16:10:14 +02:00
  • db23c1e61b ggml : parallelize FP32 conversion when using BLAS (#5045) Reinforce-II 2024-01-22 21:15:08 +08:00
  • 780e24a22e ggml : parallelize FP32 conversion when using BLAS (#5045) Reinforce-II 2024-01-22 21:15:08 +08:00
  • 27a6a3d428 llava : MobileVLM support (#4954) XiaotaoChen 2024-01-22 21:09:35 +08:00
  • 3ce7e8f8e7 llava : MobileVLM support (#4954) XiaotaoChen 2024-01-22 21:09:35 +08:00
  • 7cf6f6f7e7 flake.nix: add a comment about flakes vs nix Someone Serge 2024-01-21 03:41:37 +00:00
  • b2d80e105a flake.nix: add a comment about flakes vs nix Someone Serge 2024-01-21 03:41:37 +00:00
  • 1ff9757668 nix: add a comment on the many nixpkgs-with-cuda instances Someone Serge 2024-01-21 03:29:38 +00:00
  • 28603cd283 nix: add a comment on the many nixpkgs-with-cuda instances Someone Serge 2024-01-21 03:29:38 +00:00
  • f622bb7e14 nix: add a comment about makeScope Someone Serge 2024-01-21 03:15:13 +00:00
  • 5e97ec91ae nix: add a comment about makeScope Someone Serge 2024-01-21 03:15:13 +00:00
  • ec81abd9a5 nix: refactor the cleanSource rules Someone Serge 2024-01-13 17:45:01 +00:00
  • 7251870780 nix: refactor the cleanSource rules Someone Serge 2024-01-13 17:45:01 +00:00
  • b9f0b6782d workflows: nix-ci: drop the redundant "paths" filter Someone Serge 2024-01-13 17:38:32 +00:00
  • fe8b3c0d4b workflows: nix-ci: drop the redundant "paths" filter Someone Serge 2024-01-13 17:38:32 +00:00
  • 0146a1a253 workflows: nix-build-aarch64: rate limit Someone Serge 2024-01-13 17:16:54 +00:00
  • f4dd059259 workflows: nix-build-aarch64: rate limit Someone Serge 2024-01-13 17:16:54 +00:00
  • fbceda0636 workflows: nix-ci: rebuild on flake.lock updates Someone Serge 2024-01-13 17:10:19 +00:00
  • f7276f7500 workflows: nix-ci: rebuild on flake.lock updates Someone Serge 2024-01-13 17:10:19 +00:00
  • c394fe969c imatrix : keep intermediate imatrix results (#5077) Kawrakow 2024-01-22 14:18:43 +02:00
  • 15bceec2d7 imatrix : keep intermediate imatrix results (#5077) Kawrakow 2024-01-22 14:18:43 +02:00
  • 9cfd9f45ca llama : support StableLM 2 1.6B (#5052) compilade 2024-01-22 06:21:52 -05:00
  • d6bd4d46dd llama : support StableLM 2 1.6B (#5052) compilade 2024-01-22 06:21:52 -05:00
  • 0244a6ceb3 finetune : print sample-start/include-sample-start (#5072) Daniel Bevenius 2024-01-22 12:11:01 +01:00
  • 152d9d05e0 finetune : print sample-start/include-sample-start (#5072) Daniel Bevenius 2024-01-22 12:11:01 +01:00
  • 27f6120aa2 llama : add Q3_K_XS (#5060) Kawrakow 2024-01-22 12:43:33 +02:00
  • 66d575c45c llama : add Q3_K_XS (#5060) Kawrakow 2024-01-22 12:43:33 +02:00
  • 4aac1d433e ci : fix Windows CI by updating Intel SDE version (#5053) bobqianic 2024-01-22 08:55:05 +00:00
  • 57744932c6 ci : fix Windows CI by updating Intel SDE version (#5053) bobqianic 2024-01-22 08:55:05 +00:00
  • a37bce0e93 llama : add more qwen2 models (#5071) Shijie 2024-01-22 15:33:19 +08:00
  • 3466c6ebcf llama : add more qwen2 models (#5071) Shijie 2024-01-22 15:33:19 +08:00
  • 3ffdaca35d Revert LLAMA_NATIVE to OFF in flake.nix (#5066) iSma 2024-01-21 22:37:13 +01:00
  • 504dc37be8 Revert LLAMA_NATIVE to OFF in flake.nix (#5066) iSma 2024-01-21 22:37:13 +01:00
  • a727920ce6 add safetensors support to convert-lora-to-ggml.py (#5062) kuronekosaiko 2024-01-22 00:28:14 +08:00
  • 05490fad7f add safetensors support to convert-lora-to-ggml.py (#5062) kuronekosaiko 2024-01-22 00:28:14 +08:00
  • 4a2cf46fe6 add #include <string> to unicode.h (#5051) bobqianic 2024-01-21 15:17:35 +00:00