Commit Graph

  • 28cb35a0ec make : add LLAMA_HIP_UMA option (#4587) Michael Kesper 2023-12-22 09:03:25 +01:00
  • f31b984898 ci : tag docker image with build number (#4584) rhuddleston 2023-12-21 23:56:34 -07:00
  • 2bb98279c5 readme : add zig bindings (#4581) Deins 2023-12-22 08:49:54 +02:00
  • 0137ef88ea ggml : extend enum ggml_log_level with GGML_LOG_LEVEL_DEBUG (#4579) bobqianic 2023-12-22 06:47:01 +00:00
  • c7e9701f86 llama : add ability to cancel model loading (#4462) crasm 2023-12-22 01:19:36 -05:00
  • afefa319f1 ggml : change ggml_scale to take a float instead of tensor (#4573) Georgi Gerganov 2023-12-21 23:20:49 +02:00
  • 769a7bc85e gguf-py : fix broken link Georgi Gerganov 2023-12-21 23:20:36 +02:00
  • 32259b2dad gguf : simplify example dependencies Georgi Gerganov 2023-12-21 23:07:58 +02:00
  • 4a5f9d629e ci : add jlumbroso/free-disk-space to docker workflow (#4150) Samuel Maynard 2023-12-21 22:36:26 +02:00
  • d232aca5a7 llama : initial ggml-backend integration (#4520) slaren 2023-12-21 21:07:46 +01:00
  • 31f27758fa llama : allow getting n_batch from llama_context in c api (#4540) Marcus Dunn 2023-12-21 11:57:48 -08:00
  • 56fa50819f metal : fix ggml_metal_log vargs (#4373) Finn Voorhees 2023-12-21 14:55:02 -05:00
  • 0f630fbc92 cuda : ROCm AMD Unified Memory Architecture (UMA) handling (#4449) Erik Garrison 2023-12-21 13:45:32 -06:00
  • 562cf222b5 ggml-cuda: Fix HIP build by adding define for __trap (#4569) arlo-phoenix 2023-12-21 20:13:25 +01:00
  • 8fe03ffdda common : remove incorrect --model-draft default (#4568) Jared Van Bortel 2023-12-21 12:55:34 -05:00
  • 9154494808 CUDA: mul_mat_id always on GPU for batches >= 32 (#4553) Johannes Gäßler 2023-12-21 18:42:59 +01:00
  • c083718c89 readme : update coding guidelines Georgi Gerganov 2023-12-21 19:27:14 +02:00
  • 880e352277 py : open merges file as 'utf-8' (#4566) howlger 2023-12-21 18:07:34 +01:00
  • 66f35a2f48 cuda : better error message for ggml_get_rows (#4561) bobqianic 2023-12-21 17:06:44 +00:00
  • 1398823922 cuda : replace asserts in wrong architecture checks with __trap (#4556) slaren 2023-12-21 18:02:30 +01:00
  • d3223afdad llama : disable per-tensor info prints on model load (#4562) Johannes Gäßler 2023-12-21 17:34:17 +01:00
  • 1d7a1912ce Fix access violation in ggml_cuda_free_data if tensor->extra is NULL (#4554) LoganDark 2023-12-21 01:59:27 -08:00
  • 799fc22689 CUDA: Faster Mixtral prompt processing (#4538) Johannes Gäßler 2023-12-20 15:41:22 +01:00
  • 328b83de23 ggml : fixed check for _MSC_VER (#4535) Eric Sommerlade 2023-12-19 16:17:01 +00:00
  • a7aee47b98 ggml-cuda: Fix HIP build (#4528) arlo-phoenix 2023-12-18 22:33:45 +01:00
  • 0e18b2e7d0 llama.swiftui : add tinyllama 1.1B F16 Georgi Gerganov 2023-12-18 20:17:43 +02:00
  • 6ff39b129d llama.swiftui : add more models Georgi Gerganov 2023-12-18 20:05:12 +02:00
  • b9e74f9bca llama : add phi-2 + fix NeoX rope + ggml_mul_mat_set_prec (#4490) Ebey Abraham 2023-12-18 17:27:47 +00:00
  • 3c04bf6da8 llama : fix try_override for bool_value which always return true (#4519) hankcs 2023-12-18 05:14:58 -08:00
  • 2994f0c5a2 decode : fix logits_valid for legacy API (#4516) Jared Van Bortel 2023-12-17 19:39:02 -05:00
  • b1306c4394 readme : update hot topics Georgi Gerganov 2023-12-17 20:16:23 +02:00
  • 800a489e4a llama.swiftui : add bench functionality (#4483) Georgi Gerganov 2023-12-17 19:38:41 +02:00
  • f7f468a97d gguf-py : fail fast on nonsensical special token IDs (#4489) Jared Van Bortel 2023-12-17 10:45:46 -05:00
  • 919c40660f build : Check the ROCm installation location (#4485) Matheus Gabriel Alves Silva 2023-12-17 12:23:33 -03:00
  • 45668633fd finetune : keep allocs alive until all allocations are done (#4486) slaren 2023-12-17 16:05:56 +01:00
  • 0ffc92d2d2 server : disable llm logs if SERVER_VERBOSE is off (#3792) olexiyb 2023-12-17 17:02:16 +02:00
  • 8edd2b40fd server : fix grammar being ignored (#4494) AdithyanI 2023-12-17 15:57:56 +01:00
  • eb16dae7e7 server : fix possible ambiguity in content type charset (#4501) Alexey Parfenov 2023-12-17 14:56:09 +00:00
  • 62bd52b7bf server : allow requests larger than 8K (#4500) mzcu 2023-12-17 15:54:37 +01:00
  • 5daa5f54fd Link to cublas dynamically on Windows even with LLAMA_STATIC (#4506) Bach Le 2023-12-17 18:57:33 +08:00
  • c6c4fc081c lora : add support for non-llama models (#3333) slaren 2023-12-16 18:58:46 +01:00
  • 8a5be3bd58 llama : sanity checks for access to logits (#4274) Jared Van Bortel 2023-12-15 22:16:15 -05:00
  • 88ae8952b6 server : add optional API Key Authentication example (#4441) ShadovvBeast 2023-12-15 13:49:01 +02:00
  • ee4725a686 ggml : group mul_mat_id rows by matrix (cpu only) (#4480) slaren 2023-12-15 12:45:50 +01:00
  • 6744dbe924 ggml : use ggml_row_size where possible (#4472) slaren 2023-12-14 20:05:21 +01:00
  • cafcd4f895 ggml : remove n_dims from ggml_tensor (#4469) slaren 2023-12-14 16:52:08 +01:00
  • c50e400163 py : add protobuf dependency (#4466) wonjun Jang 2023-12-14 21:44:49 +09:00
  • 20a68a7030 ggml : add ggml_row_size() (fixes llama out of space) (#4461) LostRuins 2023-12-14 20:13:33 +08:00
  • 55e87c3749 ggml : fix OpenCL broadcast requirement for ggml_mul (close #4453) Georgi Gerganov 2023-12-14 10:35:29 +02:00
  • 873637afc7 convert : support loading vocab from fast tokenizer config (#3633) wonjun Jang 2023-12-14 17:09:34 +09:00
  • db6c6b68b7 readme : update supported model list (#4457) BarfingLemurs 2023-12-14 02:38:49 -05:00