Commit Graph

  • 54770413c4 ggml : fix MIN / MAX macros (#6904) Georgi Gerganov 2024-04-25 15:12:28 +03:00
  • aa750c1ede tests : minor bash stuff (#6902) Georgi Gerganov 2024-04-25 14:27:20 +03:00
  • 1966eb2615 quantize : add '--keep-split' to quantize model into shards (#6688) jiez 2024-04-25 18:29:35 +08:00
  • 784e11dea1 README: add graphic for matrix multiplication (#6881) Johannes Gäßler 2024-04-24 21:29:13 +02:00
  • b4e4b8a935 llama : add llama_get_pooling_type function (#6862) Douglas Hanley 2024-04-24 08:10:07 -05:00
  • 3fe847b574 server : do not apply Markdown formatting in code sections (#6850) mgroeber9110 2024-04-24 12:54:24 +02:00
  • 37246b1031 common : revert showing control tokens by default for server (#6860) Kyle Mistele 2024-04-24 05:15:29 -05:00
  • 28103f4832 Server: fix seed for multiple slots (#6835) Johannes Gäßler 2024-04-24 11:08:36 +02:00
  • c0d1b3e03e ggml : move 32-bit arm compat in ggml-impl.h (#6865) Georgi Gerganov 2024-04-24 12:00:07 +03:00
  • abd3314064 llama : add phi 3 chat template (#6857) Tristan Druyen 2024-04-24 10:52:37 +02:00
  • 3fec68be4e convert : add support of codeqwen due to tokenizer (#6707) Junyang Lin 2024-04-24 15:16:21 +08:00
  • c8297c6af5 llama : add phi3 support (#6852) liuwei-git 2024-04-24 15:00:37 +08:00
  • 4e96a812b3 [SYCL] Windows default build instructions without -DLLAMA_SYCL_F16 flag activated (#6767) Anas Ahouzi 2024-04-23 02:53:18 +02:00
  • 192090bae4 llamafile : improve sgemm.cpp (#6796) Justine Tunney 2024-04-22 15:00:36 -04:00
  • e931888d50 ggml : fix calloc argument ordering. (#6820) Dave Airlie 2024-04-23 00:05:06 +10:00
  • 8960fe86ae llama : fix typo in <|im_end|> token text (#6745) Georgi Gerganov 2024-04-22 15:41:11 +03:00
  • c0956b09ba ci: fix job are cancelling each other (#6781) Pierrick Hymbert 2024-04-22 13:22:54 +02:00
  • e9b4a1bf68 flake.lock: Update github-actions[bot] 2024-04-21 00:17:47 +00:00
  • 5cf5e7d490 build: generate hex dump of server assets during build (#6661) Olivier Chafik 2024-04-21 18:48:53 +01:00
  • 40f74e4d73 llama : add option to render special/control tokens (#6807) Georgi Gerganov 2024-04-21 18:36:45 +03:00
  • b9cc76d87e ggml : fix ggml_backend_cpu_supports_op() for CPY (#0) Georgi Gerganov 2024-04-21 16:47:57 +03:00
  • 7dbdba5690 llama : add llama-3 chat template (#6751) Wouter 2024-04-21 15:03:39 +02:00
  • c1386c936e gguf-py : add IQ1_M to GGML_QUANT_SIZES (#6761) pmysl 2024-04-21 14:49:30 +02:00
  • e8d35f47cb doc : add link to falcon (#6789) Jan Boon 2024-04-21 20:35:40 +08:00
  • 2cca09d509 readme : add Fedora instructions (#6783) Mohammadreza Hendiani 2024-04-21 16:02:05 +03:30
  • 89b0bf0d5d llava : use logger in llava-cli (#6797) Justine Tunney 2024-04-21 08:19:04 -04:00
  • b97bc3966e llama : support Llama 3 HF conversion (#6745) Pedro Cuenca 2024-04-21 13:50:41 +02:00
  • b8109bc013 doc : server tests require llama to be built with curl enabled (#6788) Jan Boon 2024-04-21 00:29:50 +08:00
  • aed82f6837 common : try to fix Android CI (#6780) Georgi Gerganov 2024-04-20 13:27:12 +03:00
  • 0e4802b2ec ci: add ubuntu latest release and fix missing build number (mac & ubuntu) (#6748) loonerin 2024-04-19 13:03:35 -04:00
  • 637e9a86c2 server: static: upstream upgrade (#6765) Pierrick Hymbert 2024-04-19 13:19:01 +02:00
  • 9958c81b79 Implement the OLMo architecture (#6741) nopperl 2024-04-19 09:35:54 +00:00
  • 8b1b1f4982 train : add general name (#6752) Austin 2024-04-19 03:16:45 -04:00
  • bca40e9814 fix wrong parameter in cmd in readme-sycl.md (#6755) Neo Zhang 2024-04-19 09:16:31 +08:00
  • 0d56246f4b ggml : group all experts in a single ggml_mul_mat_id (#6505) slaren 2024-04-18 15:18:48 +02:00
  • 03c0946d73 convert : support models with multiple chat templates (#6588) Sigbjørn Skjæret 2024-04-18 13:49:01 +02:00
  • e11b2e6e1e Qwen2 : assume tied weights if lm_head/output weights is missing (#6738) Ren Xuancheng 2024-04-18 19:38:04 +08:00
  • c71bfd736e llama : fix compatibility with old 2 expert models (#6735) slaren 2024-04-18 09:04:47 +02:00
  • 3b8f1ec4b1 llamafile : tmp disable + build sgemm.o when needed (#6716) Georgi Gerganov 2024-04-17 23:58:26 +03:00
  • 8dd1ec8b3f readme : add UI (#6724) Yaroslav 2024-04-17 14:47:50 +02:00
  • facb8b56f8 convert : fix autoawq gemma (#6704) Zheng.Deng 2024-04-17 04:51:07 +08:00
  • 532c1737a1 llama : make general.name optional (#6709) Georgi Gerganov 2024-04-16 23:50:38 +03:00
  • 666867b799 ggml : fix llamafile sgemm wdata offsets (#6710) Georgi Gerganov 2024-04-16 23:50:22 +03:00
  • 8cc91dc63c ggml : add llamafile sgemm (#6414) Justine Tunney 2024-04-16 14:55:30 -04:00
  • dbceec87c0 llama : add StableLM2 12B (#6635) Ashish 2024-04-16 08:48:35 -07:00
  • f4dea7da18 llama : add qwen2moe (#6074) Shijie 2024-04-16 23:40:48 +08:00
  • 8a56075b07 gritlm : add --outdir option to hf.sh script (#6699) Daniel Bevenius 2024-04-16 08:34:06 +02:00
  • 58227ffdeb perplexity : require positive --ctx-size arg (#6695) Georgi Gerganov 2024-04-16 09:28:33 +03:00
  • 4fbd8098e6 gguf : add special tokens metadata for FIM/Infill (#6689) Daniel Bevenius 2024-04-16 08:13:13 +02:00
  • 7593639ce3 main: add --json-schema / -j flag (#6659) Olivier Chafik 2024-04-15 18:35:21 +01:00
  • eda6b25287 llama : fix restoring the number of outputs from state files (#6687) compilade 2024-04-15 08:56:55 -04:00