Commit Graph

  • c50e400163 py : add protobuf dependency (#4466) wonjun Jang 2023-12-14 21:44:49 +09:00
  • 20a68a7030 ggml : add ggml_row_size() (fixes llama out of space) (#4461) LostRuins 2023-12-14 20:13:33 +08:00
  • 55e87c3749 ggml : fix OpenCL broadcast requirement for ggml_mul (close #4453) Georgi Gerganov 2023-12-14 10:35:29 +02:00
  • 873637afc7 convert : support loading vocab from fast tokenizer config (#3633) wonjun Jang 2023-12-14 17:09:34 +09:00
  • 0353a18401 readme : update supported model list (#4457) BarfingLemurs 2023-12-14 02:38:49 -05:00
  • 948ff137ec server : fix handling of characters that span multiple tokens when streaming (#4446) shibe2 2023-12-13 23:57:15 +04:00
  • 4d98d9a656 sync : ggml (SD ops, tests, kernels) (#4444) Georgi Gerganov 2023-12-13 21:54:54 +02:00
  • 70f806b821 build : detect host compiler and cuda compiler separately (#4414) Jared Van Bortel 2023-12-13 12:10:10 -05:00
  • 9fb13f9584 common : add --version option to show build info in CLI (#4433) Siwen Yu 2023-12-13 20:50:14 +08:00
  • 113f9942fc readme : update hot topics Georgi Gerganov 2023-12-13 14:05:38 +02:00
  • 799a1cb13b llama : add Mixtral support (#4406) slaren 2023-12-13 13:04:25 +01:00
  • fecac45658 server : tweak default sampling parameters (#4367) kalomaze 2023-12-12 04:12:35 -06:00
  • 9494d7c477 english : use typos to fix comments and logs (#4354) Richard Kiss 2023-12-12 01:53:36 -08:00
  • 6138963fb2 build : target Windows 8 for standard mingw-w64 (#4405) Jared Van Bortel 2023-12-12 04:27:26 -05:00
  • 6391817cd1 llama : document logits_all deprecation (#4418) crasm 2023-12-12 04:25:57 -05:00
  • d9d4cfef64 server : fix local model name in server (#4420) Vladimir Zorin 2023-12-12 11:25:29 +02:00
  • 41a11aaf99 ggml : increased GGML_MAX_PARAMS to allow finetuning of 70b models (#4424) Taikono-Himazin 2023-12-12 18:24:32 +09:00
  • 8a7b2fa528 Update README.md (#4388) Yueh-Po Peng 2023-12-11 06:27:38 +08:00
  • e18f7345a3 grammar : revert the replacement of llama_token_to_piece with id_to_token (#4396) Xiang (Kevin) Li 2023-12-09 16:29:27 -05:00
  • fe680e3d10 sync : ggml (new ops, tests, backend, etc.) (#4359) Georgi Gerganov 2023-12-07 22:26:54 +02:00
  • bcc0eb4591 llama : per-layer KV cache + quantum K cache (#4309) Georgi Gerganov 2023-12-07 13:03:17 +02:00
  • 81bc9214a3 train : fix #4227 (double free in examples/train-text-from-scratch/train-text-from-scratch.cpp) (#4351) Hongyu Ouyang 2023-12-07 02:25:22 -08:00
  • 05cd6e5036 server : recognize cache_prompt parameter in OAI API (#4347) Georgi Gerganov 2023-12-06 20:21:59 +02:00
  • caa9249217 common : fix compile warning Georgi Gerganov 2023-12-06 10:41:03 +02:00
  • da5eaef1f3 speculative : support --color (#4343) stduhpf 2023-12-06 09:08:17 +01:00
  • 5f6e0c0dff grammar : pre-computed pieces + reserve mem + less string copies (#4330) Marcus Dunn 2023-12-05 10:55:12 -10:00
  • 5aa365d88f llama : allow overriding GGUF metadata when loading model (#4092) Kerfuffle 2023-12-05 10:19:18 -07:00
  • 52c8bc3cf3 sampling : custom samplers order (#4285) MaggotHATE 2023-12-05 15:05:51 +05:00
  • e4b76bbe31 swift : revert compiler checks for swift package (#4332) kchro3 2023-12-04 23:29:46 -08:00
  • 23b5e12eb5 simple : update error message for KV cache check (#4324) Daniel Bevenius 2023-12-04 17:04:21 +01:00
  • d208995c6d swift : fix concatenation method to avoid invalid UTF8 stringfication (#4325) Miwa / Ensan 2023-12-05 01:03:49 +09:00
  • 5c9f90cba1 swift : fix prompt tokenization logic (#4321) Miwa / Ensan 2023-12-04 22:43:45 +09:00
  • 4fa44e84ad grammar-parser : fix typo (#4318) Ikko Eltociear Ashimine 2023-12-04 16:57:35 +09:00
  • fbbc42827b ggml : reuse ggml_get_n_tasks() in ggml_graph_plan() (#4308) Georgi Gerganov 2023-12-03 15:56:35 +02:00
  • adf3de4f69 ggml : fix soft max out-of-bounds access (#4307) Georgi Gerganov 2023-12-03 15:56:22 +02:00
  • 33e171d1e9 server : fix OpenAI API stop field to be optional (#4299) Ed Lee 2023-12-03 01:10:43 -08:00
  • 6949b50df5 py : add grammar to oai like api (#4294) Rickard Edén 2023-12-03 10:03:25 +01:00
  • d7b800b8bc llama : pad KV cache size (#4280) Georgi Gerganov 2023-12-03 10:58:16 +02:00
  • 5a7d3125e7 llama : avoid using "optional" keyword (#4283) Georgi Gerganov 2023-12-01 20:39:12 +02:00
  • d5a1cbde60 llama : support optional tensors (#4283) Georgi Gerganov 2023-12-01 20:35:03 +02:00
  • b220222a64 swift : fix token_to_piece implementation (#4278) Miwa / Ensan 2023-12-02 03:19:45 +09:00
  • 511f52c334 build : enable libstdc++ assertions for debug builds (#4275) Jared Van Bortel 2023-12-01 13:18:35 -05:00
  • 03562f3a86 llama : support attention bias on LLaMA architecture (#4283) CausalLM 2023-12-02 02:17:06 +08:00
  • 37c746d687 llama : add Qwen support (#4281) Shijie 2023-12-02 02:16:31 +08:00
  • 880f57973b llama : fix integer overflow during quantization (#4284) Georgi Gerganov 2023-12-01 18:42:11 +02:00
  • 8d6d9f033b py : add requirements file for convert-hf-to-gguf.py (#4277) Daniel Bevenius 2023-12-01 10:41:56 +01:00
  • ef47ec18da ggml : add ggml_soft_max_ext (#4256) Georgi Gerganov 2023-12-01 10:51:24 +02:00
  • 1d144112c0 server : add --log-disable to disable logging to file (#4260) Ziad Ben Hadj-Alouane 2023-11-30 17:25:49 -05:00
  • f43f09366d server : add single-client multi-prompt support (#4232) Ziad Ben Hadj-Alouane 2023-11-30 17:25:04 -05:00
  • d2809a3ba2 make : fix Apple clang determination bug (#4272) WillCorticesAI 2023-11-30 17:23:44 -05:00