Commit Graph

  • a0289f4d24 readme : add Scala 3 bindings repo (#2010) Roman Parykin 2023-06-26 22:47:59 +03:00
  • c81f027351 ggml : increase max tensor name + clean up compiler warnings in train-text (#1988) David Yang 2023-06-27 03:45:32 +08:00
  • 69ab1dfe7e readme : LD_LIBRARY_PATH complement for some Android devices when building with CLBlast inside Termux (#2007) Gustavo Rocha Dias 2023-06-26 16:34:45 -03:00
  • fe4f7cd9d9 ggml : avoid conv 2d kernel round up Georgi Gerganov 2023-06-26 21:03:59 +03:00
  • 07565a0f11 ggml : add NUMA support (#1556) zrm 2023-06-26 13:57:59 -04:00
  • eafd3cbeb7 k-quants : fix indentation Georgi Gerganov 2023-06-26 20:10:52 +03:00
  • 7458f1729d tests : fix quantize perf (#1990) katsu560 2023-06-27 01:47:02 +09:00
  • fec0ad6df8 k-quants : add AVX support to dot functions (#1916) katsu560 2023-06-27 01:46:07 +09:00
  • d936ca1f5d readme : add link to new k-quants for visibility Georgi Gerganov 2023-06-26 19:45:09 +03:00
  • a9bdb09472 k-quants : support for super-block size of 64 (#2001) Kawrakow 2023-06-26 19:43:07 +03:00
  • c2c3a4c9f5 Fix assert when free invalid cuda pointer (#2005) Howard Su 2023-06-26 23:15:47 +08:00
  • f892eae334 readme : add new roadmap + manifesto Georgi Gerganov 2023-06-25 16:08:12 +03:00
  • bd4f7f9947 ggml : sync latest ggml (custom operators) Georgi Gerganov 2023-06-25 14:25:08 +03:00
  • 6a2e1be690 fix server sampling: top k sampler first (#1977) anon998 2023-06-25 08:48:36 +00:00
  • b36fe02eb3 readme : add Azure CI discussion link Georgi Gerganov 2023-06-25 09:07:03 +03:00
  • 278c6002d5 zig : upgrade build system support (#1981) sjinzh 2023-06-25 13:45:44 +08:00
  • 8ea46699de #1869 Fix null reference errors when training from scratch with CUDA (#1907) Robyn 2023-06-25 04:10:29 +10:00
  • c64c6d6934 tests : sync test-grad0 from ggml Georgi Gerganov 2023-06-24 19:40:18 +03:00
  • e12d863012 flake : fix ggml-metal.metal path and run nixfmt (#1974) Rowan Hart 2023-06-24 04:07:08 -07:00
  • 59cda163a2 convert : fix invalid params in write_vocab_only (#1975) AN Long 2023-06-24 19:02:06 +08:00
  • 5cf93b4546 ggml : improve ggml_graph_dump_dot, add ggml_format_name (#1978) slaren 2023-06-24 12:57:18 +02:00
  • a716b97a38 readme : fix whitespaces Georgi Gerganov 2023-06-24 13:38:18 +03:00
  • 2921b46fa7 readme : fixed termux instructions (#1973) Alberto 2023-06-24 12:32:13 +02:00
  • 3a2615d4dd llama : fix top-p sampling to match the canonical definition (#1953) Alex Renda 2023-06-24 03:15:01 -07:00
  • dcd8e5a1ee llama : make model stateless and context stateful (llama_state) (#1797) Didzis Gosko 2023-06-24 11:47:58 +03:00
  • 375b754ed1 Add OpenLLaMA instructions to the README (#1954) eiery 2023-06-23 04:38:01 -04:00
  • c3ac2b0992 rework convert.py to read hyper-parameters from config.json (#1958) Erik Scholz 2023-06-22 14:20:47 +02:00
  • d3fda838f1 cmake: revert CUDA arch default to 52, 61 if f16 (#1959) Johannes Gäßler 2023-06-21 23:49:25 +02:00
  • 0b44dee5f2 Fix typo in README.md (#1961) Rahul Vivek Nair 2023-06-22 03:18:43 +05:30
  • 7b293fb3a5 readme : add link to p1 Georgi Gerganov 2023-06-20 19:05:54 +03:00
  • 5b5d09ebd1 Fix typo (#1949) Xiake Sun 2023-06-20 05:42:40 -07:00
  • 3ed93c37ba llama : fix params struct slignment (#1936) Ettore Di Giacinto 2023-06-20 03:24:39 +02:00
  • 2ae344e563 [Fix] Reenable server embedding endpoint (#1937) Henri Vasserman 2023-06-20 01:12:39 +03:00
  • 73eab7fee0 ggml : fix bug in LBFGS optimizer (found by ggml tests) Georgi Gerganov 2023-06-19 20:43:30 +03:00
  • 2586cd95fb llama : use aligned memory during ggml_init call from loading saved sessions (#1934) l3utterfly 2023-06-19 23:20:06 +08:00
  • 88524ddc9e cmake : fix trailing whitespaces Georgi Gerganov 2023-06-19 18:18:34 +03:00
  • a0913e599d llama : only use Q6_K for output weights if tensor size is multiple of 256 (#1932) Kawrakow 2023-06-19 18:17:03 +03:00
  • c7ca394001 cuda : faster k-quants on older GPUs (#1930) Kawrakow 2023-06-19 18:14:09 +03:00
  • 9251060668 ggml : sync latest ggml repo (#1924) Georgi Gerganov 2023-06-19 18:12:33 +03:00
  • 25cc6bb618 cmake : fix build shared ggml when CUDA is enabled (#1929) Howard Su 2023-06-19 23:10:37 +08:00
  • f31e154e9c Convert vector to f16 for dequantize mul mat vec (#1913) Johannes Gäßler 2023-06-19 10:23:56 +02:00
  • 129ce0b495 Added tokens per second to info prints (#1928) Johannes Gäßler 2023-06-18 17:41:26 +02:00
  • 8e92d69064 Fixed incorrectly applying RMS norm twice (#1925) Johannes Gäßler 2023-06-18 16:07:09 +02:00
  • 4596e69676 ggml : fix bug in ggml_compute_forward_add_q_f32 (#1918) l3utterfly 2023-06-18 19:19:16 +08:00
  • c6ed1770b5 readme : update Android build instructions (#1922) Mike 2023-06-18 16:28:26 +08:00
  • 60c1ae63b4 llama : prevent usage of k-quants when tensor size is not a multiple of 256 (#1921) Kawrakow 2023-06-18 11:13:43 +03:00
  • f3a5fa9277 examples : fix examples/metal (#1920) Kawrakow 2023-06-18 10:52:10 +03:00
  • 24763fdd00 metal : handle buffers larger than device's maxBufferLength (#1826) Georgi Gerganov 2023-06-18 09:09:47 +03:00
  • 9b0d4f1fc7 cmake : add CUDA_ARCHITECTURES to new target ggml_static (#1917) Howard Su 2023-06-18 12:29:47 +08:00
  • dedbae9fae make : do not print help for simple example Georgi Gerganov 2023-06-17 20:55:03 +03:00