Commit Graph

  • 0907ab79e4 Only show -ngl option when relevant + other doc/arg handling updates (#1625) Kerfuffle 2023-05-28 11:48:57 -06:00
  • 1b78ed2081 Only show -ngl option when relevant + other doc/arg handling updates (#1625) Kerfuffle 2023-05-28 11:48:57 -06:00
  • 2de97bcba2 examples : add --alias option to gpt_params to set use friendly model name (#1614) Vladimir Zorin 2023-05-28 20:14:24 +03:00
  • 337aea1139 examples : add --alias option to gpt_params to set use friendly model name (#1614) Vladimir Zorin 2023-05-28 20:14:24 +03:00
  • 139240d596 opencl : no need to allocate cl_mem on heap (#1612) Howard Su 2023-05-29 01:13:36 +08:00
  • bb051d9723 opencl : no need to allocate cl_mem on heap (#1612) Howard Su 2023-05-29 01:13:36 +08:00
  • b4e11a1e94 opencl : use strstr to check if fp16 supported (#1611) Howard Su 2023-05-29 01:09:56 +08:00
  • ca74884f66 opencl : use strstr to check if fp16 supported (#1611) Howard Su 2023-05-29 01:09:56 +08:00
  • 9ef1c5c76e ggml : add support for the RISCV architecture (#1616) apcameron 2023-05-27 21:03:25 +01:00
  • a6704643b6 ggml : add support for the RISCV architecture (#1616) apcameron 2023-05-27 21:03:25 +01:00
  • f989fcc1ac Include server in releases + other build system cleanups (#1610) Kerfuffle 2023-05-27 11:04:14 -06:00
  • 0df7d63e5b Include server in releases + other build system cleanups (#1610) Kerfuffle 2023-05-27 11:04:14 -06:00
  • c31487b41a Add documentation about CLBlast (#1604) Henri Vasserman 2023-05-27 18:47:55 +03:00
  • 97c9b77c4f Add documentation about CLBlast (#1604) Henri Vasserman 2023-05-27 18:47:55 +03:00
  • 6a7be3e13b [CI] Fix openblas (#1613) Henri Vasserman 2023-05-27 17:24:06 +03:00
  • 0ecb1bbbeb [CI] Fix openblas (#1613) Henri Vasserman 2023-05-27 17:24:06 +03:00
  • 621bc1797b ggml : add ggml_tensor_overhead() Georgi Gerganov 2023-05-27 16:19:56 +03:00
  • 93618031c7 ggml : add ggml_tensor_overhead() Georgi Gerganov 2023-05-27 16:19:56 +03:00
  • 2097ef1048 [CI] CLBlast: Fix directory name (#1606) Henri Vasserman 2023-05-27 15:18:25 +03:00
  • 83c54e6da5 [CI] CLBlast: Fix directory name (#1606) Henri Vasserman 2023-05-27 15:18:25 +03:00
  • 97a29d7fab ggml : sync ggml core (minor additions, e.g. ggml_get_tensor_by_name()) Georgi Gerganov 2023-05-27 12:22:05 +03:00
  • bdbda1b17a ggml : sync ggml core (minor additions, e.g. ggml_get_tensor_by_name()) Georgi Gerganov 2023-05-27 12:22:05 +03:00
  • b9242e02d9 Some improvements to loading the session with --prompt-cache (#1550) Kerfuffle 2023-05-25 20:18:01 -06:00
  • 66874d4fbc Some improvements to loading the session with --prompt-cache (#1550) Kerfuffle 2023-05-25 20:18:01 -06:00
  • 3723b0b6d9 cuda : performance optimizations (#1530) Johannes Gäßler 2023-05-25 23:07:29 +02:00
  • 1fcdcc28b1 cuda : performance optimizations (#1530) Johannes Gäßler 2023-05-25 23:07:29 +02:00
  • 88c9a3ad7b Update CLBlast to 1.6.0 (#1580) Henri Vasserman 2023-05-24 10:30:09 +03:00
  • ac7876ac20 Update CLBlast to 1.6.0 (#1580) Henri Vasserman 2023-05-24 10:30:09 +03:00
  • 5da20efb48 readme : add docs for chat-persistent.sh (#1568) Evan Jones 2023-05-24 02:24:01 -04:00
  • c31bbe934b readme : add docs for chat-persistent.sh (#1568) Evan Jones 2023-05-24 02:24:01 -04:00
  • d01a7872c2 chat-persistent.sh : use bracket expressions in grep (#1564) Senemu 2023-05-24 06:16:22 +00:00
  • 1359b6aba5 chat-persistent.sh : use bracket expressions in grep (#1564) Senemu 2023-05-24 06:16:22 +00:00
  • ce89052a49 Fix handling of "invalid property" when creating OpenCL command queue (#1565) Maarten ter Huurne 2023-05-23 18:01:15 +02:00
  • 7d873811f3 Fix handling of "invalid property" when creating OpenCL command queue (#1565) Maarten ter Huurne 2023-05-23 18:01:15 +02:00
  • 3955ecde57 OpenCL Token Generation Acceleration (#1459) 0cc4m 2023-05-22 23:33:24 +02:00
  • 2e6cd4b025 OpenCL Token Generation Acceleration (#1459) 0cc4m 2023-05-22 23:33:24 +02:00
  • 5030c1df22 examples : add server example with REST API (#1443) Steward Garcia 2023-05-21 11:51:18 -06:00
  • 7e4ea5beff examples : add server example with REST API (#1443) Steward Garcia 2023-05-21 11:51:18 -06:00
  • 40cb847d7b make : .PHONY clean (#1553) Stefan Sydow 2023-05-21 16:03:44 +02:00
  • 7780e4f479 make : .PHONY clean (#1553) Stefan Sydow 2023-05-21 16:03:44 +02:00
  • 3d48caa554 ggml : output 3d sizes in ggml_graph_dump_dot() Georgi Gerganov 2023-05-21 11:56:23 +03:00
  • 265db9834e ggml : output 3d sizes in ggml_graph_dump_dot() Georgi Gerganov 2023-05-21 11:56:23 +03:00
  • c6a3e8aaff ggml : update WASM SIMD Georgi Gerganov 2023-05-20 20:00:41 +03:00
  • fab49c685e ggml : update WASM SIMD Georgi Gerganov 2023-05-20 20:00:41 +03:00
  • 3d683d65fc feature : support blis and other blas implementation (#1536) Zenix 2023-05-20 23:58:31 +09:00
  • b8ee340abe feature : support blis and other blas implementation (#1536) Zenix 2023-05-20 23:58:31 +09:00
  • 19277197d0 OpenCL: Fixes for older devices. (#1435) Henri Vasserman 2023-05-20 17:57:39 +03:00
  • 9ecb30f959 OpenCL: Fixes for older devices. (#1435) Henri Vasserman 2023-05-20 17:57:39 +03:00
  • 5612c7938f llama : define magic numbers as integer constants (#1518) (#1520) Juuso Alasuutari 2023-05-20 15:58:15 +03:00
  • 29cf5596fe llama : define magic numbers as integer constants (#1518) (#1520) Juuso Alasuutari 2023-05-20 15:58:15 +03:00
  • 55b16583fb ggml : add ggml_clamp() (#1539) Georgi Gerganov 2023-05-20 15:34:45 +03:00
  • 3de84b2606 ggml : add ggml_clamp() (#1539) Georgi Gerganov 2023-05-20 15:34:45 +03:00
  • 4d3789cf53 cuda : loading models directly into VRAM, norm calculation on GPU, broadcasting for ggml_mul (#1483) Johannes Gäßler 2023-05-20 14:19:28 +02:00
  • affc76edfd cuda : loading models directly into VRAM, norm calculation on GPU, broadcasting for ggml_mul (#1483) Johannes Gäßler 2023-05-20 14:19:28 +02:00
  • ff47d6030f Revert "feature : add blis and other BLAS implementation support (#1502)" Georgi Gerganov 2023-05-20 12:03:48 +03:00
  • ea600071cb Revert "feature : add blis and other BLAS implementation support (#1502)" Georgi Gerganov 2023-05-20 12:03:48 +03:00
  • 1a287d274c feature : add blis and other BLAS implementation support (#1502) Zenix 2023-05-20 18:02:48 +09:00
  • 07e9ace0f9 feature : add blis and other BLAS implementation support (#1502) Zenix 2023-05-20 18:02:48 +09:00
  • a207748a91 llama : add llama_init_backend() API (close #1527) Georgi Gerganov 2023-05-20 11:06:11 +03:00
  • ec2e10c444 llama : add llama_init_backend() API (close #1527) Georgi Gerganov 2023-05-20 11:06:11 +03:00
  • bc988e6762 Fix for mingw (#1462) DannyDaemonic 2023-05-20 00:40:02 -07:00
  • d2c59b8ba4 Fix for mingw (#1462) DannyDaemonic 2023-05-20 00:40:02 -07:00
  • 6d020974a3 llama : fix name shadowing and C4146 (#1526) Maxime 2023-05-20 09:22:37 +02:00
  • 503db28849 llama : fix name shadowing and C4146 (#1526) Maxime 2023-05-20 09:22:37 +02:00
  • 971a8ada6c llama : fix compile warnings in llama_set_state_data() Georgi Gerganov 2023-05-20 10:14:31 +03:00
  • 8a203f9fa1 llama : fix compile warnings in llama_set_state_data() Georgi Gerganov 2023-05-20 10:14:31 +03:00
  • a4a7d760ca ggml : fix scalar implementation of Q4_1 dot Georgi Gerganov 2023-05-20 10:13:19 +03:00
  • 4fd3e29297 ggml : fix scalar implementation of Q4_1 dot Georgi Gerganov 2023-05-20 10:13:19 +03:00
  • 744804ef91 ggml : use F16 instead of F32 in Q4_0, Q4_1, Q8_0 (#1508) Georgi Gerganov 2023-05-19 22:17:18 +03:00
  • 2d5db48371 ggml : use F16 instead of F32 in Q4_0, Q4_1, Q8_0 (#1508) Georgi Gerganov 2023-05-19 22:17:18 +03:00
  • e56594e284 tests : add missing header Georgi Gerganov 2023-05-19 21:17:28 +03:00
  • 6986c7835a tests : add missing header Georgi Gerganov 2023-05-19 21:17:28 +03:00
  • fb1d96e5c0 examples : add persistent chat (#1495) Evan Jones 2023-05-19 13:39:51 -04:00
  • 943e6081cc examples : add persistent chat (#1495) Evan Jones 2023-05-19 13:39:51 -04:00
  • 278c0ca5db main : make reverse prompt option act as a stop token in non-interactive mode (#1032) Jason McCartney 2023-05-19 10:24:59 -07:00
  • 7694b52b9a main : make reverse prompt option act as a stop token in non-interactive mode (#1032) Jason McCartney 2023-05-19 10:24:59 -07:00
  • 1917bffd34 readme : adds WizardLM to the list of supported models (#1485) David Kennedy 2023-05-19 13:16:30 -04:00
  • 79e3efb0e9 readme : adds WizardLM to the list of supported models (#1485) David Kennedy 2023-05-19 13:16:30 -04:00
  • 0ea7760796 minor : fix compile warnings Georgi Gerganov 2023-05-19 20:14:51 +03:00
  • 4b7e245adf minor : fix compile warnings Georgi Gerganov 2023-05-19 20:14:51 +03:00
  • 898d3f017e make kv_f16 the default for api users (#1517) Erik Scholz 2023-05-18 19:31:01 +02:00
  • 5ea4339273 make kv_f16 the default for api users (#1517) Erik Scholz 2023-05-18 19:31:01 +02:00
  • b341147400 Fixes #1511 lambda issue for w64devkit (mingw) (#1513) DannyDaemonic 2023-05-18 10:30:40 -07:00
  • ee9654138a Fixes #1511 lambda issue for w64devkit (mingw) (#1513) DannyDaemonic 2023-05-18 10:30:40 -07:00
  • 2ac41ed3ee Remove unused n_parts parameter (#1509) Stephan Walter 2023-05-17 22:12:01 +00:00
  • dc271c52ed Remove unused n_parts parameter (#1509) Stephan Walter 2023-05-17 22:12:01 +00:00
  • f8552af355 benchmark-matmul: Print the average of the test results (#1490) rankaiyx 2023-05-17 22:47:58 +08:00
  • c238b5873a benchmark-matmul: Print the average of the test results (#1490) rankaiyx 2023-05-17 22:47:58 +08:00
  • 96015ca83c convert.py: Support models which are stored in a single pytorch_model.bin (#1469) Tom Jobbins 2023-05-16 23:04:35 +01:00
  • 2b2646931b convert.py: Support models which are stored in a single pytorch_model.bin (#1469) Tom Jobbins 2023-05-16 23:04:35 +01:00
  • 84e0992711 ~7% faster Q5_1 AVX2 code (#1477) Ilya Kurdyukov 2023-05-17 01:36:47 +07:00
  • 42627421ec ~7% faster Q5_1 AVX2 code (#1477) Ilya Kurdyukov 2023-05-17 01:36:47 +07:00
  • 25ea9e7480 define default model path once, sync path with readme (#1366) András Salamon 2023-05-16 16:46:34 +01:00
  • 9560655409 define default model path once, sync path with readme (#1366) András Salamon 2023-05-16 16:46:34 +01:00
  • 605a090c90 Add alternate include path for openblas (#1476) sandyiscool 2023-05-16 14:00:15 +05:30
  • 2a5ee023ad Add alternate include path for openblas (#1476) sandyiscool 2023-05-16 14:00:15 +05:30
  • 4d8619e508 fix get_num_physical_cores() (#1436) zrm 2023-05-14 22:25:42 -04:00
  • 63d20469b8 fix get_num_physical_cores() (#1436) zrm 2023-05-14 22:25:42 -04:00
  • 951a74e307 benchmark-matmul: fix clang-tidy issues, report results in GFLOPS (#1458) slaren 2023-05-14 22:46:00 +02:00
  • b5c9295eef benchmark-matmul: fix clang-tidy issues, report results in GFLOPS (#1458) slaren 2023-05-14 22:46:00 +02:00