Commit Graph

  • 6b3c1a820c ggml : sync (mem align to header + conv_transpose_2d fixes + ggml_alloc) (#2852) Georgi Gerganov 2023-08-28 14:24:53 +03:00
  • adddb70f08 CUDA: fix RoPE asserts, block sizes (#2833) Johannes Gäßler 2023-08-28 13:23:55 +02:00
  • 981286076a llama.h : add missing struct keyword for C compat in callback type (#2847) igarnier 2023-08-28 10:19:59 +02:00
  • 426b370c50 metal : fix memory leak (#2762) Georgi Gerganov 2023-08-28 10:59:08 +03:00
  • 4441245308 quantize : make output filename optional again (#2823) Cebtenzzre 2023-08-28 02:32:25 -04:00
  • 5ad04c5018 devops : added systemd units and set versioning to use date. (#2835) JohnnyB 2023-08-28 07:31:24 +01:00
  • e4cc663c30 gguf : fix strings to not be null-terminated (#2839) Georgi Gerganov 2023-08-27 21:50:22 +03:00
  • 0158aeeee9 llama : fix MPI threads (close #2827) Georgi Gerganov 2023-08-27 18:55:41 +03:00
  • 2f0ae17b63 examples : update llama2.c converter to read vocab and write models in GGUF format (#2751) Olivier Chafik 2023-08-27 15:13:31 +01:00
  • f5cdbffd3d llama : speedup tokenization (#2831) Kawrakow 2023-08-27 16:50:33 +03:00
  • a6d2b4bab7 falcon : fix CUDA inference by making K and Q contiguous (#2830) Georgi Gerganov 2023-08-27 16:40:48 +03:00
  • a40c1d87ff readme : fix headings Georgi Gerganov 2023-08-27 15:52:34 +03:00
  • 074d8ee8c4 scripts : helper convert script Georgi Gerganov 2023-08-27 15:24:40 +03:00
  • ab18bc5e87 k_quants tuning for Falcon-7b (#2816) Kawrakow 2023-08-27 15:19:59 +03:00
  • 5a7aaa5f74 readme : update hot topics Georgi Gerganov 2023-08-27 14:44:35 +03:00
  • 11f0bd7499 gguf : add 64-bit support (GGUF v2) (#2821) Georgi Gerganov 2023-08-27 14:19:54 +03:00
  • 926dbcbaab llama : more tokenizer fixes (#2810) Georgi Gerganov 2023-08-27 14:19:19 +03:00
  • 0c3eafbe0e ggml : detect SSSE3 (#2825) Przemysław Pawełczyk 2023-08-27 10:10:25 +02:00
  • 699ba791e5 ci : add LoRA test to CI (#2650) slaren 2023-08-27 09:03:27 +02:00
  • f2256c49e0 server : add /detokenize endpoint (#2802) Bruce MacDonald 2023-08-26 16:11:45 -07:00
  • a84214bd70 convert.py : advanced option (#2753) Kerfuffle 2023-08-26 14:13:36 -06:00
  • e604afedc8 llama : use Unicode Escape Sequence to replace encoded characters (#2814) Tim Miller 2023-08-27 03:27:07 +09:00
  • 10f684465a flake.nix : add rocm support and cleanup (#2808) Tungsten842 2023-08-26 20:19:44 +02:00
  • 03eea5f437 llama : move #includes out of _GNU_SOURCE conditional (#2817) Cebtenzzre 2023-08-26 14:17:51 -04:00
  • 92a41f322b main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (#1528) Dr. Tom Murphy VII Ph.D 2023-08-26 14:12:56 -04:00
  • ea11e400d7 llama : use std::abs in llama_sample_tail_free (#2800) Cebtenzzre 2023-08-26 12:53:52 -04:00
  • d821a8f3f7 k-quants : remove unnecessary tensor shape restrictions (#2811) Georgi Gerganov 2023-08-26 17:37:35 +03:00
  • 9b782d829a Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (#2807) Kawrakow 2023-08-26 17:27:49 +03:00
  • 756deeaf3c Fix HellaSwag (#2805) Kawrakow 2023-08-26 16:48:53 +03:00
  • 3f42f52f01 flake : build llama.cpp on Intel with nix (#2795) Volodymyr Vitvitskyi 2023-08-26 14:25:39 +01:00
  • ac6575162d Handle null rope scaling value (#2793) Nigel Bosch 2023-08-26 07:11:17 -05:00
  • 976b621020 Fix spm whitespaces (#2806) klosax 2023-08-26 13:45:53 +02:00
  • 7ede4319fc examples : skip unnecessary external lib in server README.md how-to (#2804) lon 2023-08-26 10:07:43 +02:00
  • f19ed06ed0 llama : fix struct decl (#2790) Marcus Dunn 2023-08-25 09:17:15 -07:00
  • 198140c2aa Faster perplexity computation (#2786) Kawrakow 2023-08-25 19:05:02 +03:00
  • 3e0b38e027 llama : add llama_beam_search() (#2267) Matt Pulver 2023-08-25 11:18:48 -04:00
  • 8ef2a0c9d3 convert.py : Get rope scale from HuggingFace models (#2772) Nigel Bosch 2023-08-25 09:41:52 -05:00
  • 89e4a4461e llama-bench : add model sizes (#2771) slaren 2023-08-25 15:16:19 +02:00
  • e20b657ffb convert.py : export rope freq_base when converting CodeLlama from an HF model (#2773) slaren 2023-08-25 14:08:53 +02:00
  • b06380dcc0 server : display token probabilities in the UI (#2489) Jhen-Jie Hong 2023-08-25 18:32:45 +08:00
  • 6224f81799 ci : pip install gguf in editable mode (#2782) Georgi Gerganov 2023-08-25 13:03:25 +03:00
  • 08a1012230 gguf : export objects to user code (#2780) M. Yusuf Sarıgöz 2023-08-25 12:43:41 +03:00
  • 984b7495ed ROCm Port (#1087) Henri Vasserman 2023-08-25 12:09:42 +03:00
  • 40c8c6dd6f cuda : add RoPE kernel for mode == 2 (NeoX) (#2760) Georgi Gerganov 2023-08-25 11:55:59 +03:00
  • a67ec14fe9 gguf : make gguf pip-installable M. Yusuf Sarıgöz 2023-08-25 09:26:05 +03:00
  • 800ef93db4 ggml-alloc : enlarge size of parse_seq (#2776) Shouzheng Liu 2023-08-25 01:58:00 -04:00
  • 1e8200c3d0 Added enum to llama_token_get_type return type (#2774) Marcus Dunn 2023-08-24 14:49:30 -07:00
  • 506fd81d05 convert.py : try to determine n_ctx automatically for CodeLlama (#2770) slaren 2023-08-24 21:10:39 +02:00
  • 9818be3377 gguf : add rope_freq_base parameter for CodeLlama (#2769) slaren 2023-08-24 20:04:05 +02:00
  • 9042737101 falcon : write file type Georgi Gerganov 2023-08-24 19:58:30 +03:00