Commit Graph

  • faa13abb73 editorconfig : remove trailing spaces Georgi Gerganov 2023-10-17 19:52:53 +03:00
  • e74c705e15 editorconfig : remove trailing spaces Georgi Gerganov 2023-10-17 19:52:53 +03:00
  • 57fb1fe438 server : documentation of JSON return value of /completion endpoint (#3632) coezbek 2023-10-17 18:51:02 +02:00
  • 3ad1e3f1a1 server : documentation of JSON return value of /completion endpoint (#3632) coezbek 2023-10-17 18:51:02 +02:00
  • 3589fdc88d save-load-state : fix example + add ci test (#3655) Georgi Gerganov 2023-10-17 19:12:46 +03:00
  • 1142013da4 save-load-state : fix example + add ci test (#3655) Georgi Gerganov 2023-10-17 19:12:46 +03:00
  • e49cde7ded readme : add Aquila2 links (#3610) ldwang 2023-10-17 23:52:33 +08:00
  • 5fe268a4d9 readme : add Aquila2 links (#3610) ldwang 2023-10-17 23:52:33 +08:00
  • 71936d5fbe tokenizer : special token handling (#3538) staviq 2023-10-17 17:11:01 +02:00
  • 1a159553f9 tokenizer : special token handling (#3538) staviq 2023-10-17 17:11:01 +02:00
  • 37219c789f k-quants : fix quantization ranges (#3646) Georgi Gerganov 2023-10-17 09:19:28 +03:00
  • 281ef73c25 k-quants : fix quantization ranges (#3646) Georgi Gerganov 2023-10-17 09:19:28 +03:00
  • 1f9f23ea4a llava : fix tokenization to not add bos between image embeddings and user prompt (#3645) Georgi Gerganov 2023-10-16 23:58:00 +03:00
  • 940efa95fe llava : fix tokenization to not add bos between image embeddings and user prompt (#3645) Georgi Gerganov 2023-10-16 23:58:00 +03:00
  • 2d9d8e7de2 MPT : support GQA for replit-code-v1.5 (#3627) cebtenzzre 2023-10-15 02:32:06 -04:00
  • 11bff29045 MPT : support GQA for replit-code-v1.5 (#3627) cebtenzzre 2023-10-15 02:32:06 -04:00
  • dd5a356c81 Honor -ngl option for Cuda offloading in llava (#3621) M. Yusuf Sarıgöz 2023-10-14 13:52:44 +03:00
  • 11dc1091f6 Honor -ngl option for Cuda offloading in llava (#3621) M. Yusuf Sarıgöz 2023-10-14 13:52:44 +03:00
  • bfde2a4566 llama : remove n_threads from llama_decode_internal (#3614) Daniel Bevenius 2023-10-13 12:33:16 +02:00
  • 2a4bcbacea llama : remove n_threads from llama_decode_internal (#3614) Daniel Bevenius 2023-10-13 12:33:16 +02:00
  • 729c8f78ec ggml : add context enumeration functions (#3605) slaren 2023-10-13 12:23:10 +02:00
  • 424b6381c4 ggml : add context enumeration functions (#3605) slaren 2023-10-13 12:23:10 +02:00
  • 8f78e4d46e CLBlast: Fix matrix-vector multiplication (#3544) shibe2 2023-10-12 23:59:47 +04:00
  • 1e0e873c37 CLBlast: Fix matrix-vector multiplication (#3544) shibe2 2023-10-12 23:59:47 +04:00
  • d406725539 examples: support LLaVA v1.5 (multimodal model) (#3436) M. Yusuf Sarıgöz 2023-10-12 18:23:18 +03:00
  • 370359e5ba examples: support LLaVA v1.5 (multimodal model) (#3436) M. Yusuf Sarıgöz 2023-10-12 18:23:18 +03:00
  • dc9c5a37a3 docs : fix typo GOMP_CPU_AFFINITY (#3597) uint256_t 2023-10-12 22:36:16 +09:00
  • 9e24cc6e2e docs : fix typo GOMP_CPU_AFFINITY (#3597) uint256_t 2023-10-12 22:36:16 +09:00
  • 7bf0bf6231 cmake : fix add_compile_options on macOS Georgi Gerganov 2023-10-12 14:31:05 +03:00
  • d28e572c02 cmake : fix add_compile_options on macOS Georgi Gerganov 2023-10-12 14:31:05 +03:00
  • 3ee11e89e1 typo : it is --n-gpu-layers not --gpu-layers (#3592) Ian Scrivener 2023-10-12 22:10:50 +11:00
  • f3040beaab typo : it is --n-gpu-layers not --gpu-layers (#3592) Ian Scrivener 2023-10-12 22:10:50 +11:00
  • 907b41b661 ci : check if there is enough VRAM (#3596) Georgi Gerganov 2023-10-12 13:44:56 +03:00
  • 1a8c8795d6 ci : check if there is enough VRAM (#3596) Georgi Gerganov 2023-10-12 13:44:56 +03:00
  • e47676d9e3 server : add completion mode (no chat) (#3582) Aarni Koskela 2023-10-12 15:51:53 +09:00
  • b016596d90 server : add completion mode (no chat) (#3582) Aarni Koskela 2023-10-12 15:51:53 +09:00
  • 758d0ddfca prompts : add mnemonics.txt Georgi Gerganov 2023-10-12 09:35:19 +03:00
  • 6b3ae4da92 prompts : add mnemonics.txt Georgi Gerganov 2023-10-12 09:35:19 +03:00
  • a247c7af1c server : fix kv cache management (#3588) Georgi Gerganov 2023-10-12 09:29:04 +03:00
  • 57dd55e2c7 server : fix kv cache management (#3588) Georgi Gerganov 2023-10-12 09:29:04 +03:00
  • 47ae6b2fa3 main : fix session loading bug (#3400) Georgi Gerganov 2023-10-11 23:55:08 +03:00
  • b8fe4b5cc9 main : fix session loading bug (#3400) Georgi Gerganov 2023-10-11 23:55:08 +03:00
  • 132406fe03 server : add parameter -tb N, --threads-batch N (#3584) Michael Coppola 2023-10-11 15:42:22 -04:00
  • a8bdd65525 server : add parameter -tb N, --threads-batch N (#3584) Michael Coppola 2023-10-11 15:42:22 -04:00
  • ecd831a6b8 common : fix mirostat state when using multiple sequences (#3543) Kerfuffle 2023-10-11 13:35:46 -06:00
  • 70c29da118 common : fix mirostat state when using multiple sequences (#3543) Kerfuffle 2023-10-11 13:35:46 -06:00
  • f11fd81fbd batched : add bench tool (#3545) Georgi Gerganov 2023-10-11 21:25:33 +03:00
  • 8c70a5ff25 batched : add bench tool (#3545) Georgi Gerganov 2023-10-11 21:25:33 +03:00
  • dcdafa74c6 examples : add batched.swift + improve CI for swift (#3562) Zane Shannon 2023-10-11 04:14:05 -07:00
  • 24ba3d829e examples : add batched.swift + improve CI for swift (#3562) Zane Shannon 2023-10-11 04:14:05 -07:00
  • a637869df6 Add MPT model to supported models in README.md (#3574) Galunid 2023-10-11 01:02:49 +02:00
  • 9f6ede19f3 Add MPT model to supported models in README.md (#3574) Galunid 2023-10-11 01:02:49 +02:00
  • 4e6e75e98e Minor improvements in GPT2 tokenizer (#3567) goerch 2023-10-10 18:59:52 +02:00
  • 233fc1c69f Minor improvements in GPT2 tokenizer (#3567) goerch 2023-10-10 18:59:52 +02:00
  • 8994c485e9 readme : add bloom (#3570) Xingchen Song(宋星辰) 2023-10-11 00:28:50 +08:00
  • c5b49360d0 readme : add bloom (#3570) Xingchen Song(宋星辰) 2023-10-11 00:28:50 +08:00
  • 5f0a4ad1c2 llm : add bloom models (#3553) Xingchen Song(宋星辰) 2023-10-10 22:48:21 +08:00
  • 02d2875def llm : add bloom models (#3553) Xingchen Song(宋星辰) 2023-10-10 22:48:21 +08:00
  • d49c1c5b2d swift : improvements and fixes (#3564) Jhen-Jie Hong 2023-10-10 06:31:13 -05:00
  • 0aa6595ae0 swift : improvements and fixes (#3564) Jhen-Jie Hong 2023-10-10 06:31:13 -05:00
  • fe2f22f1e0 llm : add MPT support (#3417) Jan Ploski 2023-10-10 09:50:23 +02:00
  • f5f9121de1 llm : add MPT support (#3417) Jan Ploski 2023-10-10 09:50:23 +02:00
  • b1144203e3 infill. : fix tokenization (#3508) vvhg1 2023-10-10 09:31:21 +02:00
  • 11ea5c7d96 infill. : fix tokenization (#3508) vvhg1 2023-10-10 09:31:21 +02:00
  • ff8ee10bfa ggml-alloc : fix assert in debug builds (#3555) slaren 2023-10-09 14:44:58 +02:00
  • 95bd60a0a6 ggml-alloc : fix assert in debug builds (#3555) slaren 2023-10-09 14:44:58 +02:00
  • 2743064b15 refact : fix convert script + zero out KV cache to avoid nans (#3523) Georgi Gerganov 2023-10-09 14:32:17 +03:00
  • fcca0a7004 refact : fix convert script + zero out KV cache to avoid nans (#3523) Georgi Gerganov 2023-10-09 14:32:17 +03:00
  • 6ac45c3397 metal : do not use mul_mm kernels when ne00 < 64 (#3542) Georgi Gerganov 2023-10-09 14:28:27 +03:00
  • dcc09d2596 metal : do not use mul_mm kernels when ne00 < 64 (#3542) Georgi Gerganov 2023-10-09 14:28:27 +03:00
  • 78b3d9b796 sync : ggml (ggml-backend) (#3548) Georgi Gerganov 2023-10-08 20:19:14 +03:00
  • db3abcc114 sync : ggml (ggml-backend) (#3548) Georgi Gerganov 2023-10-08 20:19:14 +03:00
  • 6a0de063d6 ci : add Zig CI/CD and fix build (#2996) Matheus C. França 2023-10-08 10:59:20 -03:00
  • eee42c670e ci : add Zig CI/CD and fix build (#2996) Matheus C. França 2023-10-08 10:59:20 -03:00
  • 61bd777112 api_like_OAI.py : compat with Microsoft Guidance (#2746) Ryder Wishart 2023-10-08 03:55:58 -07:00
  • 8e6716a102 api_like_OAI.py : compat with Microsoft Guidance (#2746) Ryder Wishart 2023-10-08 03:55:58 -07:00
  • 4e58bb2f8d api_like_OAI.py : simplify function (#2796) arcrank 2023-10-08 06:52:57 -04:00
  • 9c38d181d4 api_like_OAI.py : simplify function (#2796) arcrank 2023-10-08 06:52:57 -04:00
  • 34388801da k-quants : fix comments about block sizing (#3499) Johannes Rudolph 2023-10-08 12:21:19 +02:00
  • a1202a31ed k-quants : fix comments about block sizing (#3499) Johannes Rudolph 2023-10-08 12:21:19 +02:00
  • cd058b6357 ci : enable on obj-c changes + fix metal build (#3540) Georgi Gerganov 2023-10-08 11:24:50 +03:00
  • 94e502dfb7 ci : enable on obj-c changes + fix metal build (#3540) Georgi Gerganov 2023-10-08 11:24:50 +03:00
  • f518e5c7e3 zig : fix build by introducing train.cpp (#3539) Luo Tian 2023-10-08 16:24:01 +08:00
  • 7d8b24932f zig : fix build by introducing train.cpp (#3539) Luo Tian 2023-10-08 16:24:01 +08:00
  • d596ae762a metal : support MTLGPUFamily < Apple7, formatting, style (#3524) Georgi Gerganov 2023-10-08 10:01:53 +03:00
  • b0ec5218c3 metal : support MTLGPUFamily < Apple7, formatting, style (#3524) Georgi Gerganov 2023-10-08 10:01:53 +03:00
  • 418c7c4e56 llama : fix missing break in Persimmon arch case statements (#3535) Kerfuffle 2023-10-07 23:22:17 -06:00
  • 63d3b06a43 llama : fix missing break in Persimmon arch case statements (#3535) Kerfuffle 2023-10-07 23:22:17 -06:00
  • 7b49ee2537 Fix trying to strip newline from empty prompt and cfg prompt file content (#3534) Kerfuffle 2023-10-07 15:31:41 -06:00
  • a16e89cec8 Fix trying to strip newline from empty prompt and cfg prompt file content (#3534) Kerfuffle 2023-10-07 15:31:41 -06:00
  • fb1d64727e gguf.py : fix CI for publishing GGUF package (#3532) M. Yusuf Sarıgöz 2023-10-07 22:14:10 +03:00
  • 4d03833211 gguf.py : fix CI for publishing GGUF package (#3532) M. Yusuf Sarıgöz 2023-10-07 22:14:10 +03:00
  • a802debeb6 py : change version of numpy requirement to 1.24.4 (#3515) Tom C 2023-10-07 02:56:15 -07:00
  • c47066d833 py : change version of numpy requirement to 1.24.4 (#3515) Tom C 2023-10-07 02:56:15 -07:00
  • 80735cb7bd quantize : fail fast on write errors (#3521) cebtenzzre 2023-10-07 04:41:52 -04:00
  • f1782c68de quantize : fail fast on write errors (#3521) cebtenzzre 2023-10-07 04:41:52 -04:00
  • bff47ce69b metal : support default.metallib load & reuse code for swift package (#3522) Jhen-Jie Hong 2023-10-07 03:40:27 -05:00
  • c26765a0a1 metal : support default.metallib load & reuse code for swift package (#3522) Jhen-Jie Hong 2023-10-07 03:40:27 -05:00
  • afaaf1849d llm : support Adept Persimmon 8B (#3410) Phillip Kravtsov 2023-10-07 00:12:43 -07:00
  • 0e797c2fc5 llm : support Adept Persimmon 8B (#3410) Phillip Kravtsov 2023-10-07 00:12:43 -07:00