Commit Graph

  • 728d05343e Fix for #3454 (#3455) goerch 2023-10-07 06:57:01 +02:00
  • 3a716b4dae Fix for #3454 (#3455) goerch 2023-10-07 06:57:01 +02:00
  • 3226b5d74b readme : update models, cuda + ppl instructions (#3510) BarfingLemurs 2023-10-06 15:13:36 -04:00
  • 1faaae8c2b readme : update models, cuda + ppl instructions (#3510) BarfingLemurs 2023-10-06 15:13:36 -04:00
  • 50441e1961 server : docs fix default values and add n_probs (#3506) Mihai 2023-10-06 21:39:33 +03:00
  • cb13d73a72 server : docs fix default values and add n_probs (#3506) Mihai 2023-10-06 21:39:33 +03:00
  • b0465289f6 kv cache slot search improvements (#3493) Kerfuffle 2023-10-06 10:10:13 -06:00
  • 9ca79d5cbb kv cache slot search improvements (#3493) Kerfuffle 2023-10-06 10:10:13 -06:00
  • e4e3da7e61 prompts : fix editorconfig checks after #3416 Georgi Gerganov 2023-10-06 16:35:55 +03:00
  • 0c731ca403 prompts : fix editorconfig checks after #3416 Georgi Gerganov 2023-10-06 16:35:55 +03:00
  • d41ac162c7 parallel : add option to load external prompt file (#3416) pudepiedj 2023-10-06 14:16:38 +01:00
  • a8777ad84e parallel : add option to load external prompt file (#3416) pudepiedj 2023-10-06 14:16:38 +01:00
  • a658494d40 server : reuse llama_sample_token common util (#3494) Jhen-Jie Hong 2023-10-06 07:44:24 -05:00
  • 97af49fa39 server : reuse llama_sample_token common util (#3494) Jhen-Jie Hong 2023-10-06 07:44:24 -05:00
  • 1ad086601d llama : correct hparams comparison (#3446) l3utterfly 2023-10-06 18:47:59 +08:00
  • 16820a5a0d llama : correct hparams comparison (#3446) l3utterfly 2023-10-06 18:47:59 +08:00
  • c5d50453ad ci : fix xcodebuild destinations (#3491) Jhen-Jie Hong 2023-10-06 05:36:43 -05:00
  • 04b2f4386e ci : fix xcodebuild destinations (#3491) Jhen-Jie Hong 2023-10-06 05:36:43 -05:00
  • a45b85f5e3 convert : update Falcon script for new HF config (#3448) cebtenzzre 2023-10-05 15:00:34 -04:00
  • 48edda30ee convert : update Falcon script for new HF config (#3448) cebtenzzre 2023-10-05 15:00:34 -04:00
  • 223cf64ced build : use std::make_tuple() for compatibility with older GCC versions (#3488) Kenvix ⭐ 2023-10-06 01:16:39 +08:00
  • 45eba9369f build : use std::make_tuple() for compatibility with older GCC versions (#3488) Kenvix ⭐ 2023-10-06 01:16:39 +08:00
  • 28b8839be2 common : process escape sequences in reverse prompts (#3461) staviq 2023-10-05 18:17:29 +02:00
  • acec9eaaa9 common : process escape sequences in reverse prompts (#3461) staviq 2023-10-05 18:17:29 +02:00
  • 0cb7a718e6 CLBlast: Fix handling of on-device tensor data shibe2 2023-10-05 15:57:03 +04:00
  • e2583cbc29 CLBlast: Fix handling of on-device tensor data shibe2 2023-10-05 15:57:03 +04:00
  • 8c2bfb2501 server : fix incorrect num_tokens_predicted (#3480) Jhen-Jie Hong 2023-10-05 09:02:55 -05:00
  • e8b8d32e86 server : fix incorrect num_tokens_predicted (#3480) Jhen-Jie Hong 2023-10-05 09:02:55 -05:00
  • 594db7b27d swift : disable ACCELERATE_NEW_LAPACK (#3481) Jhen-Jie Hong 2023-10-05 09:00:07 -05:00
  • 8f3a642ec1 swift : disable ACCELERATE_NEW_LAPACK (#3481) Jhen-Jie Hong 2023-10-05 09:00:07 -05:00
  • 9a72ae1535 ci : add swift build via xcodebuild (#3482) Jhen-Jie Hong 2023-10-05 08:56:21 -05:00
  • 0745384449 ci : add swift build via xcodebuild (#3482) Jhen-Jie Hong 2023-10-05 08:56:21 -05:00
  • e3b9719f82 convert : fix Baichuan2 models by using vocab size in config.json (#3299) Kerfuffle 2023-10-04 08:20:28 -06:00
  • 019ba1dcd0 convert : fix Baichuan2 models by using vocab size in config.json (#3299) Kerfuffle 2023-10-04 08:20:28 -06:00
  • 1ded9d4793 readme : add project status link Georgi Gerganov 2023-10-04 16:50:44 +03:00
  • beabc8cfb0 readme : add project status link Georgi Gerganov 2023-10-04 16:50:44 +03:00
  • 1de48d3890 ggml : fix build after #3329 Georgi Gerganov 2023-10-04 16:25:41 +03:00
  • 0d152b37fe ggml : fix build after #3329 Georgi Gerganov 2023-10-04 16:25:41 +03:00
  • 90ee86e51d llm : add Refact model (#3329) ds5t5 2023-10-04 06:23:39 -07:00
  • f8c90cdbaa llm : add Refact model (#3329) ds5t5 2023-10-04 06:23:39 -07:00
  • 8c5a364d76 sync : ggml (conv 1d + 2d updates, UB fixes) (#3468) Georgi Gerganov 2023-10-04 15:29:58 +03:00
  • f93af02488 sync : ggml (conv 1d + 2d updates, UB fixes) (#3468) Georgi Gerganov 2023-10-04 15:29:58 +03:00
  • db43cc136f finetune : readme fix typo (#3465) Merrick Christensen 2023-10-04 00:33:13 -06:00
  • f72f8f22c9 finetune : readme fix typo (#3465) Merrick Christensen 2023-10-04 00:33:13 -06:00
  • 28524835b4 ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (#3453) Tameem 2023-10-03 23:38:19 +05:00
  • 79f34abddb ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (#3453) Tameem 2023-10-03 23:38:19 +05:00
  • 7d70c61e48 main : consistent prefix/suffix coloring (#3425) h-h-h-h 2023-10-03 20:16:15 +02:00
  • 8186242b6d main : consistent prefix/suffix coloring (#3425) h-h-h-h 2023-10-03 20:16:15 +02:00
  • 28eba94226 llama : fix session saving/loading (#3400) Georgi Gerganov 2023-10-03 21:04:01 +03:00
  • ac2219fef3 llama : fix session saving/loading (#3400) Georgi Gerganov 2023-10-03 21:04:01 +03:00
  • 96d117b3d0 llama : expose model's rope_freq_scale in the API (#3418) Alex Klinkhamer 2023-10-03 10:09:28 -07:00
  • 48be797ffb llama : expose model's rope_freq_scale in the API (#3418) Alex Klinkhamer 2023-10-03 10:09:28 -07:00
  • ae78fc4959 metal : alibi for arbitrary number of heads (#3426) Jiahao Li 2023-10-04 00:55:21 +08:00
  • f56e1baec3 metal : alibi for arbitrary number of heads (#3426) Jiahao Li 2023-10-04 00:55:21 +08:00
  • 482d162480 cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273) Eve 2023-10-03 16:53:15 +00:00
  • 017efe899d cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273) Eve 2023-10-03 16:53:15 +00:00
  • b1f53e2251 Work on the BPE tokenizer (#3252) goerch 2023-10-03 09:16:26 +02:00
  • ff5a3f0c09 Work on the BPE tokenizer (#3252) goerch 2023-10-03 09:16:26 +02:00
  • c988064d59 convert : fix vocab size when not defined in hparams (#3421) cebtenzzre 2023-10-02 18:07:24 -04:00
  • 1c84003c08 convert : fix vocab size when not defined in hparams (#3421) cebtenzzre 2023-10-02 18:07:24 -04:00
  • b404cd317d cmake : increase minimum version for add_link_options (#3444) cebtenzzre 2023-10-02 15:38:43 -04:00
  • e78f0b0d05 cmake : increase minimum version for add_link_options (#3444) cebtenzzre 2023-10-02 15:38:43 -04:00
  • 4103006018 CLBlast: Add broadcast support for matrix multiplication (#3402) shibe2 2023-10-02 23:26:15 +04:00
  • 665018c749 CLBlast: Add broadcast support for matrix multiplication (#3402) shibe2 2023-10-02 23:26:15 +04:00
  • 63d2b805d6 gguf : add BERT, MPT, and GPT-J arch info (#3408) cebtenzzre 2023-10-02 15:20:28 -04:00
  • 29a404a951 gguf : add BERT, MPT, and GPT-J arch info (#3408) cebtenzzre 2023-10-02 15:20:28 -04:00
  • 74688d7042 gguf : general usability improvements (#3409) cebtenzzre 2023-10-02 14:58:46 -04:00
  • 0fe321031a gguf : general usability improvements (#3409) cebtenzzre 2023-10-02 14:58:46 -04:00
  • 4f46a163c0 cmake : make CUDA flags more similar to the Makefile (#3420) cebtenzzre 2023-10-02 09:16:50 -04:00
  • 9476b01226 cmake : make CUDA flags more similar to the Makefile (#3420) cebtenzzre 2023-10-02 09:16:50 -04:00
  • 1d8be9d108 finetune : fix #3404 (#3437) xaedes 2023-10-02 15:15:45 +02:00
  • a03ce38455 finetune : fix #3404 (#3437) xaedes 2023-10-02 15:15:45 +02:00
  • be2cbb3651 metal : set log callback before initializing (#3427) Adrian 2023-10-02 03:49:59 -07:00
  • a847676984 metal : set log callback before initializing (#3427) Adrian 2023-10-02 03:49:59 -07:00
  • f74c19d38f cmake : fix transient definitions in find pkg (#3411) bandoti 2023-10-02 06:51:49 -03:00
  • 095231dfd3 cmake : fix transient definitions in find pkg (#3411) bandoti 2023-10-02 06:51:49 -03:00
  • f71145524c docker : ignore Git files (#3314) Kevin Ji 2023-10-02 04:53:53 -04:00
  • ea55295a74 docker : ignore Git files (#3314) Kevin Ji 2023-10-02 04:53:53 -04:00
  • 90ea1e92a7 infill : add new example + extend server API (#3296) vvhg1 2023-10-02 09:42:02 +02:00
  • c97f01c362 infill : add new example + extend server API (#3296) vvhg1 2023-10-02 09:42:02 +02:00
  • 30671dbce8 ggml-cuda : perform cublas mat mul of quantized types as f16 (#3412) slaren 2023-09-30 18:12:57 +02:00
  • f5ef5cfb18 ggml-cuda : perform cublas mat mul of quantized types as f16 (#3412) slaren 2023-09-30 18:12:57 +02:00
  • a18aa627fa llama.cpp : add documentation about rope_freq_base and scale values (#3401) slaren 2023-09-29 18:42:32 +02:00
  • 40e07a60f9 llama.cpp : add documentation about rope_freq_base and scale values (#3401) slaren 2023-09-29 18:42:32 +02:00
  • 635a2d7eb3 train : fix KQ_pos allocation (#3392) Georgi Gerganov 2023-09-29 19:05:18 +03:00
  • bc34dd4f5b train : fix KQ_pos allocation (#3392) Georgi Gerganov 2023-09-29 19:05:18 +03:00
  • 45896b0dc4 llama : quantize up to 31% faster on Linux and Windows with mmap (#3206) Cebtenzzre 2023-09-29 09:48:45 -04:00
  • 2777a84be4 llama : quantize up to 31% faster on Linux and Windows with mmap (#3206) Cebtenzzre 2023-09-29 09:48:45 -04:00
  • 6706639c45 readme : update hot topics + model links (#3399) BarfingLemurs 2023-09-29 08:50:35 -04:00
  • 0a4a4a0982 readme : update hot topics + model links (#3399) BarfingLemurs 2023-09-29 08:50:35 -04:00
  • 93527803e3 readme : add link to grammars app (#3388) Andrew Duffy 2023-09-29 07:15:57 -04:00
  • 569550df20 readme : add link to grammars app (#3388) Andrew Duffy 2023-09-29 07:15:57 -04:00
  • f93da61e4c swift : fix build on xcode 15 (#3387) Jhen-Jie Hong 2023-09-29 13:25:13 +08:00
  • c71bf2c45c swift : fix build on xcode 15 (#3387) Jhen-Jie Hong 2023-09-29 13:25:13 +08:00
  • 4cc4f84aea build : enable more non-default compiler warnings (#3200) Cebtenzzre 2023-09-28 17:41:44 -04:00
  • bc39553c90 build : enable more non-default compiler warnings (#3200) Cebtenzzre 2023-09-28 17:41:44 -04:00
  • 317196093a ggml_tensor: update the structure comments. (#3283) Hua Jiang 2023-09-28 13:06:18 -07:00
  • 0ccfc62a96 ggml_tensor: update the structure comments. (#3283) Hua Jiang 2023-09-28 13:06:18 -07:00
  • 9ccc98236f ggml : release the requested thread pool resource (#3292) Qu Zongfu 2023-09-29 03:51:52 +08:00
  • 7f1a0fe709 ggml : release the requested thread pool resource (#3292) Qu Zongfu 2023-09-29 03:51:52 +08:00