Commit Graph

  • f2098016d1 gguf-py : export chat templates (#4125) slaren 2023-11-19 11:10:52 +01:00
  • e74f9765fd tokenize example: Respect normal add BOS token behavior (#4126) Kerfuffle 2023-11-18 14:48:17 -07:00
  • 5c0f1b36df scripts : Remove missed baichuan convert script (#4127) Galunid 2023-11-18 21:08:33 +01:00
  • fbaceac7fb Clean up ggml-cuda.cu warnings when compiling with clang (for ROCM) (#4124) Kerfuffle 2023-11-18 08:11:18 -07:00
  • 0313e81917 llama : increase max nodes (#4115) slaren 2023-11-17 20:39:11 +01:00
  • 30c43147f4 build : support ppc64le build for make and CMake (#3963) Roger Meier 2023-11-17 17:11:23 +01:00
  • ab62ba9cc4 tokenize : fix trailing whitespace Georgi Gerganov 2023-11-17 18:01:38 +02:00
  • 79bfa9dded examples : add tokenize (#4039) zakkor 2023-11-17 17:36:44 +02:00
  • db3fa2c49f convert : use 'model' value if it exists. This allows karpathy/tinyllamas to load (#4089) Don Mahurin 2023-11-17 07:32:34 -08:00
  • 7a081b9438 py : Falcon HF compatibility (#4104) John 2023-11-17 16:24:30 +01:00
  • 6d2c56076d common : improve yaml log escaping (#4080) Jannis Schönleber 2023-11-17 16:24:07 +01:00
  • 30239ff94e llava : fix compilation warning that fread return value is not used (#4069) Huawei Lin 2023-11-17 10:22:56 -05:00
  • 0ab76040dc py : remove superfluous import statements (#4076) Jiří Podivín 2023-11-17 16:20:53 +01:00
  • 4bd75da289 train : move number of gpu layers argument parsing to common/train.cpp (#4074) Jiří Podivín 2023-11-17 16:19:16 +01:00
  • 1c64c8d28f llama : add functions to get the model's metadata (#4013) slaren 2023-11-17 16:17:37 +01:00
  • 07974a89cb finetune : speed-up ggml_compute_forward_out_prod_f32 via BLAS (#4079) gwjr 2023-11-17 14:48:19 +00:00
  • d1998bf20e finetune : zero the loraB initial vectors (#4082) Andrew Godfrey 2023-11-17 02:23:11 -08:00
  • bc94b4e114 cuda : get_row_rounding F32 (#4095) Andrew Godfrey 2023-11-17 00:01:15 -08:00
  • d52635a83a llama : fix data units (#4101) Georgi Gerganov 2023-11-17 10:00:15 +02:00
  • 9b2223dc12 Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) Kerfuffle 2023-11-16 19:14:37 -07:00
  • da7ac34c82 gguf : fix potential infinite loops while parsing (#4100) texmex76 2023-11-16 16:01:48 +01:00
  • 239225dee6 llama : restore prefix space in llama tokenizer (#4081) Jared Van Bortel 2023-11-15 11:34:47 -05:00
  • 39368e2a57 ggml-cuda : increase max graph size (#4084) slaren 2023-11-15 13:58:13 +01:00
  • 98080efe1d Fix MacOS Sonoma model quantization (#4052) Michael Potter 2023-11-14 09:34:41 -08:00
  • d200fc170a stablelm : StableLM support (#3586) Galunid 2023-11-14 11:17:12 +01:00
  • 74458682c1 convert.py: also look for plain model.safetensors (#4043) afrideva 2023-11-13 17:03:40 -08:00
  • a6a3e84ee6 llava : fix regression for square images in #3613 (#4056) M. Yusuf Sarıgöz 2023-11-13 18:20:52 +03:00
  • 5c0b825f99 ggml : sync (im2col, GPU conv, 32-bit arm compat) (#4060) Georgi Gerganov 2023-11-13 16:55:52 +02:00
  • 5940637098 readme : update hot topics Georgi Gerganov 2023-11-13 14:18:08 +02:00
  • ddea57dbe3 sync : ggml (backend v2) (#3912) Georgi Gerganov 2023-11-13 14:16:23 +02:00
  • 14f28a6b67 Add ReLU and SQR CUDA ops to (partially) fix Persimmon offloading (#4041) Kerfuffle 2023-11-13 01:58:15 -07:00
  • beec62fef1 gguf-py: gguf_writer: Use bytearray to build metadata (#4051) Kerfuffle 2023-11-12 16:39:37 -07:00
  • a05fccf374 Fix some documentation typos/grammar mistakes (#4032) Richard Kiss 2023-11-11 22:04:58 -08:00
  • 23591d255f Fix gguf-convert-endian script (#4037) M. Yusuf Sarıgöz 2023-11-11 18:35:31 +03:00
  • 3bc75f09cf server : fix crash when prompt exceeds context size (#3996) Alexey Parfenov 2023-11-11 05:48:21 +00:00
  • 1807642822 gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981) Kerfuffle 2023-11-10 22:04:50 -07:00
  • 3d2c04640a server : allow continue edit on completion mode (#3950) Jhen-Jie Hong 2023-11-11 06:49:33 +08:00
  • 192c083d4d Unbreak persimmon after #3837 (#4010) Galunid 2023-11-10 14:24:54 +01:00
  • 96c505aad9 scripts: Generalize convert scripts (#3838) Galunid 2023-11-09 11:09:29 +01:00
  • a87e007c54 server : add min_p param (#3877) Mihai 2023-11-09 04:00:34 +02:00
  • a137c38e78 ggml-alloc : fix backend assignments of views (#3982) slaren 2023-11-08 13:15:14 +01:00
  • 63cc16434a gguf : track writer state, free unneeded tensors, cleanup (#3871) Jared Van Bortel 2023-11-07 12:43:04 -05:00
  • 982183a4ec make : do not add linker flags when compiling static llava lib (#3977) Georgi Gerganov 2023-11-07 19:25:32 +02:00
  • 8677eb0f72 ggml : fix backward rope after YaRN (#3974) xaedes 2023-11-07 09:04:51 +01:00
  • 665d8aca4f Use params when loading models in llava-cli (#3976) Matthew Tejo 2023-11-06 23:43:59 -08:00
  • 9d37264c91 cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946) Meng Zhang 2023-11-06 22:49:08 -08:00
  • 8e4706759b llava : expose as a shared library for downstream projects (#3613) Damian Stewart 2023-11-06 22:36:23 +01:00
  • fde0884e08 ggml-cuda : fix f16 mul mat (#3961) slaren 2023-11-05 18:45:16 +01:00
  • f8a8f3e839 Allow common process_escapes to handle \x sequences (#3928) Kerfuffle 2023-11-05 10:06:06 -07:00
  • 7d96cd650a server : fix typo for --alias shortcut from -m to -a (#3958) Thái Hoàng Tâm 2023-11-05 23:15:27 +07:00