Commit Graph

  • d5a410e855 CUDA: fixed redundant value dequantization (#4809) Johannes Gäßler 2024-01-07 17:24:08 +01:00
  • ec08b3e86f llama : remove unused vars (#4796) Georgi Gerganov 2024-01-07 14:29:36 +02:00
  • 9dede37d81 llama : remove unused vars (#4796) Georgi Gerganov 2024-01-07 14:29:36 +02:00
  • 3a96073b59 llama : remove redundant GQA check (#4796) Georgi Gerganov 2024-01-07 11:21:53 +02:00
  • 3c36213df8 llama : remove redundant GQA check (#4796) Georgi Gerganov 2024-01-07 11:21:53 +02:00
  • 30df691a96 llama.swiftui : use llama.cpp as SPM package (#4804) Alex Azarov 2024-01-07 09:20:50 +01:00
  • 72d8407b36 llama.swiftui : use llama.cpp as SPM package (#4804) Alex Azarov 2024-01-07 09:20:50 +01:00
  • 52b664aece llama : print tensor meta for debugging Georgi Gerganov 2024-01-07 09:50:31 +02:00
  • d117d4dc5d llama : print tensor meta for debugging Georgi Gerganov 2024-01-07 09:50:31 +02:00
  • 8c36aaf5a8 llama.swiftui : add visionOS target (#4805) Alex Azarov 2024-01-07 08:46:55 +01:00
  • 3418c03ecc llama.swiftui : add visionOS target (#4805) Alex Azarov 2024-01-07 08:46:55 +01:00
  • 5391345fcc ggml : use __builtin_amdgcn_sudot4 in __dp4a for gfx11 (#4787) Konstantin Zhuravlyov 2024-01-07 01:52:42 -05:00
  • 63ee677efd ggml : use __builtin_amdgcn_sudot4 in __dp4a for gfx11 (#4787) Konstantin Zhuravlyov 2024-01-07 01:52:42 -05:00
  • 003f85d7ea server : fix n_predict check (#4798) Georgi Gerganov 2024-01-07 08:45:26 +02:00
  • 67984921a7 server : fix n_predict check (#4798) Georgi Gerganov 2024-01-07 08:45:26 +02:00
  • 34d18eff4c llama.swiftui : use correct pointer for llama_token_eos (#4797) Daniel Illescas Romero 2024-01-06 16:12:59 +01:00
  • c75ca5d96f llama.swiftui : use correct pointer for llama_token_eos (#4797) Daniel Illescas Romero 2024-01-06 16:12:59 +01:00
  • 33c9d849fd examples : improve base-translate.sh script (#4783) Georgi Gerganov 2024-01-06 11:40:24 +02:00
  • 96e80dabc6 examples : improve base-translate.sh script (#4783) Georgi Gerganov 2024-01-06 11:40:24 +02:00
  • b52357162d cmake : check for openblas64 (#4134) a-n-n-a-l-e-e 2024-01-05 08:04:40 -08:00
  • eec22a1c63 cmake : check for openblas64 (#4134) a-n-n-a-l-e-e 2024-01-05 08:04:40 -08:00
  • f4ee045ad0 flake.nix : fix typo (#4700) Ikko Eltociear Ashimine 2024-01-06 01:02:44 +09:00
  • be36bb946a flake.nix : fix typo (#4700) Ikko Eltociear Ashimine 2024-01-06 01:02:44 +09:00
  • 7e27e37f26 metal : switch back to default.metallib (ggml/681) Georgi Gerganov 2024-01-05 16:30:52 +02:00
  • 91d38876df metal : switch back to default.metallib (ggml/681) Georgi Gerganov 2024-01-05 16:30:52 +02:00
  • d6ec7cfc70 ggml : fix q2_k bpw in comments (ggml/680) Georgi Gerganov 2024-01-05 15:36:04 +02:00
  • d061bf9405 ggml : fix q2_k bpw in comments (ggml/680) Georgi Gerganov 2024-01-05 15:36:04 +02:00
  • 0630261a48 ggml : add error handling to graph_compute (whisper/1714) Finn Voorhees 2024-01-03 08:39:43 -05:00
  • 1bf681f90e ggml : add error handling to graph_compute (whisper/1714) Finn Voorhees 2024-01-03 08:39:43 -05:00
  • 5ffddb870b ggml : do not sched_yield when calling BLAS (#4761) Georgi Gerganov 2024-01-05 15:18:21 +02:00
  • c1d7cb28d3 ggml : do not sched_yield when calling BLAS (#4761) Georgi Gerganov 2024-01-05 15:18:21 +02:00
  • 41ced5ce3c examples : add few-shot translation example (#4783) Georgi Gerganov 2024-01-05 15:11:10 +02:00
  • 3681f22443 examples : add few-shot translation example (#4783) Georgi Gerganov 2024-01-05 15:11:10 +02:00
  • 0c4cb7138c finetune : remove unused includes (#4756) Daniel Bevenius 2024-01-04 20:45:37 +01:00
  • b3a7c20b5c finetune : remove unused includes (#4756) Daniel Bevenius 2024-01-04 20:45:37 +01:00
  • 82e82f484d server : send token probs for "stream == false" (#4714) Georgi Gerganov 2024-01-04 19:56:33 +02:00
  • 012cf349ae server : send token probs for "stream == false" (#4714) Georgi Gerganov 2024-01-04 19:56:33 +02:00
  • b0a9bb90f9 Print backend name on test-backend-ops failure (#4751) Johannes Gäßler 2024-01-04 09:43:23 +01:00
  • a91928014f Print backend name on test-backend-ops failure (#4751) Johannes Gäßler 2024-01-04 09:43:23 +01:00
  • 2d08e99f47 llama.swiftui : support loading custom model from file picker (#4767) singularity 2024-01-04 16:22:38 +08:00
  • 3c0b585561 llama.swiftui : support loading custom model from file picker (#4767) singularity 2024-01-04 16:22:38 +08:00
  • 85648efa9e server : fix options in README.md (#4765) Michael Coppola 2024-01-04 03:17:09 -05:00
  • e5804313a1 server : fix options in README.md (#4765) Michael Coppola 2024-01-04 03:17:09 -05:00
  • 7967c42ffb ggml : include stdlib.h before intrin.h (#4736) Georgi Gerganov 2024-01-04 10:12:26 +02:00
  • dc891b7f7a ggml : include stdlib.h before intrin.h (#4736) Georgi Gerganov 2024-01-04 10:12:26 +02:00
  • c399a87c6b llama.swiftui : fix build of ggml.metallib (#4754) singularity 2024-01-04 15:58:16 +08:00
  • 46cea79e1f llama.swiftui : fix build of ggml.metallib (#4754) singularity 2024-01-04 15:58:16 +08:00
  • 41a287de3c train : fix typo in overlapping-samples help msg (#4758) Daniel Bevenius 2024-01-03 18:53:40 +01:00
  • cb1e2818e0 train : fix typo in overlapping-samples help msg (#4758) Daniel Bevenius 2024-01-03 18:53:40 +01:00
  • 59092ff962 swift : update Package.swift to use ggml as dependency (#4691) Ashraful Islam 2024-01-03 11:30:02 -06:00
  • ece9a45e8f swift : update Package.swift to use ggml as dependency (#4691) Ashraful Islam 2024-01-03 11:30:02 -06:00
  • f2001ff46d cuda : simplify expression Georgi Gerganov 2024-01-03 14:18:46 +02:00
  • 7bed7eba35 cuda : simplify expression Georgi Gerganov 2024-01-03 14:18:46 +02:00
  • 09d890cb54 cuda : mark I16 and I32 ops as unsupported Georgi Gerganov 2024-01-03 13:01:44 +02:00
  • d55356d3ba cuda : mark I16 and I32 ops as unsupported Georgi Gerganov 2024-01-03 13:01:44 +02:00
  • 4ebea0bdce sync : ggml Georgi Gerganov 2024-01-03 11:37:44 +02:00
  • 75e3fd8581 sync : ggml Georgi Gerganov 2024-01-03 11:37:44 +02:00
  • 514561978d metal : add kernel_get_rows_i32 Georgi Gerganov 2024-01-03 11:35:46 +02:00
  • 289313716f metal : add kernel_get_rows_i32 Georgi Gerganov 2024-01-03 11:35:46 +02:00
  • 74b4d9c1ed scripts : fix sync order + metal sed Georgi Gerganov 2024-01-03 11:25:54 +02:00
  • ab62fc3e55 scripts : fix sync order + metal sed Georgi Gerganov 2024-01-03 11:25:54 +02:00
  • b2cfdd2ea3 ggml : extend ggml_get_rows, ggml_repeat, ggml_concat (ggml/639) Guillaume Wenzek 2023-12-29 18:07:03 +01:00
  • 5f66ebca9c ggml : extend ggml_get_rows, ggml_repeat, ggml_concat (ggml/639) Guillaume Wenzek 2023-12-29 18:07:03 +01:00
  • 5b56760f5c server : throw an error when slot unavailable (#4741) Justin Parker 2024-01-03 03:43:19 -05:00
  • f2eb19bd8b server : throw an error when slot unavailable (#4741) Justin Parker 2024-01-03 03:43:19 -05:00
  • dc7752d269 metal : optimize ggml_mul_mat_id (faster Mixtral PP) (#4725) Georgi Gerganov 2024-01-02 21:07:47 +02:00
  • f3f62f0d83 metal : optimize ggml_mul_mat_id (faster Mixtral PP) (#4725) Georgi Gerganov 2024-01-02 21:07:47 +02:00
  • 421b0da133 server : add token counts to html footer (#4738) Phil H 2024-01-02 15:48:49 +00:00
  • 0ef3ca2ac6 server : add token counts to html footer (#4738) Phil H 2024-01-02 15:48:49 +00:00
  • af83cacf1e llama : llama_model_desc print number of experts Georgi Gerganov 2024-01-02 16:26:45 +02:00
  • 540938f890 llama : llama_model_desc print number of experts Georgi Gerganov 2024-01-02 16:26:45 +02:00
  • 7ea2965198 llama : replace all API facing int's with int32_t (#4577) Marcus Dunn 2024-01-02 06:15:16 -08:00
  • 0040d42eeb llama : replace all API facing int's with int32_t (#4577) Marcus Dunn 2024-01-02 06:15:16 -08:00
  • 1081e7c69c llama : differentiate the KV dims in the attention (#4657) postmasters 2024-01-02 03:51:28 -08:00
  • 83e633c27e llama : differentiate the KV dims in the attention (#4657) postmasters 2024-01-02 03:51:28 -08:00
  • 8243feab46 editorconfig : fix whitespace and indentation #4710 Georgi Gerganov 2024-01-02 13:28:15 +02:00
  • 32866c5edd editorconfig : fix whitespace and indentation #4710 Georgi Gerganov 2024-01-02 13:28:15 +02:00
  • 37b6fbf892 server : add --override-kv parameter (#4710) minarchist 2024-01-02 04:38:15 -06:00
  • 5d7002d437 server : add --override-kv parameter (#4710) minarchist 2024-01-02 04:38:15 -06:00
  • b8646c035d py : re-enable mmap in convert hf (#4732) Nam D. Tran 2024-01-02 16:23:38 +07:00
  • 26f3071d71 py : re-enable mmap in convert hf (#4732) Nam D. Tran 2024-01-02 16:23:38 +07:00
  • ffcf2ca432 finetune: fix typo in README.md (#4733) Daniel Bevenius 2024-01-02 10:16:55 +01:00
  • 775ac8712a finetune: fix typo in README.md (#4733) Daniel Bevenius 2024-01-02 10:16:55 +01:00
  • 64ec26ed76 metal : enable shader debugging (cmake option) (#4705) Georgi Gerganov 2024-01-02 10:57:44 +02:00
  • 58ba655af0 metal : enable shader debugging (cmake option) (#4705) Georgi Gerganov 2024-01-02 10:57:44 +02:00
  • 80f197bec8 flake.lock: update Someone Serge 2023-12-31 17:42:22 +00:00
  • edd1ab7bc3 flake.lock: update Someone Serge 2023-12-31 17:42:22 +00:00
  • f0542c5698 flake.nix: suggest the binary caches Someone Serge 2023-12-30 18:25:25 +00:00
  • 198ed7ebfc flake.nix: suggest the binary caches Someone Serge 2023-12-30 18:25:25 +00:00
  • 5c68d6471c workflows: nix-ci: add a qemu job for jetsons Someone Serge 2023-12-30 18:01:07 +00:00
  • d836174731 workflows: nix-ci: add a qemu job for jetsons Someone Serge 2023-12-30 18:01:07 +00:00
  • a68094145d workflows: nix-flakestry: drop tag filters Someone Serge 2023-12-30 17:36:08 +00:00
  • 06f2a5d190 workflows: nix-flakestry: drop tag filters Someone Serge 2023-12-30 17:36:08 +00:00
  • f44604b11b workflows: weekly nix flake update Someone Serge 2023-12-30 16:38:36 +00:00
  • c5239944ba workflows: weekly nix flake update Someone Serge 2023-12-30 16:38:36 +00:00
  • 0004e10652 workflows: nix-ci: add a job for eval Someone Serge 2023-12-30 17:19:11 +00:00
  • 1e9ae54cf2 workflows: nix-ci: add a job for eval Someone Serge 2023-12-30 17:19:11 +00:00
  • 4be77e38c3 workflows: nix-ci: init; build flake outputs Someone Serge 2023-12-26 19:17:26 +00:00
  • 7adedecbe3 workflows: nix-ci: init; build flake outputs Someone Serge 2023-12-26 19:17:26 +00:00
  • d01cd58a5f flake.nix: expose checks Someone Serge 2023-12-29 16:21:50 +00:00