Commit Graph

  • 2d4fee2312 Remove the no longer used iq1bn_grid_u16 Kawrakow 2024-07-17 10:16:50 +03:00
  • febb8bbea0 Remove the no longer used iq1bn_grid_u16 Iwan Kawrakow 2024-07-17 10:16:50 +03:00
  • 0194639b6b iq1bn: adjust scalar dot product and some cleanup Kawrakow 2024-07-17 08:44:46 +02:00
  • ba00f23ea1 iq1bn: adjust scalar dot product and some cleanup Iwan Kawrakow 2024-07-17 08:44:46 +02:00
  • 2881bdf220 iq1bn(no lookup): better version Kawrakow 2024-07-17 08:54:11 +03:00
  • 873a790ee2 iq1bn(no lookup): better version Iwan Kawrakow 2024-07-17 08:54:11 +03:00
  • d84748b71b iq1bn(no lookup): Metal Kawrakow 2024-07-16 09:12:15 +02:00
  • 52a25e307c iq1bn(no lookup): Metal Iwan Kawrakow 2024-07-16 09:12:15 +02:00
  • d0f9d146b8 iq1bn(no lookup): NEON attempts Kawrakow 2024-07-16 08:32:15 +02:00
  • 6393e26827 iq1bn(no lookup): NEON attempts Iwan Kawrakow 2024-07-16 08:32:15 +02:00
  • 597ea12970 iq1bn(no lookup): NEON Kawrakow 2024-07-15 20:40:14 +02:00
  • 26a1a689c6 iq1bn(no lookup): NEON Iwan Kawrakow 2024-07-15 20:40:14 +02:00
  • cd8fffc3cd iq1bn(no lookup): CUDA Kawrakow 2024-07-15 19:56:51 +03:00
  • ef39ca6a2c iq1bn(no lookup): CUDA Iwan Kawrakow 2024-07-15 19:56:51 +03:00
  • 1f3dbbcc19 iq1bn(no lookup): somewhat better Kawrakow 2024-07-15 13:46:07 +03:00
  • e4dc3babb5 iq1bn(no lookup): somewhat better Iwan Kawrakow 2024-07-15 13:46:07 +03:00
  • 98be184c23 iq1bn: attempt without a lookup table Kawrakow 2024-07-15 11:02:41 +03:00
  • a4bbd36905 iq1bn: attempt without a lookup table Iwan Kawrakow 2024-07-15 11:02:41 +03:00
  • 43f4c58376 Remove all workflows Kawrakow 2024-06-27 09:45:56 +03:00
  • 01397535b3 Remove all workflows Iwan Kawrakow 2024-06-27 09:45:56 +03:00
  • aaec3c1f60 imatrix: be able to specify the name of the output tensor Kawrakow 2024-06-26 17:38:18 +03:00
  • 0a3a2c4cd4 imatrix: be able to specify the name of the output tensor Iwan Kawrakow 2024-06-26 17:38:18 +03:00
  • be36ca872f bitnet: fold V scale into rms_norm Kawrakow 2024-06-26 12:05:57 +02:00
  • 71725a918f bitnet: fold V scale into rms_norm Iwan Kawrakow 2024-06-26 12:05:57 +02:00
  • 6467358fd4 RoPE(Neox, Metal): don't use power functions in a loop Kawrakow 2024-06-26 11:22:47 +02:00
  • 641dd6bc68 RoPE(Neox, Metal): don't use power functions in a loop Iwan Kawrakow 2024-06-26 11:22:47 +02:00
  • d280bf30c4 Typo Kawrakow 2024-06-25 19:17:14 +03:00
  • 767bce7caf Typo Iwan Kawrakow 2024-06-25 19:17:14 +03:00
  • 9918542658 bitnet: remove iq1_bn lookup table storing +/- signs Kawrakow 2024-06-25 18:19:11 +03:00
  • 753dbaeeb0 bitnet: remove iq1_bn lookup table storing +/- signs Iwan Kawrakow 2024-06-25 18:19:11 +03:00
  • 12e97f1f1f bitnet: simdify q8_K64 quantization on AVX Kawrakow 2024-06-25 17:20:34 +03:00
  • 8b436a84c5 bitnet: simdify q8_K64 quantization on AVX Iwan Kawrakow 2024-06-25 17:20:34 +03:00
  • cb12b6f253 bitnet: NEON improvements for iq1_bn Kawrakow 2024-06-25 13:48:29 +02:00
  • c906c4c4fe bitnet: NEON improvements for iq1_bn Iwan Kawrakow 2024-06-25 13:48:29 +02:00
  • 636dbd03c5 bitnet: remove the now unused iq1bn_grid_u16 Kawrakow 2024-06-25 12:41:43 +02:00
  • 49bacf2288 bitnet: remove the now unused iq1bn_grid_u16 Iwan Kawrakow 2024-06-25 12:41:43 +02:00
  • cd2f60c89a Bitnet: adapt NEON and Metal to the alternative grid Kawrakow 2024-06-25 11:16:13 +02:00
  • 7de9559cf2 Bitnet: adapt NEON and Metal to the alternative grid Iwan Kawrakow 2024-06-25 11:16:13 +02:00
  • ef16135920 Bitnet: trying an alternative iq1_bn grid Kawrakow 2024-06-25 11:32:48 +03:00
  • aa14a06b44 Bitnet: trying an alternative iq1_bn grid Iwan Kawrakow 2024-06-25 11:32:48 +03:00
  • 90a6071a93 bitnet: fix scalar dot product for 1.625 bpw Kawrakow 2024-06-25 08:31:12 +02:00
  • cc44d4a5c3 bitnet: fix scalar dot product for 1.625 bpw Iwan Kawrakow 2024-06-25 08:31:12 +02:00
  • ee6565fdeb Bitnet: slightly faster 1.625 bpw variant for AVX512VL Kawrakow 2024-06-25 08:33:00 +03:00
  • 3d61866f0a Bitnet: slightly faster 1.625 bpw variant for AVX512VL Iwan Kawrakow 2024-06-25 08:33:00 +03:00
  • 8542b4f359 Bitnet: tiny bit faster 1.625 bpw variant on Metal Kawrakow 2024-06-24 16:42:30 +02:00
  • 707d087927 Bitnet: tiny bit faster 1.625 bpw variant on Metal Iwan Kawrakow 2024-06-24 16:42:30 +02:00
  • f2a82090df Adding add_4, mul_4, div_4 kernels to Metal Kawrakow 2024-06-24 10:22:10 +02:00
  • 49822f84a9 Adding add_4, mul_4, div_4 kernels to Metal Iwan Kawrakow 2024-06-24 10:22:10 +02:00
  • c9ddaf2fa3 bitnet: qnfs tests Kawrakow 2024-06-22 11:44:00 +03:00
  • b747093582 bitnet: qnfs tests Iwan Kawrakow 2024-06-22 11:44:00 +03:00
  • b1fb7df6a5 bitnet: replace ggml_mul with ggml_scale to apply the scales Kawrakow 2024-06-22 10:18:41 +03:00
  • 8c936e3d65 bitnet: replace ggml_mul with ggml_scale to apply the scales Iwan Kawrakow 2024-06-22 10:18:41 +03:00
  • 0fe0d54be6 iqk_mul_mat: add IQ4_NL also on NEON Kawrakow 2024-06-21 18:30:01 +02:00
  • fc04994ebf iqk_mul_mat: add IQ4_NL also on NEON Iwan Kawrakow 2024-06-21 18:30:01 +02:00
  • 32ec107237 iqk_mul_mat: add IQ4_NL Kawrakow 2024-06-21 18:51:44 +03:00
  • caa42ccc56 iqk_mul_mat: add IQ4_NL Iwan Kawrakow 2024-06-21 18:51:44 +03:00
  • 912d6d9ce1 bitnet(scale in a separate tensor): CPU tweaks Kawrakow 2024-06-21 18:18:23 +03:00
  • 86dc8e5f8b bitnet(scale in a separate tensor): CPU tweaks Iwan Kawrakow 2024-06-21 18:18:23 +03:00
  • f53d89dd53 bitnet(scale in a separate tensor): CPU tweaks Kawrakow 2024-06-20 19:23:10 +03:00
  • 729ba46f77 bitnet(scale in a separate tensor): CPU tweaks Iwan Kawrakow 2024-06-20 19:23:10 +03:00
  • 52ad5764dd bitnet(scale in a separate tensor): more CPU improvements Kawrakow 2024-06-20 18:39:31 +03:00
  • f0325c5826 bitnet(scale in a separate tensor): more CPU improvements Iwan Kawrakow 2024-06-20 18:39:31 +03:00
  • 167489ef6c bitnet(scale in a separate tensor): CPU improvements Kawrakow 2024-06-20 15:20:50 +03:00
  • e05cca9ef6 bitnet(scale in a separate tensor): CPU improvements Iwan Kawrakow 2024-06-20 15:20:50 +03:00
  • 8b31c14e0d bitnet(scale in a separate tensor): mul -> scale on the CPU Kawrakow 2024-06-20 08:21:25 +03:00
  • 36374ab37d bitnet(scale in a separate tensor): mul -> scale on the CPU Iwan Kawrakow 2024-06-20 08:21:25 +03:00
  • e423af855f bitnet(scale in a separate tensor): mul -> scale on CUDA Kawrakow 2024-06-19 19:51:39 +03:00
  • e73ae1f6d3 bitnet(scale in a separate tensor): mul -> scale on CUDA Iwan Kawrakow 2024-06-19 19:51:39 +03:00
  • f72db4769b bitnet(scale in a separate tensor): mul -> scale on Metal Kawrakow 2024-06-19 18:23:57 +02:00
  • 7f968d51b4 bitnet(scale in a separate tensor): mul -> scale on Metal Iwan Kawrakow 2024-06-19 18:23:57 +02:00
  • 30fc9b5753 Revert "bitnet(scale in a separate tensor): replace ggml_mul with ggml_scale" Kawrakow 2024-06-19 18:51:41 +03:00
  • d08ff0df43 Revert "bitnet(scale in a separate tensor): replace ggml_mul with ggml_scale" Iwan Kawrakow 2024-06-19 18:51:41 +03:00
  • f024804b9a bitnet(scale in a separate tensor): replace ggml_mul with ggml_scale Kawrakow 2024-06-19 17:14:42 +02:00
  • ad60fb3567 bitnet(scale in a separate tensor): replace ggml_mul with ggml_scale Iwan Kawrakow 2024-06-19 17:14:42 +02:00
  • 3c5cd34a05 bitnet(scale in a separate tensor): Metal Kawrakow 2024-06-19 16:46:51 +02:00
  • 257fa74014 bitnet(scale in a separate tensor): Metal Iwan Kawrakow 2024-06-19 16:46:51 +02:00
  • 14081ee2ef bitnet(scale in a separate tensor): CUDA Kawrakow 2024-06-19 17:09:13 +03:00
  • a2e43b83c9 bitnet(scale in a separate tensor): CUDA Iwan Kawrakow 2024-06-19 17:09:13 +03:00
  • 785cac7ee5 bitnet: put the scale in a separate tensor Kawrakow 2024-06-19 16:46:23 +03:00
  • 58d9e8f1d2 bitnet: put the scale in a separate tensor Iwan Kawrakow 2024-06-19 16:46:23 +03:00
  • 1f9541172f Bitnet(1.75 bpw): higher precision fp8 scale Kawrakow 2024-06-18 20:08:28 +03:00
  • 927e251a12 Bitnet(1.75 bpw): higher precision fp8 scale Iwan Kawrakow 2024-06-18 20:08:28 +03:00
  • 9d38a61be7 Bitnet(1.75 bpw): slightly faster CUDA dot product Kawrakow 2024-06-18 18:42:26 +03:00
  • 181fd9c56e Bitnet(1.75 bpw): slightly faster CUDA dot product Iwan Kawrakow 2024-06-18 18:42:26 +03:00
  • f6bfdce911 Bitnet(2.25 bpw): faster Metal dot product Kawrakow 2024-06-18 13:42:42 +02:00
  • fece7e1db7 Bitnet(2.25 bpw): faster Metal dot product Iwan Kawrakow 2024-06-18 13:42:42 +02:00
  • f200d36a7f Bitnet(2.25 bpw): Metal Kawrakow 2024-06-18 13:32:51 +02:00
  • 4f51348d3d Bitnet(2.25 bpw): Metal Iwan Kawrakow 2024-06-18 13:32:51 +02:00
  • ff718c2dc1 Bitnet(2.25 bpw): CUDA Kawrakow 2024-06-18 13:56:16 +03:00
  • 01ea9a862d Bitnet(2.25 bpw): CUDA Iwan Kawrakow 2024-06-18 13:56:16 +03:00
  • 766975ecfa Bitnet(2.25 bpw): NEON Kawrakow 2024-06-18 11:11:46 +02:00
  • 2998ca9b14 Bitnet(2.25 bpw): NEON Iwan Kawrakow 2024-06-18 11:11:46 +02:00
  • 39982764d7 Bitnet: 2.25 bpw version Kawrakow 2024-06-18 12:00:16 +03:00
  • 8c6276f6a1 Bitnet: 2.25 bpw version Iwan Kawrakow 2024-06-18 12:00:16 +03:00
  • 68741281e5 bitnet 2 bpw: NEON implementation Kawrakow 2024-06-17 18:51:00 +02:00
  • 1de6476d75 bitnet 2 bpw: NEON implementation Iwan Kawrakow 2024-06-17 18:51:00 +02:00
  • a8521b73d7 Removed extra column Kawrakow 2024-06-17 19:19:25 +03:00
  • f97a329638 Removed extra column Iwan Kawrakow 2024-06-17 19:19:25 +03:00
  • 8ca1bdebe4 bitnet 2 bpw: AVX2 implementation Kawrakow 2024-06-17 19:07:38 +03:00
  • 6616985135 bitnet 2 bpw: AVX2 implementation Iwan Kawrakow 2024-06-17 19:07:38 +03:00