Commit Graph

  • 6d6d12fc86 q8_k_r8: AVX2 Iwan Kawrakow 2024-12-13 18:55:14 +02:00
  • 93a85c62bb q8_k_r8: fastest matrix multiplication known to human kind Iwan Kawrakow 2024-12-13 18:21:08 +02:00
  • eae584dc98 Faster R4 quants on Zen4 (#139) Kawrakow 2024-12-13 15:47:59 +01:00
  • 12f962dd24 Faster R4 quants on Zen4 (#139) Kawrakow 2024-12-13 15:47:59 +01:00
  • 9fbe417d01 iq4_xs_r4: slightly faster Zen4 ik/r4_faster_zen4 Iwan Kawrakow 2024-12-13 16:04:19 +02:00
  • 0bb4576b74 q5_k_r4: slightly faster Zen4 Iwan Kawrakow 2024-12-13 15:12:35 +02:00
  • ea6c02a79d q4_k_r4: slightly faster Zen4 Iwan Kawrakow 2024-12-13 15:05:12 +02:00
  • 322dd3a366 q6_k_r4: faster Zen4 Iwan Kawrakow 2024-12-13 13:52:49 +02:00
  • c24b578b2c q3_k_r4: faster Zen4 Iwan Kawrakow 2024-12-13 13:36:52 +02:00
  • e3a9ff2130 q3_k_r4: faster Zen4 Iwan Kawrakow 2024-12-13 12:06:59 +02:00
  • 6dd42c1b1c Another fix Kawrakow 2024-12-13 10:14:53 +02:00
  • 36efbfb132 Another fix Iwan Kawrakow 2024-12-13 10:14:53 +02:00
  • 49e10a4bdd Adding lost q4_k_r4 case Kawrakow 2024-12-13 10:09:04 +02:00
  • ff425a3572 Adding lost q4_k_r4 case Iwan Kawrakow 2024-12-13 10:09:04 +02:00
  • ce97b0325e IQ4_K_R4 (#138) Kawrakow 2024-12-12 16:04:20 +01:00
  • 2700d3af36 IQ4_K_R4 (#138) Kawrakow 2024-12-12 16:04:20 +01:00
  • 78c8453847 iq4_k_r4: NEON ik/iq4_k_r4 Iwan Kawrakow 2024-12-12 14:53:29 +01:00
  • 6b505ec4ab iq4_k_r4: AVX2 Iwan Kawrakow 2024-12-12 12:13:23 +02:00
  • ba9a9a1655 iq4_k_r4: Zen4 and hopefully AVX2 Iwan Kawrakow 2024-12-12 11:39:29 +02:00
  • fb79167eec iq4_k_r4: WIP Iwan Kawrakow 2024-12-12 11:00:22 +02:00
  • 66ade83e56 Fix AVX2 implementation of iq4_nl_r4 (#137) Kawrakow 2024-12-11 18:55:21 +01:00
  • aecc95c0ca Fix AVX2 implementation of iq4_nl_r4 (#137) Kawrakow 2024-12-11 18:55:21 +01:00
  • 4e163ba2c8 Fix AVX2 implementation of iq4_nl_r4 ik/fix_avx2_iq4_nl_r4 Iwan Kawrakow 2024-12-11 19:50:05 +02:00
  • 0f6621d410 Q2_K_R4 (#136) Kawrakow 2024-12-11 18:16:49 +01:00
  • 8c6b84220d Q2_K_R4 (#136) Kawrakow 2024-12-11 18:16:49 +01:00
  • b9b9fde8dc Make sure rows per thread are a multiple of 4 ik/q2_k_r4 Iwan Kawrakow 2024-12-11 19:05:33 +02:00
  • ea2de6ee34 q2_k_r4: NEON Iwan Kawrakow 2024-12-11 16:52:33 +01:00
  • 2b07aa3f2e q2_k_r4: AVX2 Iwan Kawrakow 2024-12-11 16:56:51 +02:00
  • 716d793b78 q3_k_r4: AVX2 Iwan Kawrakow 2024-12-11 11:58:40 +02:00
  • 93c6e295ee q2_k_r4: Zen4 Iwan Kawrakow 2024-12-11 16:28:56 +02:00
  • 680d96ad6f Better ARM_NEON implementation for R4 quants (#135) Kawrakow 2024-12-11 14:20:27 +01:00
  • 9469af87f7 Better ARM_NEON implementation for R4 quants (#135) Kawrakow 2024-12-11 14:20:27 +01:00
  • 0ffa542dd2 iq4_xs_r4: Better ARM implementation ik/arm_better_r4 Iwan Kawrakow 2024-12-11 13:42:51 +01:00
  • 04e35a3d51 q4_k_r4: Better ARM implementation Iwan Kawrakow 2024-12-11 13:21:16 +01:00
  • bd3bb0df85 q5_k_r4: Better ARM implementation Iwan Kawrakow 2024-12-11 11:54:32 +01:00
  • 87da25f901 q6_k_r4: Better ARM implementation Iwan Kawrakow 2024-12-11 11:40:27 +01:00
  • 4872f2f57e Q3_K_R4 (#134) Kawrakow 2024-12-11 11:19:00 +01:00
  • e0adb8b122 Q3_K_R4 (#134) Kawrakow 2024-12-11 11:19:00 +01:00
  • da01d165a7 q3_k_r4: AVX2 ik/q3_k_r4 Iwan Kawrakow 2024-12-11 11:58:40 +02:00
  • 7fa670bb06 q3_k_r4: NEON Iwan Kawrakow 2024-12-11 10:32:42 +01:00
  • 93dfde5b62 q3_k_r4: Zen4 works, but not as good as it should be Iwan Kawrakow 2024-12-11 08:01:17 +02:00
  • e78e47b857 Q5_K_R4 (#132) Kawrakow 2024-12-10 18:13:47 +01:00
  • a63a96b5ae Q5_K_R4 (#132) Kawrakow 2024-12-10 18:13:47 +01:00
  • a3ece1661f q5_k_r4: NEON ik/q5_k_r4 Iwan Kawrakow 2024-12-10 17:46:04 +01:00
  • 91af4c6030 q5_k_r4: Zen4 and AVX2 Iwan Kawrakow 2024-12-10 17:23:17 +02:00
  • 1f45d73ae4 q5_k_r4: WIP Iwan Kawrakow 2024-12-10 16:22:52 +02:00
  • 3a8795d422 Slightly faster Q4_K_R4 and IQ4_XS_R4 on Zen4 (#131) Kawrakow 2024-12-10 14:14:40 +01:00
  • c819fa651b Slightly faster Q4_K_R4 and IQ4_XS_R4 on Zen4 (#131) Kawrakow 2024-12-10 14:14:40 +01:00
  • 1e374db2cd iq4_xs_r4: very slightly faster Zen4 ik/q4_k_r4_v3 Iwan Kawrakow 2024-12-10 15:11:17 +02:00
  • 4ca6bb2d4d iq4_k_r4: slightly faster on Zen4 Iwan Kawrakow 2024-12-10 14:38:25 +02:00
  • b7e2f656f5 Q6_K_R4 (#130) Kawrakow 2024-12-10 12:26:40 +01:00
  • 361174ee6a Q6_K_R4 (#130) Kawrakow 2024-12-10 12:26:40 +01:00
  • 408178aa8a q6_k_r4: slightly faster Zen4 ik/q6_k_r4 Iwan Kawrakow 2024-12-10 12:43:20 +02:00
  • 8de44659ca q6_k_r4: slightly faster NEON Iwan Kawrakow 2024-12-10 10:14:24 +01:00
  • 99d79116ee q6_k_r4: 1st NEON version Iwan Kawrakow 2024-12-10 08:32:24 +01:00
  • a5db08118a q6_k_r4: AVX2 and simple Zen4 Iwan Kawrakow 2024-12-10 08:08:10 +02:00
  • 2bd2d0176a q6_k_r4: 1st functional AVX2 version Iwan Kawrakow 2024-12-09 20:02:57 +02:00
  • 2dce0267c9 Adding q6_k_r4 Iwan Kawrakow 2024-12-09 19:16:19 +02:00
  • 13126ce100 Q4_K_R4 (#129) Kawrakow 2024-12-09 16:59:18 +01:00
  • 3ec193b485 Q4_K_R4 (#129) Kawrakow 2024-12-09 16:59:18 +01:00
  • 1319702527 Minor ik/q4_k_r4_v2 Iwan Kawrakow 2024-12-09 17:21:12 +02:00
  • a515388119 Minor Iwan Kawrakow 2024-12-09 16:26:24 +02:00
  • 9b475867f1 q4_k_r4: slightly better AVX2 Iwan Kawrakow 2024-12-09 15:50:27 +02:00
  • 0fb93682f9 q4_k_r4: NEON Iwan Kawrakow 2024-12-09 12:21:24 +01:00
  • 666a2e2c92 q4_k_r4: AVX2 Iwan Kawrakow 2024-12-09 12:13:38 +02:00
  • 26d677483e q4_k_r4: finally works on Zen4 Iwan Kawrakow 2024-12-09 11:45:35 +02:00
  • 732cc7e879 Simply don't see what is wrong Iwan Kawrakow 2024-12-09 11:06:25 +02:00
  • 24d3bf2e70 Something is still wrong Iwan Kawrakow 2024-12-08 19:48:15 +02:00
  • b39bbb0405 Faster IQ4_XS_R4 on Zen4 (#128) Kawrakow 2024-12-08 15:27:13 +01:00
  • 43e65a672a Faster IQ4_XS_R4 on Zen4 (#128) Kawrakow 2024-12-08 15:27:13 +01:00
  • 4dc97b187b Fix broken matrix x vector product on Zen4 ik/zen4_iq4_xs_r4 Iwan Kawrakow 2024-12-08 16:23:41 +02:00
  • 5de1cf4885 Faster iq4_xs_r4 on Zen4 Iwan Kawrakow 2024-12-08 15:44:49 +02:00
  • daf5f52022 Rename iq4_nl_x4 to iq4_nl_r4 (#126) Kawrakow 2024-12-08 09:34:42 +01:00
  • fc701cedd1 Rename iq4_nl_x4 to iq4_nl_r4 (#126) Kawrakow 2024-12-08 09:34:42 +01:00
  • 7b9c76b82d Rename iq4_nl_x4 to iq4_nl_r4 ik/rename_iq4_nl_x4 Iwan Kawrakow 2024-12-08 10:26:06 +02:00
  • cc9acdbcff R4 improvements on ARM_NEON (#125) Kawrakow 2024-12-08 09:13:10 +01:00
  • ef95b81733 R4 improvements on ARM_NEON (#125) Kawrakow 2024-12-08 09:13:10 +01:00
  • 6a6a90f1a1 Minor iq4_xs_r4 improvement on NEON ik/r4_neon Iwan Kawrakow 2024-12-08 09:01:34 +01:00
  • 9e7644d08a Simplify Iwan Kawrakow 2024-12-08 06:06:22 +01:00
  • d34510d2f3 Apply qx_0_r4_q8_0 template also to q6_0_r4 and iq4_nl_x4 Iwan Kawrakow 2024-12-07 19:50:28 +01:00
  • df139c5649 qx_0_r4_q8_0 template Iwan Kawrakow 2024-12-07 18:47:37 +01:00
  • 12d3ea1e30 q4_0_r4: 6% faster PP on NEON Iwan Kawrakow 2024-12-07 16:01:54 +01:00
  • 86adfa334b iq4_k_r4: WIP, nothing works ik/q4_k_r4 Iwan Kawrakow 2024-12-07 10:55:38 +02:00
  • 612a207676 iq2_bn_r4: fastest Bitnet CPU implementation on the planet (#124) Kawrakow 2024-12-06 12:15:39 +01:00
  • 3682e4700d iq2_bn_r4: fastest Bitnet CPU implementation on the planet (#124) Kawrakow 2024-12-06 12:15:39 +01:00
  • c247793798 iq2_bn_r4: better AVX2 ik/iq2_bn_r4 Iwan Kawrakow 2024-12-06 10:26:02 +02:00
  • 7edc3a4a84 iq2_bn_r4: fix AVX2 after breaking it two commits ago Iwan Kawrakow 2024-12-06 09:27:14 +02:00
  • f8a651c38c iq2_bn_r4: simdify q8_K16 quantization (NEON) Iwan Kawrakow 2024-12-06 08:12:32 +01:00
  • e06c83c8ee iq2_bn_r4: simdify q8_K16 quantization (AVX2) Iwan Kawrakow 2024-12-06 08:41:54 +02:00
  • 4d730ebfd9 iq2_bn_r4: use AVX2 implementation on Zen4 for matrix x vector Iwan Kawrakow 2024-12-06 07:12:16 +02:00
  • 6a12170f59 iq2_bn_r4: AVX2 Iwan Kawrakow 2024-12-06 06:51:46 +02:00
  • 99cd209c7b Some cleanup Iwan Kawrakow 2024-12-05 19:54:53 +01:00
  • b02fd4ee38 iq2_bn_r4: Experimenting on NEON Iwan Kawrakow 2024-12-05 18:52:38 +01:00
  • c848533580 iq2_bn_r4: NEON Iwan Kawrakow 2024-12-05 17:40:12 +01:00
  • 32f8a33f5e iq2_bn_r4: 1st shot at NEON Iwan Kawrakow 2024-12-05 15:21:39 +01:00
  • 57c58ff75b Make sure rows per thread are a multiple of the number of interleaved rows Iwan Kawrakow 2024-12-05 15:36:17 +02:00
  • 0137264c6f Adding iq2_bn_r4 Iwan Kawrakow 2024-12-05 15:18:33 +02:00
  • 9119023a4b IQ4_XS_R4 (#123) Kawrakow 2024-12-04 15:20:07 +01:00
  • f64de08203 IQ4_XS_R4 (#123) Kawrakow 2024-12-04 15:20:07 +01:00
  • 6dec39627c DRY ik/iq4_xs_r4 Iwan Kawrakow 2024-12-04 15:05:55 +01:00