Kawrakow
ef16135920
Bitnet: trying an alternative iq1_bn grid
...
Faster on CUDA. The scalar version is faster too.
The issue with CUDA is that I now see wild performance
fluctuations. Running llama-bench I can get 220 t/s
for TG-128 one time and 190 t/s another time, with
uncertainties of 1-2 t/s. Same for PP: results are
jumping back and forth between ~9500 t/s and ~8900 t/s.
So, basically no reliable measurement at this point,
but for sure faster than the previous version, which was
at around 170-180 t/s.
2024-06-25 11:32:48 +03:00