Iwan Kawrakow
595d2ae32d
iq6_k: slightly better Zen4 iqk_mul_mat
We now arrive at PP-512 = 147 t/s for LLaMA-3.1-8B.
TG-128 is 9.5 t/s. This is better than the last commit,
but still kind of slow compared to Q6_K.
My last commit message was wrong: iq3_k also needs a fix
for overflow.
2024-08-09 16:00:31 +02:00