ik_llama.cpp/ggml/src
Iwan Kawrakow 57c58ff75b
Make sure rows per thread are a multiple of the number of interleaved rows
With this change I can run iq2_bn_r4 with 32 threads, which increases
PP-512 to 872 t/s.
2024-12-05 15:36:17 +02:00
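
The idea in the commit message can be sketched as follows. This is only an illustration and not the code from the commit: the helper name `rows_for_thread` and its parameters are hypothetical, and it assumes the total row count is already a multiple of the interleave factor (4 is assumed for an `_r4` type). Rows are split in whole interleaved groups, so no thread starts or stops in the middle of a group.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper (not from ik_llama.cpp): returns the half-open row range
// [first, last) handled by thread `ith` out of `nth` threads. Every boundary
// is a multiple of `interleave`, so interleaved row groups are never split.
std::pair<int, int> rows_for_thread(int nrows, int interleave, int ith, int nth) {
    assert(interleave > 0 && nth > 0 && ith >= 0 && ith < nth);
    assert(nrows % interleave == 0);  // assumption: rows are stored in complete groups

    const int ngroups = nrows / interleave;                    // number of interleaved groups
    const int groups_per_thread = (ngroups + nth - 1) / nth;   // ceiling division
    const int first_group = std::min(ith * groups_per_thread, ngroups);
    const int last_group  = std::min(first_group + groups_per_thread, ngroups);

    // Convert group indices back to row indices.
    return {first_group * interleave, last_group * interleave};
}
```

For example, with 1000 rows, an interleave of 4, and 32 threads, each thread gets 8 groups (32 rows), except the last one, which gets the remaining 2 groups (8 rows). When there are fewer groups than threads, the trailing threads receive an empty range.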