ik_llama.cpp/ggml
Iwan Kawrakow 464b8fc03b CUDA: add head size of 64 to new mma
Not turned on yet, but I observe slightly better PP and slightly
worse TG performance with it enabled.
2025-08-11 11:10:45 +03:00