ik_llama.cpp/ggml-cuda/common.cuh
Johannes Gäßler 24dfdbb1a3 CUDA: stream-k decomposition for MMQ (#8018)
* CUDA: stream-k decomposition for MMQ

* fix undefined memory reads for small matrices
2024-06-20 14:39:21 +02:00

28 KiB