ik_llama.cpp/ggml
Kawrakow 5e31a7df43 CUDA: quantized GEMM for IQ2_KS, IQ2_K, IQ3_K (#418)
* MMQ for iq2_k
* This works
* MMQ for iq3_k
* MMQ for iq2_ks
* Fix iq2_ks

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-05-15 08:15:08 +03:00
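
The commit title refers to MMQ, the CUDA matrix-multiplication path in ggml/llama.cpp that consumes quantized weight blocks directly inside the kernel instead of dequantizing the whole tensor to a float buffer first. As a rough illustration only, the sketch below uses a hypothetical 4-bit block format (block_q4_toy) and a simple matrix-vector kernel; it is not the IQ2_KS/IQ2_K/IQ3_K layout and not the mmq kernels added by this commit, which are considerably more involved (among other things they also quantize the activations and work with integer dot products).

// Illustrative sketch only: a made-up 4-bit block format ("block_q4_toy"),
// not the IQ2_KS/IQ2_K/IQ3_K layouts and not the kernels from this commit.
#include <cstdint>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define QK 32                          // weights per quantized block (toy value)

struct block_q4_toy {
    float   d;                         // per-block scale
    uint8_t qs[QK / 2];                // 32 x 4-bit codes, two per byte, offset by 8
};

// One thread per output row: walk the row's quantized blocks, dequantize each
// 4-bit code on the fly, and accumulate the dot product with the float vector x.
__global__ void matvec_q4_toy(const block_q4_toy *W, const float *x,
                              float *y, int nrows, int ncols) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= nrows) return;
    const int nblocks = ncols / QK;
    const block_q4_toy *wrow = W + row * nblocks;
    float sum = 0.0f;
    for (int b = 0; b < nblocks; ++b) {
        const float d = wrow[b].d;
        for (int j = 0; j < QK / 2; ++j) {
            const uint8_t q = wrow[b].qs[j];
            const float w0 = d * (float)((q & 0x0F) - 8);  // low nibble
            const float w1 = d * (float)((q >> 4)   - 8);  // high nibble
            sum += w0 * x[b * QK + 2 * j] + w1 * x[b * QK + 2 * j + 1];
        }
    }
    y[row] = sum;
}

int main() {
    const int nrows = 4, ncols = 64, nblocks = ncols / QK;
    std::vector<block_q4_toy> W(nrows * nblocks);
    std::vector<float> x(ncols), ref(nrows, 0.0f);
    for (int i = 0; i < ncols; ++i) x[i] = 0.01f * i;

    // Fill the quantized matrix with arbitrary 4-bit codes and build a CPU reference.
    for (int r = 0; r < nrows; ++r)
        for (int b = 0; b < nblocks; ++b) {
            block_q4_toy &blk = W[r * nblocks + b];
            blk.d = 0.1f * (r + 1);
            for (int j = 0; j < QK / 2; ++j) {
                const int q0 = j % 16, q1 = (j + 3) % 16;
                blk.qs[j] = (uint8_t)(q0 | (q1 << 4));
                ref[r] += blk.d * (q0 - 8) * x[b * QK + 2 * j]
                        + blk.d * (q1 - 8) * x[b * QK + 2 * j + 1];
            }
        }

    block_q4_toy *dW; float *dx, *dy;
    cudaMalloc(&dW, W.size() * sizeof(block_q4_toy));
    cudaMalloc(&dx, ncols * sizeof(float));
    cudaMalloc(&dy, nrows * sizeof(float));
    cudaMemcpy(dW, W.data(), W.size() * sizeof(block_q4_toy), cudaMemcpyHostToDevice);
    cudaMemcpy(dx, x.data(), ncols * sizeof(float), cudaMemcpyHostToDevice);

    matvec_q4_toy<<<1, 32>>>(dW, dx, dy, nrows, ncols);

    std::vector<float> y(nrows);
    cudaMemcpy(y.data(), dy, nrows * sizeof(float), cudaMemcpyDeviceToHost);
    for (int r = 0; r < nrows; ++r)
        printf("row %d: gpu %.4f  ref %.4f\n", r, y[r], ref[r]);

    cudaFree(dW); cudaFree(dx); cudaFree(dy);
    return 0;
}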