ik_llama.cpp/ggml
Kawrakow 5e31a7df43 CUDA: quantized GEMM for IQ2_KS, IQ2_K, IQ3_K (#418)
* MMQ for iq2_k
* This works
* MMQ for iq3_k
* MMQ for iq2_ks
* Fix iq2_ks

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-05-15 08:15:08 +03:00
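
The commit title refers to MMQ, the CUDA matrix-multiplication path in ggml/llama.cpp that consumes quantized weight blocks directly inside the kernel instead of dequantizing the whole tensor to a float buffer first. As a rough illustration only, the sketch below uses a hypothetical 4-bit block format (block_q4_toy) and a simple matrix-vector kernel; it is not the IQ2_KS/IQ2_K/IQ3_K layout and not the mmq kernels added by this commit, which are considerably more involved (among other things they also quantize the activations and work with integer dot products).

// Illustrative sketch only: a made-up 4-bit block format ("block_q4_toy"),
// not the IQ2_KS/IQ2_K/IQ3_K layouts and not the kernels from this commit.
#include <cstdint>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define QK 32                          // weights per quantized block (toy value)

struct block_q4_toy {
    float   d;                         // per-block scale
    uint8_t qs[QK / 2];                // 32 x 4-bit codes, two per byte, offset by 8
};

// One thread per output row: walk the row's quantized blocks, dequantize each
// 4-bit code on the fly, and accumulate the dot product with the float vector x.
__global__ void matvec_q4_toy(const block_q4_toy *W, const float *x,
                              float *y, int nrows, int ncols) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= nrows) return;
    const int nblocks = ncols / QK;
    const block_q4_toy *wrow = W + row * nblocks;
    float sum = 0.0f;
    for (int b = 0; b < nblocks; ++b) {
        const float d = wrow[b].d;
        for (int j = 0; j < QK / 2; ++j) {
            const uint8_t q = wrow[b].qs[j];
            const float w0 = d * (float)((q & 0x0F) - 8);  // low nibble
            const float w1 = d * (float)((q >> 4)   - 8);  // high nibble
            sum += w0 * x[b * QK + 2 * j] + w1 * x[b * QK + 2 * j + 1];
        }
    }
    y[row] = sum;
}

int main() {
    const int nrows = 4, ncols = 64, nblocks = ncols / QK;
    std::vector<block_q4_toy> W(nrows * nblocks);
    std::vector<float> x(ncols), ref(nrows, 0.0f);
    for (int i = 0; i < ncols; ++i) x[i] = 0.01f * i;

    // Fill the quantized matrix with arbitrary 4-bit codes and build a CPU reference.
    for (int r = 0; r < nrows; ++r)
        for (int b = 0; b < nblocks; ++b) {
            block_q4_toy &blk = W[r * nblocks + b];
            blk.d = 0.1f * (r + 1);
            for (int j = 0; j < QK / 2; ++j) {
                const int q0 = j % 16, q1 = (j + 3) % 16;
                blk.qs[j] = (uint8_t)(q0 | (q1 << 4));
                ref[r] += blk.d * (q0 - 8) * x[b * QK + 2 * j]
                        + blk.d * (q1 - 8) * x[b * QK + 2 * j + 1];
            }
        }

    block_q4_toy *dW; float *dx, *dy;
    cudaMalloc(&dW, W.size() * sizeof(block_q4_toy));
    cudaMalloc(&dx, ncols * sizeof(float));
    cudaMalloc(&dy, nrows * sizeof(float));
    cudaMemcpy(dW, W.data(), W.size() * sizeof(block_q4_toy), cudaMemcpyHostToDevice);
    cudaMemcpy(dx, x.data(), ncols * sizeof(float), cudaMemcpyHostToDevice);

    matvec_q4_toy<<<1, 32>>>(dW, dx, dy, nrows, ncols);

    std::vector<float> y(nrows);
    cudaMemcpy(y.data(), dy, nrows * sizeof(float), cudaMemcpyDeviceToHost);
    for (int r = 0; r < nrows; ++r)
        printf("row %d: gpu %.4f  ref %.4f\n", r, y[r], ref[r]);

    cudaFree(dW); cudaFree(dx); cudaFree(dy);
    return 0;
}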